* Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-01 13:22 ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-01 13:22 UTC (permalink / raw)
  To: linux-gpio
  Cc: Linus Walleij, Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

Hi,

I'm trying to fix broken touchpads [1] for a new laptop model, the Legion-5
15ARH05, which is shipped with two different touchpads, i.e., ELAN and
Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
be informed of new data from the touchpad. For the Synaptics touchpad,
only 7 interrupts are received per second, which makes the touchpad
completely unusable. Based on current observations, pinctrl-amd seems to
be the most likely cause.


Why do I suspect pinctrl-amd the most?
======================================

This laptop model has the following hardware configurations specified
via ACPI,
  - The touchpad's data interrupt line is connected to pin#130 of a GPIO
    chip

         GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                         "\\_SB.GPIO", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0082
                         }

  - This GPIO chip (HID: AMDI0030), which is assigned IRQ#7, has its
    common interrupt output line connected to pin#7 of one IO-APIC

         Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
         {
             0x00000007,
         }

I added some code to the kernel to poll the status of the GPIO chip's pin#130
and the IO-APIC's pin#7 every 1ms while I moved my finger on the surface of
the Synaptics touchpad continuously for about 1s. While I was moving my
finger, most of the time,
  - GPIO chip's pin#130: low input, interrupt unmasked
  - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
    have never been called by the IRQ flow handler handle_fasteoi_irq)

So the touchpad has been generating interrupts most of the time while
the IO-APIC hasn't been masking the interrupt from the GPIO chip.
But somehow the kernel only gets ~7 interrupts each second while
the touchpad could generate ~140 interrupts (time resolution of 7.2ms)
per second. Assuming the IO-APIC code (arch/x86/kernel/apic/io_apic.c) is
fine, there must be something wrong with the GPIO interrupt controller, which
works fine for the touchpad under Windows. Besides, if I poll the touchpad
data based on pin#130's status, the touchpad could also work under
Windows.
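
The arithmetic behind those figures can be checked with a small userspace
sketch. It is only an illustration: the helper names rate_hz() and
delivered_fraction() are made up for this example, and the inputs are the
7.2ms report period and the ~7 interrupts/s quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a report period in milliseconds into events per second. */
static double rate_hz(double period_ms)
{
	assert(period_ms > 0.0);
	return 1000.0 / period_ms;
}

/* Fraction of the expected interrupts that actually reach the kernel. */
static double delivered_fraction(double observed_hz, double period_ms)
{
	return observed_hz / rate_hz(period_ms);
}

/* Print the expected rate and the delivered share for the quoted numbers. */
static void print_summary(void)
{
	printf("expected: %.1f Hz\n", rate_hz(7.2));
	printf("delivered: %.1f%%\n", 100.0 * delivered_fraction(7.0, 7.2));
}
```

Calling print_summary() from a main() shows that a 7.2ms period corresponds
to roughly 139 reports per second, so ~7 interrupts/s is only about 5% of
what the touchpad can produce.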

Ways to debug pinctrl-amd
=========================

I can't find any documentation about the AMDI0030 GPIO chip except for
the commit logs of drivers/pinctrl/pinctrl-amd. One commit,
ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained"),
inspired me to bring back the chained interrupt to see if "an interrupt storm"
would happen. The only change I noticed is that the interrupts arrive in
pairs. The time interval between the two interrupts in a pair is ~0.0016s
but the time interval between interrupt pairs is still ~0.12s (~8Hz).
Unfortunately, I didn't get any insight into the GPIO interrupt
controller from this tweak. I wonder if there are any other ways
to debug drivers/pinctrl/pinctrl-amd?

Thank you!


[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1887190

--
Best regards,
Coiby


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-01 13:22 ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-01 20:57   ` Linus Walleij
  -1 siblings, 0 replies; 84+ messages in thread
From: Linus Walleij @ 2020-10-01 20:57 UTC (permalink / raw)
  To: Coiby Xu, Hans de Goede
  Cc: open list:GPIO SUBSYSTEM, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Sorry for top posting, but I want to page some people.

I do not know anything about ACPI, but Hans de Goede is really
good with this kind of thing and could possibly provide some
insight.

Yours,
Linus Walleij

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-01 20:57   ` [Linux-kernel-mentees] " Linus Walleij
@ 2020-10-02  9:40     ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-02  9:40 UTC (permalink / raw)
  To: Linus Walleij, Coiby Xu
  Cc: open list:GPIO SUBSYSTEM, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Hi,

On 10/1/20 10:57 PM, Linus Walleij wrote:
> Sorry for top posting, but I want to page some people.
> 
> I do not know anything about ACPI, but Hans de Goede is really
> good with this kind of things and could possibly provide some
> insight.

Thanks. Although I'm honored to be considered the go-to person
for these kinds of things, my specialty really lies with these
kinds of issues on Intel Bay Trail and Cherry Trail SoCs.
Nevertheless, let me take a look.

> On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>> 15ARH05 which is shipped with two different touchpads, i.e., ELAN and
>> Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>> be informed of new data from the touchpad. For the Synaptics touchpad,
>> only 7 interrupts are received per second which makes the touchpad
>> completely unusable. Based on current observations, pinctrl-amd seems to
>> be the most suspicious cause.
>>
>>
>> Why do I suspect pinctrl-amd the most?
>> ======================================
>>
>> This laptop model has the following hardware configurations specified
>> via ACPI,
>>    - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>      chip
>>
>>           GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>                           "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>                           )
>>                           {   // Pin list
>>                               0x0082
>>                           }
>>
>>    - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>      common interrupt output line connected to one IO-APIC's pin#7
>>
>>           Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>           {
>>               0x00000007,
>>           }

So these both look fine.

>> I added some code to the kernel to poll the status of the GPIO chip's pin#130
>> and the IO-APIC's pin#7 every 1ms while I moved my finger on the surface of
>> the Synaptics touchpad continuously for about 1s. While I was moving my
>> finger, most of the time,
>>    - GPIO chip's pin#130: low input, interrupt unmasked
>>    - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>      have never been called by the IRQ flow handler handle_fasteoi_irq)
>>
>> So the touchpad has been generating interrupts most of the time while
>> IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>> But somehow the kernel could only get ~7 interrupts each second

So are you seeing these 7 interrupts/second for the touchpad IRQ or for
the GPIO controller's parent IRQ?

Also, do these 7 interrupts/sec stop happening when you do not touch the
touchpad?

To me this sounds like the interrupt is configured as being triggered on
a negative edge so that it only fires once when the line from the touchpad
goes low, and for some reason 7 times a second the touchpad controller
briefly releases the line (sorta gives up to signal the irq and then
tries again?).
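
To make the edge-versus-level hypothesis concrete, here is a small userspace
model (not kernel code). It is a simplified sketch under stated assumptions:
the line is sampled once per tick, 1 means idle/high and 0 means asserted/low,
and a level-low trigger is assumed to fire again on every tick the line stays
low. The names count_irqs() and enum trigger are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

enum trigger { TRIG_LEVEL_LOW, TRIG_EDGE_FALLING };

/*
 * Count how often an interrupt would fire for a sampled line.
 * Level-low: fires on every sample where the line is low.
 * Falling edge: fires only on a 1 -> 0 transition.
 */
static unsigned int count_irqs(const int *line, size_t n, enum trigger t)
{
	unsigned int fired = 0;
	int prev = 1; /* the line idles high (ActiveLow with PullUp) */
	size_t i;

	assert(line != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (t == TRIG_LEVEL_LOW && line[i] == 0)
			fired++;
		else if (t == TRIG_EDGE_FALLING && prev == 1 && line[i] == 0)
			fired++;
		prev = line[i];
	}
	return fired;
}
```

For a line that is held low almost all the time and only briefly released
now and then, the falling-edge count equals the number of releases, while the
level-low count tracks how long the line is asserted. That matches the
pattern of a few interrupts per second despite pin#130 reading low most of
the time.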

>> while
>> the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>> per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>> then there's something wrong with the GPIO interrupt controller which
>> works fine for the touchpad under Windows. Besides if I poll the touchpad
>> data based on pin#130's status, the touchpad could also work under
>> Windows.

I agree that this sounds like a problem with the GpioInt handling.

>> Ways to debug pinctrl-amd
>> =========================
>>
>> I can't find any documentation about the AMDI0030 GPIO chip except for
>> the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>> ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>> inspired me to bring back chained interrupt to see if "an interrupt storm"
>> would happen. The only change I noticed is that the interrupts arrive in
>> pairs. The time interval between the two interrupts in a pair is ~0.0016s
>> but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>> Unfortunately, I don't get any insight about the GPIO interrupt
>> controller from this tweaking. I wonder if there are any other ways
>> to debug drivers/pinctrl/pinctrl-amd?

The way I would try to debug this (with access to the hardware) is
to try to verify the interrupt trigger (level vs edge) settings inside
pinctrl/amd by adding a bunch of printks printing them whenever the
relevant register bits are touched.
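
As a rough illustration of what such a debug print could decode, here is a
userspace sketch. Everything prefixed with HYP_/hyp_ is hypothetical: the bit
positions are placeholders chosen for this example and must be replaced with
the real definitions from drivers/pinctrl/pinctrl-amd.h, and snprintf()
stands in for the printk()/dev_dbg() call one would use in the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * ASSUMED bit layout of one per-pin register. These positions are
 * placeholders for illustration; take the real ones from
 * drivers/pinctrl/pinctrl-amd.h in your tree.
 */
#define HYP_LEVEL_TRIG_BIT	8	/* assumed: 1 = level, 0 = edge */
#define HYP_ACTIVE_LEVEL_SHIFT	9	/* assumed: 2-bit polarity field */
#define HYP_INT_ENABLE_BIT	11	/* assumed: interrupt enable */
#define HYP_INT_MASK_BIT	12	/* assumed: interrupt mask control */

struct hyp_pin_cfg {
	unsigned int level_trig;
	unsigned int active_level;
	unsigned int int_enable;
	unsigned int int_mask_bit;
};

/* Extract the trigger-related fields from a raw register value. */
static struct hyp_pin_cfg hyp_decode_pin(uint32_t reg)
{
	struct hyp_pin_cfg cfg;

	cfg.level_trig = (reg >> HYP_LEVEL_TRIG_BIT) & 0x1;
	cfg.active_level = (reg >> HYP_ACTIVE_LEVEL_SHIFT) & 0x3;
	cfg.int_enable = (reg >> HYP_INT_ENABLE_BIT) & 0x1;
	cfg.int_mask_bit = (reg >> HYP_INT_MASK_BIT) & 0x1;
	return cfg;
}

/* Userspace stand-in for the printk()/dev_dbg() call one would add. */
static int hyp_format_pin(char *buf, size_t len, unsigned int pin, uint32_t reg)
{
	struct hyp_pin_cfg cfg = hyp_decode_pin(reg);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"pin %u: reg=0x%08x trig=%s active=%u en=%u mask=%u",
			pin, (unsigned int)reg,
			cfg.level_trig ? "level" : "edge",
			cfg.active_level, cfg.int_enable, cfg.int_mask_bit);
}
```

Printing the decoded value for pin 130 every time the driver writes the
register would show whether the pin ever ends up edge-triggered, and which
code path put it there.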

So I'm going to guess here that these touchpads use i2c-hid, so I
took a quick peek at the i2c-hid irq request code from
drivers/hid/i2c-hid/i2c-hid-core.c:

         unsigned long irqflags = 0;
         int ret;

         dev_dbg(&client->dev, "Requesting IRQ: %d\n", client->irq);

         if (!irq_get_trigger_type(client->irq))
                 irqflags = IRQF_TRIGGER_LOW;

         ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
                                    irqflags | IRQF_ONESHOT, client->name, ihid);

So this tries to preserve the pre-configured irq-type on the irq
line and if no irq-type is set then it overrides the trigger-type
to IRQF_TRIGGER_LOW, which means level-low.

One quick hack you can try is commenting out the "if (!irq_get_trigger_type(client->irq))"
line. I guess maybe the pinctrl-amd code is defaulting all IRQs to some
edge trigger type? This should override it and reconfigure it to
a level trigger type.
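
The effect of that hack on the flags passed to request_threaded_irq() can be
modelled in userspace as below. This is a sketch, not the driver code: the
functions stock_irqflags() and hacked_irqflags() are made up for this
example, and the IRQF_* values are redefined locally on the assumption that
they mirror include/linux/interrupt.h.

```c
#include <assert.h>

/* Assumed to mirror include/linux/interrupt.h; redefined for userspace. */
#define IRQF_TRIGGER_FALLING	0x00000002UL
#define IRQF_TRIGGER_LOW	0x00000008UL
#define IRQF_ONESHOT		0x00002000UL

/* Stock logic: force level-low only when no trigger type is configured. */
static unsigned long stock_irqflags(unsigned int preconfigured_type)
{
	unsigned long irqflags = 0;

	if (!preconfigured_type)
		irqflags = IRQF_TRIGGER_LOW;
	return irqflags | IRQF_ONESHOT;
}

/* Debug hack: ignore any preconfigured type and always ask for level-low. */
static unsigned long hacked_irqflags(unsigned int preconfigured_type)
{
	(void)preconfigured_type;
	return IRQF_TRIGGER_LOW | IRQF_ONESHOT;
}
```

With a preconfigured falling-edge type, the stock logic passes no trigger
flag at all and the edge setting survives, whereas the hacked version always
requests level-low. If the touchpad starts working with the hack, the
preconfigured trigger type is the thing to look at.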

###

As you said, hopefully the IO-APIC code is fine. Notice that the IO-APIC
irqchip driver does not allow configuring the trigger type. I guess
this is not part of the IO-APIC spec and that the BIOS/firmware is setting
the trigger level in an IO-APIC implementation-specific way, so we had better
hope it is right. I have had the unfortunate experience of trying to debug a
wrong IO-APIC irq-pin trigger-type issue with TPMs in some Lenovo ThinkPads,
and in the end only the Lenovo BIOS team could fix this.

Regards,

Hans


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-01 20:57   ` [Linux-kernel-mentees] " Linus Walleij
@ 2020-10-02 10:59     ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-02 10:59 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans de Goede, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Thu, Oct 01, 2020 at 10:57:40PM +0200, Linus Walleij wrote:
>Sorry for top posting, but I want to page some people.
>
>I do not know anything about ACPI, but Hans de Goede is really
>good with this kind of thing and could possibly provide some
>insight.

Thank you for introducing Hans de Goede to me!

>
>Yours,
>Linus Walleij
>
>On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>> 15ARH05 which is shipped with two different touchpads, i.e., ElAN and
>> Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>> be informed of new data from the touchpad. For the Synaptics touchpad,
>> only 7 interrupts are received per second which makes the touchpad
>> completely unusable. Based on current observations, pinctrl-amd seems to
>> be the most suspicious cause.
>>
>>
>> Why do I think pinctrl-amd smells the most suspicious?
>> ======================================================
>>
>> This laptop model has the following hardware configurations specified
>> via ACPI,
>>   - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>     chip
>>
>>          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>                          )
>>                          {   // Pin list
>>                              0x0082
>>                          }
>>
>>   - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>     common interrupt output line connected to one IO-APIC's pin#7
>>
>>          Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>          {
>>              0x00000007,
>>          }
>>
>> I add some code to kernel to poll the status of the GPIO chip's pin#130
>> and IO-APIc's pin#7 every 1ms when I move my finger on the surface of
>> the Synaptics touchpad continuously for about 1s. During the process of I
>> move my finger, most of the time,
>>   - GPIO chip's pin#130: low input, interrupt unmasked
>>   - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>     have never been called by the IRQ follow controller handle_fasteoi_irq)
>>
>> So the touchpad has been generating interrupts most of the time while
>> IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>> But somehow the kernel could only get ~7 interrupts each second while
>> the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>> per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>> then there's something wrong with the GPIO interrupt controller which
>> works fine for the touchpad under Windows. Besides if I poll the touchpad
>> data based on pin#130's status, the touchpad could also work under
>> Windows.
>>
>> Ways to debug pinctrl-amd
>> =========================
>>
>> I can't find any documentation about the AMDI0030 GPIO chip except for
>> the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>> ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>> inspired me to bring back chained interrupt to see if "an interrupt storm"
>> would happen. The only change I noticed is that the interrupts arrive in
>> pairs. The time interval between two interrupts in a pair is ~0.0016s
>> but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>> Unfortunately, I don't get any insight about the GPIO interrupt
>> controller from this tweaking. I wonder if there are any other ways
>> to debug drivers/pinctrl/pinctrl-amd?
>>
>> Thank you!
>>
>>
>> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1887190
>>
>> --
>> Best regards,
>> Coiby
>>

--
Best regards,
Coiby


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02  9:40     ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-02 12:42       ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-02 12:42 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Fri, Oct 02, 2020 at 11:40:12AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/1/20 10:57 PM, Linus Walleij wrote:
>>Sorry for top posting, but I want to page some people.
>>
>>I do not know anything about ACPI, but Hans de Goede is really
>>good with this kind of things and could possibly provide some
>>insight.
>
>Thanks. Although I'm honored to be considered the go-to person
>for this kind of thing, my specialty really lies with this kind
>of issue on Intel Bay Trail and Cherry Trail SoCs. Nevertheless,
>let me take a look.

Thank you for taking time to examine this touchpad issue!

>
>>On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>>
>>>Hi,
>>>
>>>I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>>>15ARH05 which is shipped with two different touchpads, i.e., ElAN and
>>>Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>>>be informed of new data from the touchpad. For the Synaptics touchpad,
>>>only 7 interrupts are received per second which makes the touchpad
>>>completely unusable. Based on current observations, pinctrl-amd seems to
>>>be the most suspicious cause.
>>>
>>>
>>>Why do I think pinctrl-amd smells the most suspicious?
>>>======================================================
>>>
>>>This laptop model has the following hardware configurations specified
>>>via ACPI,
>>>   - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>>     chip
>>>
>>>          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>>                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>>                          )
>>>                          {   // Pin list
>>>                              0x0082
>>>                          }
>>>
>>>   - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>>     common interrupt output line connected to one IO-APIC's pin#7
>>>
>>>          Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>>          {
>>>              0x00000007,
>>>          }
>
>So these both look fine.
>
>>>I added some code to the kernel to poll the status of the GPIO chip's
>>>pin#130 and the IO-APIC's pin#7 every 1ms while I moved my finger on the
>>>surface of the Synaptics touchpad continuously for about 1s. While my
>>>finger was moving, most of the time,
>>>   - GPIO chip's pin#130: low input, interrupt unmasked
>>>   - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>>     have never been called by the IRQ flow handler handle_fasteoi_irq)
>>>
>>>So the touchpad has been generating interrupts most of the time while
>>>IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>>>But somehow the kernel could only get ~7 interrupts each second
>
>So are you seeing these 7 interrupts / second for the touchpad irq or for
>the GPIO controller's parent irq?
>
>Also, do these 7 interrupts/sec stop happening when you do not touch the
>touchpad?
>
I see these 7 interrupts / second for the GPIO controller's parent irq.
And they stop happening when I don't touch the touchpad.

>To me this sounds like the interrupt is configured as being triggered on
>a negative edge so that it only fires once when the line from the touchpad
>goes low, and for some reason 7 times a second the touchpad controller
>briefly releases the line (sorta gives up to signal the irq and then
>tries again?).
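
To make the difference between the two trigger modes concrete, here is a
rough userspace sketch in Python (not kernel code). The sample trace and
the function names are made up for illustration: the line is held low
(asserted) most of the time and briefly released seven times. An
edge-falling trigger sees only the seven high-to-low transitions, while a
level-low trigger keeps firing as long as the line stays low.

```python
def count_falling_edges(samples):
    """Count 1 -> 0 transitions, i.e. what an edge-falling trigger sees."""
    return sum(1 for prev, cur in zip(samples, samples[1:])
               if prev == 1 and cur == 0)


def count_level_low(samples, service_ticks=1):
    """Count interrupts for a level-low trigger.

    An interrupt fires whenever the line is low and the previous interrupt
    has been serviced (assumed to take service_ticks samples).
    """
    fired = 0
    busy_until = 0
    for tick, level in enumerate(samples):
        if level == 0 and tick >= busy_until:
            fired += 1
            busy_until = tick + service_ticks
    return fired


# Hypothetical trace: the line is released (1) for one sample, then held
# low (0) for nine samples; the pattern repeats seven times.
trace = ([1] + [0] * 9) * 7

if __name__ == "__main__":
    print("edge-falling:", count_falling_edges(trace))
    print("level-low:   ", count_level_low(trace))
```

With this toy trace the edge count equals the number of releases, which is
the pattern the hypothesis above would predict for a misconfigured edge
trigger.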
>
>>>while
>>>the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>>>per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>>>then there's something wrong with the GPIO interrupt controller which
>>>works fine for the touchpad under Windows. Besides if I poll the touchpad
>>>data based on pin#130's status, the touchpad could also work under
>>>Windows.
>
>I agree that this sounds like a problem with the GpioInt handling.
>
>>>Ways to debug pinctrl-amd
>>>=========================
>>>
>>>I can't find any documentation about the AMDI0030 GPIO chip except for
>>>the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>>>ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>>>inspired me to bring back chained interrupt to see if "an interrupt storm"
>>>would happen. The only change I noticed is that the interrupts arrive in
>>>pairs. The time interval between two interrupts in a pair is ~0.0016s
>>>but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>>>Unfortunately, I don't get any insight about the GPIO interrupt
>>>controller from this tweaking. I wonder if there are any other ways
>>>to debug drivers/pinctrl/pinctrl-amd?
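
For reference, the rates quoted above follow directly from the intervals.
A small Python sketch of the arithmetic (the helper names are only for
illustration):

```python
def rate_hz(interval_s):
    """Convert an interval between events (seconds) into a rate (Hz)."""
    return 1.0 / interval_s


def delivered_fraction(observed_per_s, interval_s):
    """Fraction of the expected interrupts that actually arrive."""
    return observed_per_s / rate_hz(interval_s)


if __name__ == "__main__":
    # 7.2 ms report interval of the touchpad
    print("expected rate: %.1f Hz" % rate_hz(0.0072))
    # ~0.12 s between interrupt pairs
    print("pair rate:     %.1f Hz" % rate_hz(0.12))
    # ~7 interrupts per second observed
    print("delivered:     %.0f%%" % (100 * delivered_fraction(7, 0.0072)))
```

So a 7.2ms report interval corresponds to roughly 139 interrupts per
second, and only about 5% of them are reaching the kernel.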
>
>The way I would try to debug this (with access to the hardware) is
>to try and verify the interrupt trigger (level vs edge) settings inside
>pinctrl/amd by adding a bunch of printks printing them whenever the
>relevant register bits are touched.
>
>So I'm going to guess here that these touchpads use i2c-hid, so I
>took a quick peek at the i2c-hid irq request code from
>drivers/hid/i2c-hid/i2c-hid-core.c:
>
>        unsigned long irqflags = 0;
>        int ret;
>
>        dev_dbg(&client->dev, "Requesting IRQ: %d\n", client->irq);
>
>        if (!irq_get_trigger_type(client->irq))
>                irqflags = IRQF_TRIGGER_LOW;
>
>        ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
>                                   irqflags | IRQF_ONESHOT, client->name, ihid);
>
>So this tries to preserve the pre-configured irq-type on the irq
>line and if no irq-type is set then it overrides the trigger-type
>to IRQF_TRIGGER_LOW, which means level-low.
>
>One quick hack you can try is commenting out the "if (!irq_get_trigger_type(client->irq))"
>line. I guess maybe the pinctrl-amd code is defaulting all IRQs to some
>edge trigger type? This should override it and reconfigure it to
>a level trigger type.
>
Yes, "these touchpads use i2c-hid". I have examined the configuration of
irq-type in drivers/hid/i2c-hid/i2c-hid-core.c and can confirm it's been
configured to be level-low.

$ sudo cat /sys/kernel/debug/gpio|grep -A1 pin130
260:pin130      Level trigger| Active low| interrupt is enabled| interrupt is unmasked| disable wakeup in S0i3 state| disable wakeup in S3 state|
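
When comparing several pins or several boots, it can help to turn such a
line into structured data. The following is only a sketch in Python; it
assumes nothing more than the layout of the single line shown above
(a "number:name" head followed by "|"-separated flags), and the function
name is made up.

```python
def parse_gpio_debug_line(line):
    """Split one pinctrl-amd debugfs line into its head and flag list."""
    fields = [f.strip() for f in line.split("|") if f.strip()]
    # The first field carries "260:pin130" and the first flag together.
    head, first_flag = fields[0].split(None, 1)
    number, pin = head.split(":", 1)
    return {
        "number": int(number),
        "pin": pin,
        "flags": [first_flag] + fields[1:],
    }


SAMPLE = ("260:pin130      Level trigger| Active low| interrupt is enabled| "
          "interrupt is unmasked| disable wakeup in S0i3 state| "
          "disable wakeup in S3 state|")

if __name__ == "__main__":
    info = parse_gpio_debug_line(SAMPLE)
    print(info["pin"], info["flags"])
```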

(Of course we rely on drivers/pinctrl/pinctrl-amd.c to read and interpret
data from the corresponding registers. If pinctrl-amd is returning false
reports, we can do nothing about this.)

Btw, changes made in i2c-hid have no effect because they will be overridden
by drivers/pinctrl/pinctrl-amd.c, which uses the values from the ACPI tables
instead,

static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
{

	/* Ignore the settings coming from the client and
	 * read the values from the ACPI tables
	 * while setting the trigger type
	 */

	irq_flags = irq_get_trigger_type(d->irq);
	if (irq_flags != IRQ_TYPE_NONE)
		type = irq_flags;
}
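
The effect of this excerpt can be modelled in a few lines of userspace
Python. This is only a sketch of the decision logic, not the driver code;
the IRQ_TYPE_* values are the ones from include/linux/irq.h and the
function name is made up.

```python
# Trigger type constants as defined in include/linux/irq.h
IRQ_TYPE_NONE = 0x0
IRQ_TYPE_EDGE_RISING = 0x1
IRQ_TYPE_EDGE_FALLING = 0x2
IRQ_TYPE_LEVEL_HIGH = 0x4
IRQ_TYPE_LEVEL_LOW = 0x8


def effective_type(requested, preconfigured):
    """Model of the excerpt: a type already set from the ACPI tables wins
    over whatever the client driver requests."""
    if preconfigured != IRQ_TYPE_NONE:
        return preconfigured
    return requested


if __name__ == "__main__":
    # A client asking for edge-falling still ends up with level-low
    # when ACPI already configured level-low.
    print(effective_type(IRQ_TYPE_EDGE_FALLING, IRQ_TYPE_LEVEL_LOW))
```

In other words, the client's request only matters when nothing was
configured beforehand.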


Also, with CONFIG_GENERIC_IRQ_DEBUGFS enabled, `cat /sys/kernel/debug/irq/irqs/72`
shows that irq#72 (the IRQ requested for this touchpad device) has the
expected irq-type,

$ cat /sys/kernel/debug/irq/irqs/72
handler:  handle_level_irq
device:   (null)
status:   0x00000508
             _IRQ_NOPROBE
istate:   0x00000020
             IRQS_ONESHOT
ddepth:   0
wdepth:   0
dstate:   0x00402208
             IRQ_TYPE_LEVEL_LOW
             IRQD_LEVEL
             IRQD_ACTIVATED
             IRQD_IRQ_STARTED
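
As a sanity check, the dstate value can be decoded by hand. The sketch
below is userspace Python; the bit positions are my assumption of the
layout in include/linux/irq.h for this kernel and should be verified
against the tree in use. Only the four flags that appear in the dump are
listed.

```python
# Assumed bit values (verify against include/linux/irq.h of your kernel)
DSTATE_FLAGS = [
    (0x00000008, "IRQ_TYPE_LEVEL_LOW"),
    (0x00000200, "IRQD_ACTIVATED"),
    (0x00002000, "IRQD_LEVEL"),
    (0x00400000, "IRQD_IRQ_STARTED"),
]


def decode_dstate(value):
    """Return (names of known flags set, remaining unknown bits)."""
    names = [name for bit, name in DSTATE_FLAGS if value & bit]
    known = sum(bit for bit, _ in DSTATE_FLAGS)
    return names, value & ~known


if __name__ == "__main__":
    names, rest = decode_dstate(0x00402208)
    print(names, hex(rest))
```

Under these assumptions the four flags account for every bit of
0x00402208, which agrees with the names printed by debugfs.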

>###
>
>As you said hopefully the IOApic code is fine. Notice that the ioapic
>irqchip driver does not allow configuring the trigger type.
>

Yes, unlike pinctrl-amd, arch/x86/kernel/apic/io_apic.c doesn't provide
`(struct irq_chip*)->irq_set_type`. I notice that during the setting-up of
the io-apic, all pins are configured with edge-high according to the IRQ
redirection table which can be printed out with the "apic=debug" kernel
parameter,

     .... IRQ redirection table:
     IOAPIC 0:
      pin00, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)

      pin06, enabled , edge , high, V(06), IRR(0), S(0), physical, D(00), M(0)
      pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)

Later, when I manually printed out the IRQ redirection table while
processing touchpad HID reports, pin07 (which is connected with the GPIO's
common interrupt output line) had adopted the expected configuration,

     pin07, enabled , level, low , V(07), IRR(1), S(0), physical, D(00), M(0)
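
To compare the boot-time entry with the later one mechanically, the
printed lines can be parsed. This is a sketch in Python that assumes only
the comma-separated layout visible in the lines above; the function name
and dictionary keys are made up, and the X(..) fields are kept as raw
strings without interpretation.

```python
def parse_redir_line(line):
    """Parse one 'pinNN, ...' line of the printed IRQ redirection table."""
    parts = [p.strip() for p in line.split(",")]
    entry = {
        "pin": parts[0],
        "enabled": parts[1] == "enabled",
        "trigger": parts[2],
        "polarity": parts[3],
    }
    for part in parts[4:]:
        if "(" in part and part.endswith(")"):
            key, val = part[:-1].split("(", 1)
            entry[key] = val          # e.g. "V" -> "07", "IRR" -> "1"
        else:
            entry["dest_mode"] = part  # e.g. "physical"
    return entry


BOOT = "pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)"
LATER = "pin07, enabled , level, low , V(07), IRR(1), S(0), physical, D(00), M(0)"

if __name__ == "__main__":
    before, after = parse_redir_line(BOOT), parse_redir_line(LATER)
    changed = {k: (before[k], after[k]) for k in after if before[k] != after[k]}
    print(changed)
```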

Today I played with the "noapic" kernel parameter to use PIC mode
so that we could check whether anything is wrong with the io-apic.
Unfortunately, the I2C adapter couldn't be set up (the error is
"controller timed out"). As a consequence, the touchpad as an I2C client
won't work either.

And I can't find a way to disable APIC for Windows either.

>I guess
>this is not part of the ioapic spec and that the BIOS/firmware is setting
>the triggerlevel in a io-apic implementation specific way, so we better hope
>it is right. I have had the unfortunate experience to try and debug a wrong
>io-apic irq-pin trigger-type issue with TPMs in some Lenovo thinkpads and
>in the end only the Lenovo BIOS team could fix this.

If the same BIOS/firmware is setting the trigger level in a wrong way,
shouldn't we find the same issue under Windows? Btw, I've set
'acpi_osi="Windows 2015"'
as the kernel parameter before but I didn't notice any change.

>Regards,
>
>Hans
>

--
Best regards,
Coiby


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02 12:42       ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-02 13:36         ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-02 13:36 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/2/20 2:42 PM, Coiby Xu wrote:
> On Fri, Oct 02, 2020 at 11:40:12AM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 10/1/20 10:57 PM, Linus Walleij wrote:
>>> Sorry for top posting, but I want to page some people.
>>>
>>> I do not know anything about ACPI, but Hans de Goede is really
>>> good with this kind of things and could possibly provide some
>>> insight.
>>
>> Thanks. Although I'm honored to be considered the go-to person
>> for this kind of thing, my specialty really lies with this kind
>> of issue on Intel Bay Trail and Cherry Trail SoCs. Nevertheless,
>> let me take a look.
> 
> Thank you for taking time to examine this touchpad issue!
> 
>>
>>> On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>>>> 15ARH05 which is shipped with two different touchpads, i.e., ElAN and
>>>> Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>>>> be informed of new data from the touchpad. For the Synaptics touchpad,
>>>> only 7 interrupts are received per second which makes the touchpad
>>>> completely unusable. Based on current observations, pinctrl-amd seems to
>>>> be the most suspicious cause.
>>>>
>>>>
>>>> Why do I think pinctrl-amd smells the most suspicious?
>>>> ======================================================
>>>>
>>>> This laptop model has the following hardware configurations specified
>>>> via ACPI,
>>>>   - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>>>     chip
>>>>
>>>>          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>>>                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>>>                          )
>>>>                          {   // Pin list
>>>>                              0x0082
>>>>                          }
>>>>
>>>>   - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>>>     common interrupt output line connected to one IO-APIC's pin#7
>>>>
>>>>          Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>>>          {
>>>>              0x00000007,
>>>>          }
>>
>> So these both look fine.
>>
>>>> I added some code to the kernel to poll the status of the GPIO chip's
>>>> pin#130 and the IO-APIC's pin#7 every 1ms while I moved my finger on the
>>>> surface of the Synaptics touchpad continuously for about 1s. While my
>>>> finger was moving, most of the time,
>>>>   - GPIO chip's pin#130: low input, interrupt unmasked
>>>>   - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>>>     have never been called by the IRQ flow handler handle_fasteoi_irq)
>>>>
>>>> So the touchpad has been generating interrupts most of the time while
>>>> IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>>>> But somehow the kernel could only get ~7 interrupts each second
>>
>> So are you seeing these 7 interrupts / second for the touchpad irq or for
>> the GPIO controller's parent irq?
>>
>> Also, do these 7 interrupts/sec stop happening when you do not touch the
>> touchpad?
>>
> I see these 7 interrupts / second for the GPIO controller's parent irq.
> And they stop happening when I don't touch the touchpad.

Only from the parent irq, or also on the touchpad irq itself ?

If this only happens on the parent irq, then I would start looking at the
amd-pinctrl code which determines which of its "child" irqs to fire.
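
As a rough illustration of what I mean by that, here is a userspace
Python model of a parent handler deciding which child irqs to fire. The
data layout and names are hypothetical and do not mirror the actual
register layout of the driver: a child is dispatched only if its pin is
both pending and unmasked.

```python
def pins_to_dispatch(pins):
    """Return the pins whose child irq the parent handler should fire.

    pins maps a pin number to a dict with boolean 'pending' and
    'unmasked' entries (a hypothetical stand-in for the per-pin register).
    """
    return sorted(pin for pin, state in pins.items()
                  if state["pending"] and state["unmasked"])


if __name__ == "__main__":
    snapshot = {
        0: {"pending": False, "unmasked": True},
        61: {"pending": True, "unmasked": False},   # pending but masked
        130: {"pending": True, "unmasked": True},   # the touchpad pin
    }
    print(pins_to_dispatch(snapshot))
```

If the parent irq fires but the real code's equivalent of this check
never selects pin 130, that is where I would look first.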

>> To me this sounds like the interrupt is configured as being triggered on
>> a negative edge so that it only fires once when the line from the touchpad
>> goes low, and for some reason 7 times a second the touchpad controller
>> briefly releases the line (sorta gives up to signal the irq and then
>> tries again?).
>>
>>>> while
>>>> the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>>>> per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>>>> then there's something wrong with the GPIO interrupt controller which
>>>> works fine for the touchpad under Windows. Besides if I poll the touchpad
>>>> data based on pin#130's status, the touchpad could also work under
>>>> Windows.
>>
>> I agree that this sounds like a problem with the GpioInt handling.
>>
>>>> Ways to debug pinctrl-amd
>>>> =========================
>>>>
>>>> I can't find any documentation about the AMDI0030 GPIO chip except for
>>>> the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>>>> ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>>>> inspired me to bring back chained interrupt to see if "an interrupt storm"
>>>> would happen. The only change I noticed is that the interrupts arrive in
>>>> pairs. The time interval between two interrupts in a pair is ~0.0016s
>>>> but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>>>> Unfortunately, I don't get any insight about the GPIO interrupt
>>>> controller from this tweaking. I wonder if there are any other ways
>>>> to debug drivers/pinctrl/pinctrl-amd?
>>
>> The way I would try to debug this (with access to the hardware) is
>> to try and verify the interrupt trigger (level vs edge) settings inside
>> pinctrl/amd by adding a bunch of printks printing them whenever the
>> relevant register bits are touched.
>>
>> So I'm going to guess here that these touchpads use i2c-hid, so I
>> took a quick peek at the i2c-hid irq request code from
>> drivers/hid/i2c-hid/i2c-hid-core.c:
>>
>>        unsigned long irqflags = 0;
>>        int ret;
>>
>>        dev_dbg(&client->dev, "Requesting IRQ: %d\n", client->irq);
>>
>>        if (!irq_get_trigger_type(client->irq))
>>                irqflags = IRQF_TRIGGER_LOW;
>>
>>        ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
>>                                   irqflags | IRQF_ONESHOT, client->name, ihid);
>>
>> So this tries to preserve the pre-configured irq-type on the irq
>> line and if no irq-type is set then it overrides the trigger-type
>> to IRQF_TRIGGER_LOW, which means level-low.
>>
>> One quick hack you can try is commenting out the "if (!irq_get_trigger_type(client->irq))"
>> line. I guess maybe the pinctrl-amd code is defaulting all IRQs to some
>> edge trigger type? This should override it and reconfigure it to
>> a level trigger type.
>>
> Yes, "these touchpads use i2c-hid". I have examined the configuration of
> irq-type in drivers/hid/i2c-hid/i2c-hid-core.c and can confirm it's been
> configured to be level-low.
> 
> $ sudo cat /sys/kernel/debug/gpio|grep -A1 pin130
> 260:pin130      Level trigger| Active low| interrupt is enabled| interrupt is unmasked| disable wakeup in S0i3 state| disable wakeup in S3 state|
> 
> (Of course we rely on drivers/pinctrl/pinctrl-amd.c to read and interpret
> data from the corresponding registers. If pinctrl-amd is returning false
> reports, we can do nothing about this.)

Well you could review the code printing this vs say the code setting
the trigger type. If those don't match then something is definitely
wrong somewhere.
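
A minimal userspace sketch of that cross-check, not the driver code itself:
it encodes a trigger type into a pin register value the way a setter would,
decodes it the way a debugfs printer would, and checks that both agree. The
bit positions (LEVEL_TRIG_OFF = 8, ACTIVE_LEVEL_OFF = 9) are my recollection
of drivers/pinctrl/pinctrl-amd.h and should be treated as assumptions to
verify against your tree; the struct and function names are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit layout recalled from drivers/pinctrl/pinctrl-amd.h,
 * verify against the kernel tree you are testing. */
#define LEVEL_TRIG_OFF    8
#define ACTIVE_LEVEL_OFF  9
#define ACTIVE_LEVEL_MASK 0x3u
#define ACTIVE_HIGH       0x0u
#define ACTIVE_LOW        0x1u

/* Hypothetical container for the two fields we care about. */
struct trig_cfg {
	bool level;      /* true = level trigger, false = edge trigger */
	uint32_t active; /* ACTIVE_HIGH or ACTIVE_LOW */
};

/* "Setter" side: write the trigger fields, keep all other bits. */
uint32_t encode_trigger(uint32_t reg, struct trig_cfg cfg)
{
	assert(cfg.active <= ACTIVE_LEVEL_MASK);
	reg &= ~(1u << LEVEL_TRIG_OFF);
	reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
	reg |= (cfg.level ? 1u : 0u) << LEVEL_TRIG_OFF;
	reg |= cfg.active << ACTIVE_LEVEL_OFF;
	return reg;
}

/* "Printer" side: read the same fields back out of the register. */
struct trig_cfg decode_trigger(uint32_t reg)
{
	struct trig_cfg cfg;

	cfg.level = ((reg >> LEVEL_TRIG_OFF) & 1u) != 0;
	cfg.active = (reg >> ACTIVE_LEVEL_OFF) & ACTIVE_LEVEL_MASK;
	return cfg;
}

/* True when what the printer reports matches what the setter wrote. */
bool trigger_round_trips(uint32_t reg, struct trig_cfg cfg)
{
	struct trig_cfg got = decode_trigger(encode_trigger(reg, cfg));

	return got.level == cfg.level && got.active == cfg.active;
}
```

If the real setter and the real printer disagree on any offset or mask, the
equivalent of trigger_round_trips() fails for level-low, and that would be
the "something is definitely wrong" case.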

> Btw, any change we make in i2c-hid will be overridden
> by drivers/pinctrl/pinctrl-amd.c, which uses the values from the ACPI tables
> instead,
> 
> static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
> {
> 
>      /* Ignore the settings coming from the client and
>       * read the values from the ACPI tables
>       * while setting the trigger type
>       */
> 
>      irq_flags = irq_get_trigger_type(d->irq);
>      if (irq_flags != IRQ_TYPE_NONE)
>          type = irq_flags;
> }

That looks a bit fishy. Sometimes we need to override the irq-type from
a driver because the ACPI tables of various devices are often of
dubious quality. AFAIK none of the Intel GPIO drivers do something like
this...

Also I'm not seeing this in the latest upstream code, so I guess this
bit got recently dropped ... ?

What kernel version are you testing with? You really should always test
things like this with Linus' latest master branch.

Hmm, I wonder if this is not an i2c-controller issue instead. But you said
that modifying the i2c-hid code to poll the GPIO and then run its
threaded-irq handler on a successful poll works around things, right?

Still it would be interesting to add a printk to the begin + end of the
i2c-hid threaded-irq-handler to see how long it takes to run.
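
A rough sketch of how the resulting timestamps could be evaluated, assuming
you log a begin and an end time in microseconds for each handler run (the
function names and the microsecond unit are my choice, not anything from
i2c-hid). It finds the slowest run and compares it against the touchpad's
report period, which you gave as 7.2 ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Longest handler run in microseconds, from paired begin/end timestamps. */
uint64_t max_handler_duration_us(const uint64_t *begin_us,
				 const uint64_t *end_us, size_t n)
{
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t d;

		assert(end_us[i] >= begin_us[i]); /* end must not precede begin */
		d = end_us[i] - begin_us[i];
		if (d > max)
			max = d;
	}
	return max;
}

/* The handler keeps up if its slowest run fits within one report period. */
bool handler_keeps_up(uint64_t max_duration_us, uint64_t period_us)
{
	return max_duration_us <= period_us;
}
```

If the slowest run stays well below 7200 us, the handler itself is unlikely
to be what limits the interrupt rate.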

Regards,

Hans



> Also, with CONFIG_GENERIC_IRQ_DEBUGFS enabled, `cat /sys/kernel/debug/irq/irqs/72`
> also shows that irq#72 (the IRQ requested for this touchpad device) has the
> expected irq-type,
> 
> $ cat /sys/kernel/debug/irq/irqs/72
> handler:  handle_level_irq
> device:   (null)
> status:   0x00000508
>              _IRQ_NOPROBE
> istate:   0x00000020
>              IRQS_ONESHOT
> ddepth:   0
> wdepth:   0
> dstate:   0x00402208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_LEVEL
>              IRQD_ACTIVATED
>              IRQD_IRQ_STARTED
> 
>> ###
>>
>> As you said hopefully the IOApic code is fine. Notice that the ioapic
>> irqchip driver does not allow configuring the trigger type.
>>
> 
> Yes. Unlike pinctrl-amd, arch/x86/kernel/apic/io_apic.c doesn't provide
> `(struct irq_chip*)->irq_set_type`. I noticed that during the setup of
> io-apic, all pins are configured as edge-high according to the IRQ
> redirection table, which can be printed out with the "apic=debug" kernel
> parameter,
> 
>      .... IRQ redirection table:
>      IOAPIC 0:
>       pin00, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
> 
>       pin06, enabled , edge , high, V(06), IRR(0), S(0), physical, D(00), M(0)
>       pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
> 
> Later, I manually printed out the IRQ redirection table when processing
> touchpad HID reports, pin07 (which is connected with the GPIO's common
> interrupt output line) has adopted the expected configuration,
> 
>      pin07, enabled , level, low , V(07), IRR(1), S(0), physical, D(00), M(0)
> 
> Today I played with the "noapic" kernel parameter to use PIC mode
> so that we could confirm there is nothing wrong with io-apic. Unfortunately
> the I2C adapter can't be set up (the error is "controller timed out").
> As a consequence, the touchpad as an I2C client won't work either.
> 
> And I can't find a way to disable APIC for Windows either.
> 
>> I guess
>> this is not part of the ioapic spec and that the BIOS/firmware is setting
>> the triggerlevel in a io-apic implementation specific way, so we better hope
>> it is right. I have had the unfortunate experience to try and debug a wrong
>> io-apic irq-pin trigger-type issue with TPMs in some Lenovo thinkpads and
>> in the end only the Lenovo BIOS team could fix this.
> 
> If the same BIOS/firmware is setting the trigger level in a wrong way,
> shouldn't we find the same issue under Windows? Btw, I've set
> 'acpi_osi="Windows 2015"'
> as the kernel parameter before but I didn't notice any change.
> 
>> Regards,
>>
>> Hans
>>
> 
> -- 
> Best regards,
> Coiby
> 


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02 13:36         ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-02 14:51           ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-02 14:51 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/2/20 2:42 PM, Coiby Xu wrote:
>>On Fri, Oct 02, 2020 at 11:40:12AM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/1/20 10:57 PM, Linus Walleij wrote:
>>>>Sorry for top posting, but I want to page some people.
>>>>
>>>>I do not know anything about ACPI, but Hans de Goede is really
>>>>good with this kind of things and could possibly provide some
>>>>insight.
>>>
>>>Thanks, although I'm honored to be considered the go-to person
>>>for these kinda things, my specialty really lies with these
>>>kinda issues with Intel Bay Trail and Cherry Trail SoCs.
>>>Nevertheless, let me take a look.
>>
>>Thank you for taking time to examine this touchpad issue!
>>
>>>
>>>>On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>>>>
>>>>>Hi,
>>>>>
>>>>>I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>>>>>15ARH05 which is shipped with two different touchpads, i.e., ELAN and
>>>>>Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>>>>>be informed of new data from the touchpad. For the Synaptics touchpad,
>>>>>only 7 interrupts are received per second which makes the touchpad
>>>>>completely unusable. Based on current observations, pinctrl-amd seems to
>>>>>be the most suspicious cause.
>>>>>
>>>>>
>>>>>Why do I think pinctrl-amd smells the most suspicious?
>>>>>======================================================
>>>>>
>>>>>This laptop model has the following hardware configurations specified
>>>>>via ACPI,
>>>>>  - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>>>>    chip
>>>>>
>>>>>         GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>>>>                         "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>>>>                         )
>>>>>                         {   // Pin list
>>>>>                             0x0082
>>>>>                         }
>>>>>
>>>>>  - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>>>>    common interrupt output line connected to one IO-APIC's pin#7
>>>>>
>>>>>         Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>>>>         {
>>>>>             0x00000007,
>>>>>         }
>>>
>>>So these both look fine.
>>>
>>>>>I added some code to the kernel to poll the status of the GPIO chip's pin#130
>>>>>and IO-APIC's pin#7 every 1ms while I moved my finger on the surface of
>>>>>the Synaptics touchpad continuously for about 1s. While I was moving
>>>>>my finger, most of the time,
>>>>>  - GPIO chip's pin#130: low input, interrupt unmasked
>>>>>  - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>>>>    have never been called by the IRQ flow handler handle_fasteoi_irq)
>>>>>
>>>>>So the touchpad has been generating interrupts most of the time while
>>>>>IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>>>>>But somehow the kernel could only get ~7 interrupts each second
>>>
>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>the GPIO controller's parent irq ?
>>>
>>>Also do these 7 interrupts/sec stop happening when you do not touch the
>>>touchpad ?
>>>
>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>And they stop happening when I don't touch the touchpad.
>
>Only from the parent irq, or also on the touchpad irq itself ?
>
>If this only happens on the parent irq, then I would start looking at the
>amd-pinctrl code which determines which of its "child" irqs to fire.

This only happens on the parent irq. The input's pin#130 of the GPIO
chip is low most of the time and pin#130.
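
To make the mismatch concrete, here is a small userspace model (not kernel
code; the function names and the sampling model are my own assumptions).
Given the pin level sampled at a fixed period, it counts how often a
falling-edge trigger would fire versus a level-low trigger whose handler
occupies a fixed number of samples per run.

```c
#include <assert.h>
#include <stddef.h>

/* samples[i] is the sampled line level: 0 = low (asserted), 1 = high. */

/* Falling-edge trigger: fires once per high-to-low transition. */
size_t count_falling_edges(const int *samples, size_t n)
{
	size_t count = 0;

	for (size_t i = 1; i < n; i++)
		if (samples[i - 1] == 1 && samples[i] == 0)
			count++;
	return count;
}

/* Level-low trigger: fires whenever the line is low and the handler is
 * idle. Each handler run is assumed to take service_ticks samples, during
 * which the interrupt stays masked. */
size_t count_level_low_fires(const int *samples, size_t n,
			     size_t service_ticks)
{
	size_t count = 0;
	size_t i = 0;

	assert(service_ticks >= 1);
	while (i < n) {
		if (samples[i] == 0) {
			count++;
			i += service_ticks; /* masked while the handler runs */
		} else {
			i++;
		}
	}
	return count;
}
```

On a line that is low most of the time, the level-low count grows with the
time spent low, while the falling-edge count stays at the number of
high-to-low transitions. Getting only ~7 interrupts per second looks more
like the second behaviour, even though the pin is reported as
level-triggered.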
>
>>>To me this sounds like the interrupt is configured as being triggered on
>>>a negative edge so that it only fires once when the line from the touchpad
>>>goes low, and for some reason 7 times a second the touchpad controller
>>>briefly releases the line (sorta gives up to signal the irq and then
>>>tries again?).
>>>
>>>>>while
>>>>>the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>>>>>per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>>>>>then there's something wrong with the GPIO interrupt controller which
>>>>>works fine for the touchpad under Windows. Besides if I poll the touchpad
>>>>>data based on pin#130's status, the touchpad could also work under
>>>>>Windows.
>>>
>>>I agree that this sounds like a problem with the GpioInt handling.
>>>
>>>>>Ways to debug pinctrl-amd
>>>>>=========================
>>>>>
>>>>>I can't find any documentation about the AMDI0030 GPIO chip except for
>>>>>the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>>>>>ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>>>>>inspired me to bring back chained interrupt to see if "an interrupt storm"
>>>>>would happen. The only change I noticed is that the interrupts arrive in
>>>>>pairs. The time interval between two interrupts in a pair is ~0.0016s
>>>>>but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>>>>>Unfortunately, I don't get any insight about the GPIO interrupt
>>>>>controller from this tweaking. I wonder if there are any other ways
>>>>>to debug drivers/pinctrl/pinctrl-amd?
>>>
>>>The way I would try to debug this (with access to the hardware) is
>>>to try and verify the interrupt trigger (level vs edge) settings inside
>>>pinctrl/amd by adding a bunch of printks printing them whenever the
>>>relevant register bits are touched.
>>>
>>>So I'm going to guess here that these touchpads use i2c-hid, so I
>>>took a quick peek at the i2c-hid irq request code from
>>>drivers/hid/i2c-hid/i2c-hid-core.c:
>>>
>>>       unsigned long irqflags = 0;
>>>       int ret;
>>>
>>>       dev_dbg(&client->dev, "Requesting IRQ: %d\n", client->irq);
>>>
>>>       if (!irq_get_trigger_type(client->irq))
>>>               irqflags = IRQF_TRIGGER_LOW;
>>>
>>>       ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
>>>                                  irqflags | IRQF_ONESHOT, client->name, ihid);
>>>
>>>So this tries to preserve the pre-configured irq-type on the irq
>>>line and if no irq-type is set then it overrides the trigger-type
>>>to IRQF_TRIGGER_LOW, which means level-low.
>>>
>>>One quick hack you can try is commenting out the "if (!irq_get_trigger_type(client->irq))"
>>>line. I guess maybe the pinctrl-amd code is defaulting all IRQs to some
>>>edge trigger type? This should override it and reconfigure it to
>>>a level trigger type.
>>>
>>Yes, "these touchpads use i2c-hid". I have examined the configuration of
>>irq-type in drivers/hid/i2c-hid/i2c-hid-core.c and can confirm it's been
>>configured to be level-low.
>>
>>$ sudo cat /sys/kernel/debug/gpio|grep -A1 pin130
>>260:pin130      Level trigger| Active low| interrupt is enabled| interrupt is unmasked| disable wakeup in S0i3 state| disable wakeup in S3 state|
>>
>>(Of course we rely on drivers/pinctrl/pinctrl-amd.c to read and interpret
>>data from the corresponding registers. If pinctrl-amd is returning false
>>reports, we can do nothing about this.)
>
>Well you could review the code printing this vs say the code setting
>the trigger type. If those don't match then something is definitely
>wrong somewhere.
>
Thank you for the suggestion! I just did a review and didn't find
anything suspicious. Earlier, I thought I would need the hardware specs to
confirm that the code follows them, but I can't find any
documentation. Also, pinctrl-amd has proven to work on other
laptop models, although several touchpad issues caused by
pinctrl-amd have had to be fixed. So I assume there's nothing
wrong with basic functionality like setting the interrupt trigger
type.

>>Btw, any change we make in i2c-hid will be overridden
>>by drivers/pinctrl/pinctrl-amd.c, which uses the values from the ACPI tables
>>instead,
>>
>>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>{
>>
>>     /* Ignore the settings coming from the client and
>>      * read the values from the ACPI tables
>>      * while setting the trigger type
>>      */
>>
>>     irq_flags = irq_get_trigger_type(d->irq);
>>     if (irq_flags != IRQ_TYPE_NONE)
>>         type = irq_flags;
>>}
>
>That looks a bit fishy. Sometimes we need to override the irq-type from
>a driver because the ACPI tables of various devices are often of
>dubious quality. AFAIK none of the Intel GPIO drivers do something like
>this...
>
>Also I'm not seeing this in the latest upstream code, so I guess this
>bit got recently dropped ... ?
>
>What kernel version are you testing with? You really should always test
>things like this with Linus' latest master branch.
>

Sorry for the confusion! I use 5.7.4 for testing, which was the latest
version when I got this laptop. And the code overriding
the irq-type was indeed removed on Jun 26.

I have been sticking with 5.7.4 because some users who also own this
laptop have been actively reporting results with the latest kernel
and I occasionally test it myself (for example, today I checked 5.9 rc6).
From now on I will use the latest kernel to reduce the communication cost.

>Hmm, I wonder if this is not an i2c-controller issue instead. But you said
>that modifying the i2c-hid code to poll the GPIO and then run its
>threaded-irq handler on a successful poll works around things, right?
>
>Still it would be interesting to add a printk to the begin + end of the
>i2c-hid threaded-irq-handler to see how long it takes to run.
>
Yes. Polling the touchpad based on pin#130's status makes the
touchpad work, which has been confirmed by another affected user.
I had already examined i2c-controller, i2c-hid and hid-multitouch
before focusing on pinctrl-amd. The i2c-hid threaded-irq-handler
can process ~500 interrupts at maximum. Based on this evidence
(for the details, please check https://www.spinics.net/lists/linux-input/msg69267.html),
I think I can move on to examine pinctrl-amd.
>
>Regards,
>
>Hans
>
>
>
>>Also, with CONFIG_GENERIC_IRQ_DEBUGFS enabled, `cat /sys/kernel/debug/irq/irqs/72`
>>also shows that irq#72 (the IRQ requested for this touchpad device) has the
>>expected irq-type,
>>
>>$ cat /sys/kernel/debug/irq/irqs/72
>>handler:  handle_level_irq
>>device:   (null)
>>status:   0x00000508
>>             _IRQ_NOPROBE
>>istate:   0x00000020
>>             IRQS_ONESHOT
>>ddepth:   0
>>wdepth:   0
>>dstate:   0x00402208
>>             IRQ_TYPE_LEVEL_LOW
>>             IRQD_LEVEL
>>             IRQD_ACTIVATED
>>             IRQD_IRQ_STARTED
>>
>>>###
>>>
>>>As you said hopefully the IOApic code is fine. Notice that the ioapic
>>>irqchip driver does not allow configuring the trigger type.
>>>
>>
>>Yes. Unlike pinctrl-amd, arch/x86/kernel/apic/io_apic.c doesn't provide
>>`(struct irq_chip*)->irq_set_type`. I noticed that during the setup of
>>io-apic, all pins are configured as edge-high according to the IRQ
>>redirection table, which can be printed out with the "apic=debug" kernel
>>parameter,
>>
>>     .... IRQ redirection table:
>>     IOAPIC 0:
>>      pin00, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
>>
>>      pin06, enabled , edge , high, V(06), IRR(0), S(0), physical, D(00), M(0)
>>      pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
>>
>>Later, I manually printed out the IRQ redirection table when processing
>>touchpad HID reports, pin07 (which is connected with the GPIO's common
>>interrupt output line) has adopted the expected configuration,
>>
>>     pin07, enabled , level, low , V(07), IRR(1), S(0), physical, D(00), M(0)
>>
>>Today I played with the "noapic" kernel parameter to use PIC mode
>>so that we could confirm there is nothing wrong with io-apic. Unfortunately
>>the I2C adapter can't be set up (the error is "controller timed out").
>>As a consequence, the touchpad as an I2C client won't work either.
>>
>>And I can't find a way to disable APIC for Windows either.
>>
>>>I guess
>>>this is not part of the ioapic spec and that the BIOS/firmware is setting
>>>the triggerlevel in a io-apic implementation specific way, so we better hope
>>>it is right. I have had the unfortunate experience to try and debug a wrong
>>>io-apic irq-pin trigger-type issue with TPMs in some Lenovo thinkpads and
>>>in the end only the Lenovo BIOS team could fix this.
>>
>>If the same BIOS/firmware is setting the trigger level in a wrong way,
>>shouldn't we find the same issue under Windows? Btw, I've set
>>'acpi_osi="Windows 2015"'
>>as the kernel parameter before but I didn't notice any change.
>>
>>>Regards,
>>>
>>>Hans
>>>
>>
>>--
>>Best regards,
>>Coiby
>>
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-02 14:51           ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-02 14:51 UTC (permalink / raw)
  To: Hans de Goede
  Cc: open list:GPIO SUBSYSTEM, Linus Walleij, Shyam Sundar S K,
	Nehal Shah, linux-kernel-mentees

On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/2/20 2:42 PM, Coiby Xu wrote:
>>On Fri, Oct 02, 2020 at 11:40:12AM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/1/20 10:57 PM, Linus Walleij wrote:
>>>>Sorry for top posting, but I want to page some people.
>>>>
>>>>I do not know anything about ACPI, but Hans de Goede is really
>>>>good with this kind of things and could possibly provide some
>>>>insight.
>>>
>>>Thanks, although I'm honored to be considered the go to person
>>>for these kinda things my specialty really lies with these
>>>kinda issues with intel Bay Trail and Cherry Trail SoCs
>>>never the less let me take a look.
>>
>>Thank you for taking time to examine this touchpad issue!
>>
>>>
>>>>On Thu, Oct 1, 2020 at 3:23 PM Coiby Xu <coiby.xu@gmail.com> wrote:
>>>>>
>>>>>Hi,
>>>>>
>>>>>I'm trying to fix broken touchpads [1] for a new laptop model Legion-5
>>>>>15ARH05 which is shipped with two different touchpads, i.e., ElAN and
>>>>>Synaptics. For the ELAN touchpad, the kernel receives no interrupts to
>>>>>be informed of new data from the touchpad. For the Synaptics touchpad,
>>>>>only 7 interrupts are received per second which makes the touchpad
>>>>>completely unusable. Based on current observations, pinctrl-amd seems to
>>>>>be the most suspicious cause.
>>>>>
>>>>>
>>>>>Why do I think pinctrl-amd smells the most suspicious?
>>>>>======================================================
>>>>>
>>>>>This laptop model has the following hardware configurations specified
>>>>>via ACPI,
>>>>>  - The touchpad's data interrupt line is connected to pin#130 of a GPIO
>>>>>    chip
>>>>>
>>>>>         GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
>>>>>                         "\\_SB.GPIO", 0x00, ResourceConsumer, ,
>>>>>                         )
>>>>>                         {   // Pin list
>>>>>                             0x0082
>>>>>                         }
>>>>>
>>>>>  - This GPIO chip (HID: AMDI0030) which is assigned with IRQ#7 has its
>>>>>    common interrupt output line connected to one IO-APIC's pin#7
>>>>>
>>>>>         Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
>>>>>         {
>>>>>             0x00000007,
>>>>>         }
>>>
>>>So these both look fine.
>>>
>>>>>I added some code to the kernel to poll the status of the GPIO chip's
>>>>>pin#130 and the IO-APIC's pin#7 every 1ms while I moved my finger
>>>>>continuously across the surface of the Synaptics touchpad for about 1s.
>>>>>While my finger was moving, most of the time,
>>>>>  - GPIO chip's pin#130: low input, interrupt unmasked
>>>>>  - IO-APIC's pin#7: IRR=0, interrupt unmasked (in fact mask/unmask_ioapic_irq
>>>>>    have never been called by the IRQ flow handler handle_fasteoi_irq)
>>>>>
>>>>>So the touchpad has been generating interrupts most of the time while
>>>>>IO-APIC controller hasn't been masking the interrupt from the GPIO chip.
>>>>>But somehow the kernel could only get ~7 interrupts each second
>>>
>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>the GPIO controllers parent irq ?
>>>
>>>Also do these 7 interrupts/sec stop happening when you do not touch the
>>>touchpad ?
>>>
>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>And they stop happening when I don't touch the touchpad.
>
>Only from the parent irq, or also on the touchpad irq itself ?
>
>If this only happens on the parent irq, then I would start looking at the
>amd-pinctrl code which determines which of its "child" irqs to fire.

This only happens on the parent irq. The input of the GPIO chip's
pin#130 is low most of the time.
>
>>>To me this sounds like the interrupt is configured as being triggered on
>>>a negative edge so that it only fires once when the line from the touchpad
>>>goes low, and for some reason 7 times a second the touchpad controller
>>>briefly releases the line (sorta gives up to signal the irq and then
>>>tries again?).
>>>
>>>>>while
>>>>>the touchpad could generate 140 interrupts (time resolution of 7.2ms)
>>>>>per second. Assuming IO-APIC (arch/x86/kernel/apic/io_apic.c) is fine,
>>>>>then there's something wrong with the GPIO interrupt controller which
>>>>>works fine for the touchpad under Windows. Besides if I poll the touchpad
>>>>>data based on pin#130's status, the touchpad could also work under
>>>>>Windows.
>>>
>>>I agree that this sounds like a problem with the GpioInt handling.
>>>
>>>>>Ways to debug pinctrl-amd
>>>>>=========================
>>>>>
>>>>>I can't find any documentation about the AMDI0030 GPIO chip except for
>>>>>the commit logs of drivers/pinctrl/pinctrl-amd. One commit
>>>>>ba714a9c1dea85e0bf2899d02dfeb9c70040427c ("pinctrl/amd: Use regular interrupt instead of chained")
>>>>>inspired me to bring back chained interrupt to see if "an interrupt storm"
>>>>>would happen. The only change I noticed is that the interrupts arrive in
>>>>>pairs. The time interval between two interrupts in a pair is ~0.0016s
>>>>>but the time interval between interrupt pairs is still ~0.12s (~8Hz).
>>>>>Unfortunately, I don't get any insight about the GPIO interrupt
>>>>>controller from this tweaking. I wonder if there are any other ways
>>>>>to debug drivers/pinctrl/pinctrl-amd?
>>>
>>>The way I would try to debug this (with access to the hardware) is
>>>to try and verify the interrupt trigger (level vs edge) settings inside
>>>pinctrl/amd by adding a bunch of printks printing them whenever the
>>>relevant register bits are touched.
>>>
>>>So I'm going to guess here that these touchpads use i2c-hid, so I
>>>took a quick peek at the i2c-hid irq request code from
>>>drivers/hid/i2c-hid/i2c-hid-core.c:
>>>
>>>       unsigned long irqflags = 0;
>>>       int ret;
>>>
>>>       dev_dbg(&client->dev, "Requesting IRQ: %d\n", client->irq);
>>>
>>>       if (!irq_get_trigger_type(client->irq))
>>>               irqflags = IRQF_TRIGGER_LOW;
>>>
>>>       ret = request_threaded_irq(client->irq, NULL, i2c_hid_irq,
>>>                                  irqflags | IRQF_ONESHOT, client->name, ihid);
>>>
>>>So this tries to preserve the pre-configured irq-type on the irq
>>>line and if no irq-type is set then it overrides the trigger-type
>>>to IRQF_TRIGGER_LOW, which means level-low.
>>>
>>>One quick hack you can try is commenting out the "if (!irq_get_trigger_type(client->irq))"
>>>line. I guess maybe the pinctrl-amd code is defaulting all IRQs to some
>>>edge trigger type? This should override it and reconfigure it to
>>>a level trigger type.
>>>
>>Yes, "these touchpads use i2c-hid". I have examined the configuration of
>>irq-type in drivers/hid/i2c-hid/i2c-hid-core.c and can confirm it's been
>>configured to be level-low.
>>
>>$ sudo cat /sys/kernel/debug/gpio|grep -A1 pin130
>>260:pin130      Level trigger| Active low| interrupt is enabled| interrupt is unmasked| disable wakeup in S0i3 state| disable wakeup in S3 state|
>>
>>(Of course we rely on drivers/pinctrl/pinctrl-amd.c to read and interpret
>>data from the corresponding registers. If pinctrl-amd is returning false
>>reports, we can do nothing about this)
>
>Well you could review the code printing this vs say the code setting
>the trigger type. If those don't match then something is definitely
>wrong somewhere.
>
Thank you for the suggestion! I just did a review and didn't find
anything suspicious. Before, I thought I needed some hardware specs to
confirm the code is written following the specs but I can't find any
documentation. And pinctrl-amd has proven to work for other
laptop models, although there were several touchpad issues caused by
pinctrl-amd which have been fixed. So I assume there's nothing
wrong with basic functionality like setting the interrupt trigger
type.

>>Btw, we can't make any change in i2c-hid because it will be overridden
>>by drivers/pinctrl/pinctrl-amd.c which uses the values from the ACPI tables
>>instead,
>>
>>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>{
>>
>>     /* Ignore the settings coming from the client and
>>      * read the values from the ACPI tables
>>      * while setting the trigger type
>>      */
>>
>>     irq_flags = irq_get_trigger_type(d->irq);
>>     if (irq_flags != IRQ_TYPE_NONE)
>>         type = irq_flags;
>>}
>
>That looks a bit fishy, sometimes we need to override the irq-type from
>a driver because the ACPI tables of various devices are often of
>dubious quality. AFAIK none of the Intel GPIO drivers do something like
>this...
>
>Also I'm not seeing this in the latest upstream code, so I guess this
>bit got recently dropped ... ?
>
>What kernel version are you testing with? You really should always test
>things like this with Linus' latest master branch.
>

Sorry for the confusion! I use 5.7.4 for testing, which was the latest
version when I got this laptop. And the code that overrides
the irq-type was indeed removed on Jun 26.

I have been sticking with 5.7.4 because some users who also own this
laptop have been actively reporting their results with the latest kernel
and I occasionally test it myself (for example, today I checked 5.9-rc6).
I will use the latest kernel from now on to reduce the communication cost.

>Hmm, I wonder if this is not an i2c-controller issue instead. But you said
>that modifying the i2c-hid code to poll the GPIO and then run its
>threaded-irq handler on a successful poll instead works around things, right ?
>
>Still it would be interesting to add a printk to the begin + end of the
>i2c-hid threaded-irq-handler to see how long it takes to run.
>
Yes. Polling the touchpad based on pin#130's status makes the
touchpad work, which has been confirmed by another affected user.
I have already examined i2c-controller, i2c-hid and hid-multitouch
before focusing on pinctrl-amd. The i2c-hid threaded-irq-handler
can process ~500 interrupts at maximum. Based on this evidence
(for the details, please check https://www.spinics.net/lists/linux-input/msg69267.html),
I think I can move on to examine pinctrl-amd.
>
>Regards,
>
>Hans
>
>
>
>>Also, With CONFIG_GENERIC_IRQ_DEBUGFS enabled, `cat /sys/kernel/debug/irq/irqs/72`
>>also shows irq#72 (#72 is requested IRQ of this touchpad device) has the
>>expected irq-type,
>>
>>$ cat /sys/kernel/debug/irq/irqs/72
>>handler:  handle_level_irq
>>device:   (null)
>>status:   0x00000508
>>             _IRQ_NOPROBE
>>istate:   0x00000020
>>             IRQS_ONESHOT
>>ddepth:   0
>>wdepth:   0
>>dstate:   0x00402208
>>             IRQ_TYPE_LEVEL_LOW
>>             IRQD_LEVEL
>>             IRQD_ACTIVATED
>>             IRQD_IRQ_STARTED
>>
>>>###
>>>
>>>As you said hopefully the IOApic code is fine. Notice that the ioapic
>>>irqchip driver does not allow configuring the trigger type.
>>>
>>
>>Yes. Unlike pinctrl-amd, arch/x86/kernel/apic/io_apic.c doesn't provide
>>`(struct irq_chip*)->irq_set_type`. I noticed that during the setting-up of
>>io-apic, all pins are configured as edge-high according to the IRQ
>>redirection table which can be printed out with the "apic=debug" kernel
>>parameter,
>>
>>     .... IRQ redirection table:
>>     IOAPIC 0:
>>      pin00, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
>>
>>      pin06, enabled , edge , high, V(06), IRR(0), S(0), physical, D(00), M(0)
>>      pin07, disabled, edge , high, V(00), IRR(0), S(0), physical, D(00), M(0)
>>
>>Later, I manually printed out the IRQ redirection table when processing
>>touchpad HID reports, pin07 (which is connected with the GPIO's common
>>interrupt output line) has adopted the expected configuration,
>>
>>     pin07, enabled , level, low , V(07), IRR(1), S(0), physical, D(00), M(0)
>>
>>Today I played with the "noapic" kernel parameter to use PIC mode
>>so we can confirm there is nothing wrong with io-apic. Unfortunately
>>the I2C adapter can't be set-up (the error is "controller timed out").
>>As a consequence, the touchpad as an I2C client won't work either.
>>
>>And I can't find a way to disable APIC for Windows either.
>>
>>>I guess
>>>this is not part of the ioapic spec and that the BIOS/firmware is setting
>>>the triggerlevel in a io-apic implementation specific way, so we better hope
>>>it is right. I have had the unfortunate experience to try and debug a wrong
>>>io-apic irq-pin trigger-type issue with TPMs in some Lenovo thinkpads and
>>>in the end only the Lenovo BIOS team could fix this.
>>
>>If the same BIOS/firmware is setting the trigger level in a wrong way,
>>shouldn't we find the same issue under Windows? Btw, I've set
>>'acpi_osi="Windows 2015"'
>>as the kernel parameter before but I didn't notice any change.
>>
>>>Regards,
>>>
>>>Hans
>>>
>>
>>--
>>Best regards,
>>Coiby
>>
>

--
Best regards,
Coiby
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02 14:51           ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-02 19:44             ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-02 19:44 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/2/20 4:51 PM, Coiby Xu wrote:
> On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:

<snip>

>>>> So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>> the GPIO controllers parent irq ?
>>>>
>>>> Also do these 7 interrupts/sec stop happening when you do not touch the
>>>> touchpad ?
>>>>
>>> I see these 7 interrupts / second for the GPIO controller's parent irq.
>>> And they stop happening when I don't touch the touchpad.
>>
>> Only from the parent irq, or also on the touchpad irq itself ?
>>
>> If this only happens on the parent irq, then I would start looking at the
>> amd-pinctrl code which determines which of its "child" irqs to fire.
> 
> This only happens on the parent irq. The input of the GPIO chip's
> pin#130 is low most of the time.

Right, but it is a low-level triggered IRQ, so when it is low it should
be executing the i2c-hid interrupt-handler. If it is not executing that
then it is time to look at amd-pinctrl's irq-handler and figure out why
that is not triggering the child irq handler for the touchpad.
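
To illustrate the difference between the two trigger types, here is a
minimal userspace sketch (not kernel code). It models an active-low line
sampled once per tick. The function names and the one-event-per-tick
assumption for level triggering are made up for this illustration and do
not come from pinctrl-amd.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: line[i] is the sampled level of an active-low
 * interrupt line at tick i (0 = asserted, 1 = released).
 */

/* Falling-edge trigger: one event per high-to-low transition. */
unsigned int count_falling_edge_events(const int *line, size_t n)
{
	unsigned int events = 0;
	int prev = 1; /* assumption: the line idles high before sampling */

	assert(line != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (prev == 1 && line[i] == 0)
			events++;
		prev = line[i];
	}
	return events;
}

/*
 * Level-low trigger: one event per tick while the line is low,
 * assuming the handler finishes within a tick and the device keeps
 * the line asserted because it still has data.
 */
unsigned int count_level_low_events(const int *line, size_t n)
{
	unsigned int events = 0;

	assert(line != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (line[i] == 0)
			events++;
	}
	return events;
}
```

With the samples {1,0,0,0,1,0,0,0} the edge model reports 2 events and
the level model reports 6. A line that is low most of the time but only
produces a handful of events looks more like the first model than the
second.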

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02 19:44             ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-02 22:45               ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-02 22:45 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>
><snip>
>
>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>the GPIO controllers parent irq ?
>>>>>
>>>>>Also do these 7 interrupts/sec stop happening when you do not touch the
>>>>>touchpad ?
>>>>>
>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>And they stop happening when I don't touch the touchpad.
>>>
>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>
>>>If this only happens on the parent irq, then I would start looking at the
>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>
>>This only happens on the parent irq. The input of the GPIO chip's
>>pin#130 is low most of the time.
>
>Right, but it is a low-level triggered IRQ, so when it is low it should
>be executing the i2c-hid interrupt-handler. If it is not executing that
>then it is time to look at amd-pinctrl's irq-handler and figure out why
>that is not triggering the child irq handler for the touchpad.
>
I'm not sure whether I misunderstand the GPIO interrupt controller,
because I don't quite follow your reasoning. What I actually suspect
is that something is wrong with amd-pinctrl which makes the GPIO chip
fail to assert its common interrupt output line, which is connected to
one IO-APIC's pin#7, so IRQ#7 fails to fire. What I have learned about
this low-level triggered IRQ is that the i2c-hid interrupt-handler is
woken up by amd-pinctrl's irq-handler, which is executed when the
parent IRQ#7 fires. The code path is as follows,

     <IRQ>
     dump_stack+0x64/0x88
     __irq_wake_thread.cold+0x9/0x12
     __handle_irq_event_percpu+0x80/0x1c0
     handle_irq_event+0x58/0xb0
     handle_level_irq+0xb7/0x1a0
     generic_handle_irq+0x4a/0x60
     amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
     __handle_irq_event_percpu+0x45/0x1c0
     handle_irq_event+0x58/0xb0
     handle_fasteoi_irq+0xa2/0x210
     do_IRQ+0x70/0x120
     common_interrupt+0xf/0xf
     </IRQ>

But the problem is that somehow IRQ#7 doesn't even fire when the input
of the GPIO chip's pin#130 is low. Without IRQ#7 firing, amd-pinctrl's
irq-handler wouldn't be executed in the first place, let alone
trigger the child irq handler. Btw, amd-pinctrl's irq-handler
simply iterates over all pins. If a mapped irq is found for a
hwirq (yes, it won't even check whether this pin triggered the interrupt),
then it calls generic_handle_irq. So there's nothing wrong with
this part of the code.

I've reverted commit ba714a9c1dea85e0bf2899d02dfeb9c70040427c
("pinctrl/amd: Use regular interrupt instead of chained") to bring
back the chained interrupt to see if "an irq storm" would happen, which
seems to be what I need since currently IRQ#7 only fires ~7 times per
second. The result is that the interrupts arrive in pairs. The time
interval between the two interrupts in a pair is ~0.0016s but the time
interval between interrupt pairs is still ~0.12s (~8Hz). I can't
understand this kind of behaviour. This GPIO chip acts like a
black box to me. That's also why I'm asking for other ways to debug
amd-pinctrl here, in the hope that I can understand why the time interval
between the two interrupts in a pair is much shorter and thus find a
way to make IRQ#7 fire much more frequently.
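
For reference, the rates implied by these intervals can be worked out
with a small sketch. The helper names are hypothetical and exist only
for this illustration. The 7.2ms report period is the figure quoted
earlier in this thread.

```c
#include <assert.h>

/* Event rate in Hz for events separated by interval_s seconds. */
double rate_hz(double interval_s)
{
	assert(interval_s > 0.0);
	return 1.0 / interval_s;
}

/* Average events per second when burst_size events arrive every period_s. */
double burst_rate_hz(unsigned int burst_size, double period_s)
{
	assert(period_s > 0.0);
	return (double)burst_size / period_s;
}
```

rate_hz(0.12) is about 8.3, burst_rate_hz(2, 0.12) is about 16.7 and
rate_hz(0.0072) is about 139. So even with the interrupts arriving in
pairs, IRQ#7 fires at roughly an eighth of the rate at which the
touchpad can produce reports.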

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-02 22:45               ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-03 13:22                 ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-03 13:22 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/3/20 12:45 AM, Coiby Xu wrote:
> On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 10/2/20 4:51 PM, Coiby Xu wrote:
>>> On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>
>> <snip>
>>
>>>>>> So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>> the GPIO controllers parent irq ?
>>>>>>
>>>>>> Also do these 7 interrupts/sec stop happening when you do not touch the
>>>>>> touchpad ?
>>>>>>
>>>>> I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>> And they stop happening when I don't touch the touchpad.
>>>>
>>>> Only from the parent irq, or also on the touchpad irq itself ?
>>>>
>>>> If this only happens on the parent irq, then I would start looking at the
>>>> amd-pinctrl code which determines which of its "child" irqs to fire.
>>>
>>> This only happens on the parent irq. The input of the GPIO chip's
>>> pin#130 is low most of the time.
>>
>> Right, but it is a low-level triggered IRQ, so when it is low it should
>> be executing the i2c-hid interrupt-handler. If it is not executing that
>> then it is time to look at amd-pinctrl's irq-handler and figure out why
>> that is not triggering the child irq handler for the touchpad.
>>
> I'm not sure whether I misunderstand the GPIO interrupt controller,
> because I don't quite follow your reasoning. What I actually suspect
> is that something is wrong with amd-pinctrl which makes the GPIO chip
> fail to assert its common interrupt output line, which is connected to
> one IO-APIC's pin#7, so IRQ#7 fails to fire. What I have learned about
> this low-level triggered IRQ is that the i2c-hid interrupt-handler is
> woken up by amd-pinctrl's irq-handler, which is executed when the
> parent IRQ#7 fires. The code path is as follows,
> 
>      <IRQ>
>      dump_stack+0x64/0x88
>      __irq_wake_thread.cold+0x9/0x12
>      __handle_irq_event_percpu+0x80/0x1c0
>      handle_irq_event+0x58/0xb0
>      handle_level_irq+0xb7/0x1a0
>      generic_handle_irq+0x4a/0x60
>      amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>      __handle_irq_event_percpu+0x45/0x1c0
>      handle_irq_event+0x58/0xb0
>      handle_fasteoi_irq+0xa2/0x210
>      do_IRQ+0x70/0x120
>      common_interrupt+0xf/0xf
>      </IRQ>
> 
> But the problem is that somehow IRQ#7 doesn't even fire when the input
> of the GPIO chip's pin#130 is low. Without IRQ#7 firing, amd-pinctrl's
> irq-handler wouldn't be executed in the first place, let alone
> trigger the child irq handler. Btw, amd-pinctrl's irq-handler
> simply iterates over all pins. If a mapped irq is found for a
> hwirq (yes, it won't even check whether this pin triggered the interrupt),
> then it calls generic_handle_irq. So there's nothing wrong with
> this part of the code.

Ok, so the i2c-hid irq does fire, but only 7 times a second just
like the GPIO controller's parent irq.

The only thing I can think of then is to add printk-s to check how
long the i2c-hid interrupt handler takes to complete. It could be
there is a subtle bug somewhere causing the i2c transfers to take
longer when run from a (threaded) irq handler. That would be weird
though, so I don't expect this to result in any useful findings.
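
As a rough illustration of that kind of timing check, here is a
userspace sketch (not kernel code) that measures how long a callback
takes using clock_gettime(). The helper names are invented for this
example. In the driver itself the same pattern would presumably use
ktime_get() and ktime_us_delta() around the handler body, printed with
a printk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from *start to *end (end must not precede start). */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *end)
{
	int64_t ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
		   + (end->tv_nsec - start->tv_nsec);

	assert(ns >= 0);
	return ns;
}

/* Run fn() once and return how long it took, in nanoseconds. */
int64_t time_call_ns(void (*fn)(void))
{
	struct timespec start, end;

	/* CLOCK_MONOTONIC is not affected by wall-clock adjustments. */
	clock_gettime(CLOCK_MONOTONIC, &start);
	fn();
	clock_gettime(CLOCK_MONOTONIC, &end);
	return elapsed_ns(&start, &end);
}
```

If a single run of the handler took anywhere near 0.12s, that alone
would explain the ~8Hz rate. If it takes well under 7.2ms, the handler
is not the bottleneck.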

Other than that I'm all out of ideas, I'm afraid.

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-03 13:22                 ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-03 23:03                   ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-03 23:03 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>
>>><snip>
>>>
>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>the GPIO controllers parent irq ?
>>>>>>>
>>>>>>>Also do these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>touchpad ?
>>>>>>>
>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>
>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>
>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>
>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>chip is low most of the time and pin#130.
>>>
>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>that is not triggering the child irq handler for the touchpad.
>>>
>>I'm not sure if I have some incorrect understandings about GPIO
>>interrupt controller because I don't quite follow your reasoning.
>>What I actually suspect is there's something wrong with amd-pinctrl
>>which makes the GPIO chip fail to assert its common interrupt output
>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>I learn about this low-level triggered IRQ is that the i2c-hid
>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>
>>     <IRQ>
>>     dump_stack+0x64/0x88
>>     __irq_wake_thread.cold+0x9/0x12
>>     __handle_irq_event_percpu+0x80/0x1c0
>>     handle_irq_event+0x58/0xb0
>>     handle_level_irq+0xb7/0x1a0
>>     generic_handle_irq+0x4a/0x60
>>     amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>     __handle_irq_event_percpu+0x45/0x1c0
>>     handle_irq_event+0x58/0xb0
>>     handle_fasteoi_irq+0xa2/0x210
>>     do_IRQ+0x70/0x120
>>     common_interrupt+0xf/0xf
>>     </IRQ>
>>
>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>irq-handler wouldn't be executed in the first place, let alonet
>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>simply iterate over all pins. If there is mapped irq found for this
>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>then it will call generic_handle_irq. So there's nothing wrong about
>>this part of code.
>
>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>like the GPIO controller's parent irq.
>
I'm not sure it's correct to say whether the i2c-hid irq fires or how
frequently it fires, since the i2c-hid irq is mapped to pin#130 of the
GPIO interrupt controller and the touchpad's own interrupt line, which
is connected to pin#130, is what fires to indicate new data. All we know
is that pin#130 of the GPIO chip has low input most of the time while a
finger is on the touchpad, so we can infer that the touchpad has been
trying to notify the kernel of new data but somehow the GPIO
controller's parent irq only fires 7 times / second.

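The firing rate of the parent irq can be cross-checked from userspace.
The following is only a minimal sketch, assuming the parent irq shows up
as a numbered row in /proc/interrupts (IRQ 7 on this laptop according to
the ACPI tables). The helper names irq_line_total() and read_irq_total()
are hypothetical and exist only for this illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Returns -1 if the line does not describe the requested IRQ number.
 * Illustrative line: "  7:   10   5   IO-APIC   7-fasteoi   pinctrl_amd"
 */
long long irq_line_total(const char *line, int irq)
{
	char *pos;
	long long total = 0;
	long n = strtol(line, &pos, 10);

	if (pos == line || *pos != ':' || n != irq)
		return -1;
	pos++;				/* skip the ':' */

	for (;;) {
		char *next;
		unsigned long long v = strtoull(pos, &next, 10);

		if (next == pos)	/* no digits: the counters are over */
			break;
		if (*next != '\0' && !isspace((unsigned char)*next))
			break;		/* a token such as "7-fasteoi" */
		total += (long long)v;
		pos = next;
	}
	return total;
}

/*
 * Scan a file in /proc/interrupts format and return the summed count
 * for the given IRQ, or -1 if the file or the IRQ row cannot be found.
 */
long long read_irq_total(const char *path, int irq)
{
	char line[4096];
	long long total = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		total = irq_line_total(line, irq);
		if (total >= 0)
			break;
	}
	fclose(f);
	return total;
}
```

Calling read_irq_total("/proc/interrupts", 7) twice, one second apart,
while a finger moves on the touchpad and subtracting the two results
gives the number of parent interrupts delivered in that second.
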
>The only thing I can think of then is to add printk-s to check how
>long the i2c-hid interrupt handler takes to complete. It could be
>there is a subtle bug somewhere causing the i2c transfers to take
>longer when run from a (threaded) irq handler. That would be weird
>though, so I don't expect this to result in any useful findings.
>

I also suspected that the i2c-hid handler takes too long to finish the
i2c transfer, process the data and deliver it to the input system.
After measuring the time interval between the start of the GPIO parent
irq handler and the moment pin#130 is unmasked, we can exclude this
possibility.

I have been wondering whether we could force pin#130 to have low input
to trigger an interrupt, or assert the GPIO's common interrupt output
line manually, so that we can measure how long it takes for the kernel
to receive the signal. But once a GPIO pin is programmed to be an
interrupt line we can't write anything to it, and it seems other
interrupts can only be generated by the hardware. So this idea is not
feasible.

>Other then that I'm all out of ideas I'm afraid.
>
Thank you for taking the time to investigate this issue anyway! Have a
nice weekend :)

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-03 23:03                   ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-03 23:03 UTC (permalink / raw)
  To: Hans de Goede
  Cc: open list:GPIO SUBSYSTEM, Linus Walleij, Shyam Sundar S K,
	Nehal Shah, linux-kernel-mentees

On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>
>>><snip>
>>>
>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>the GPIO controllers parent irq ?
>>>>>>>
>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>touchpad ?
>>>>>>>
>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>
>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>
>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>
>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>chip is low most of the time and pin#130.
>>>
>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>that is not triggering the child irq handler for the touchpad.
>>>
>>I'm not sure if I have some incorrect understandings about GPIO
>>interrupt controller because I don't quite follow your reasoning.
>>What I actually suspect is there's something wrong with amd-pinctrl
>>which makes the GPIO chip fail to assert its common interrupt output
>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>I learn about this low-level triggered IRQ is that the i2c-hid
>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>
>>     <IRQ>
>>     dump_stack+0x64/0x88
>>     __irq_wake_thread.cold+0x9/0x12
>>     __handle_irq_event_percpu+0x80/0x1c0
>>     handle_irq_event+0x58/0xb0
>>     handle_level_irq+0xb7/0x1a0
>>     generic_handle_irq+0x4a/0x60
>>     amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>     __handle_irq_event_percpu+0x45/0x1c0
>>     handle_irq_event+0x58/0xb0
>>     handle_fasteoi_irq+0xa2/0x210
>>     do_IRQ+0x70/0x120
>>     common_interrupt+0xf/0xf
>>     </IRQ>
>>
>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>irq-handler wouldn't be executed in the first place, let alonet
>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>simply iterate over all pins. If there is mapped irq found for this
>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>then it will call generic_handle_irq. So there's nothing wrong about
>>this part of code.
>
>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>like the GPIO controller's parent irq.
>
I'm not sure it's correct to say whether the i2c-hid irq fires or how
frequently it fires, since the i2c-hid irq is mapped to pin#130 of the
GPIO interrupt controller and the touchpad's own interrupt line, which
is connected to pin#130, is what fires to indicate new data. All we know
is that pin#130 of the GPIO chip has low input most of the time while a
finger is on the touchpad, so we can infer that the touchpad has been
trying to notify the kernel of new data but somehow the GPIO
controller's parent irq only fires 7 times / second.

>The only thing I can think of then is to add printk-s to check how
>long the i2c-hid interrupt handler takes to complete. It could be
>there is a subtle bug somewhere causing the i2c transfers to take
>longer when run from a (threaded) irq handler. That would be weird
>though, so I don't expect this to result in any useful findings.
>

I also suspected that the i2c-hid handler takes too long to finish the
i2c transfer, process the data and deliver it to the input system.
After measuring the time interval between the start of the GPIO parent
irq handler and the moment pin#130 is unmasked, we can exclude this
possibility.

I have been wondering whether we could force pin#130 to have low input
to trigger an interrupt, or assert the GPIO's common interrupt output
line manually, so that we can measure how long it takes for the kernel
to receive the signal. But once a GPIO pin is programmed to be an
interrupt line we can't write anything to it, and it seems other
interrupts can only be generated by the hardware. So this idea is not
feasible.

>Other then that I'm all out of ideas I'm afraid.
>
Thank you for taking the time to investigate this issue anyway! Have a
nice weekend :)

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-03 23:03                   ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-04  5:16                     ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-04  5:16 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>Hi,
>>
>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>
>>>><snip>
>>>>
>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>
>>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>touchpad ?
>>>>>>>>
>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>
>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>
>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>
>>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>>chip is low most of the time and pin#130.
>>>>
>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>that is not triggering the child irq handler for the touchpad.
>>>>
>>>I'm not sure if I have some incorrect understandings about GPIO
>>>interrupt controller because I don't quite follow your reasoning.
>>>What I actually suspect is there's something wrong with amd-pinctrl
>>>which makes the GPIO chip fail to assert its common interrupt output
>>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>>I learn about this low-level triggered IRQ is that the i2c-hid
>>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>
>>>    <IRQ>
>>>    dump_stack+0x64/0x88
>>>    __irq_wake_thread.cold+0x9/0x12
>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>    handle_irq_event+0x58/0xb0
>>>    handle_level_irq+0xb7/0x1a0
>>>    generic_handle_irq+0x4a/0x60
>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>    handle_irq_event+0x58/0xb0
>>>    handle_fasteoi_irq+0xa2/0x210
>>>    do_IRQ+0x70/0x120
>>>    common_interrupt+0xf/0xf
>>>    </IRQ>
>>>
>>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>>irq-handler wouldn't be executed in the first place, let alonet
>>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>>simply iterate over all pins. If there is mapped irq found for this
>>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>>then it will call generic_handle_irq. So there's nothing wrong about
>>>this part of code.
>>
>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>like the GPIO controller's parent irq.
>>
>I'm not sure if it's correct to say if hi2c-hid irq fires or not and how
>frequently it fires since the i2c-hid irq is mapped to pin#130 of the
>GPIO interrupt controller and the touchpad has another interrupt line
>connected to pin#130 which fires to indicate new data. All we know is
>pin#130 of the GPIO chip has low input most of the time when the finger
>is on the touchpad so we can infer the touchpad has been trying to
>notify the kernel of new data but somehow GPIO's parent irq only fires 7
>times / second.
>
>>The only thing I can think of then is to add printk-s to check how
>>long the i2c-hid interrupt handler takes to complete. It could be
>>there is a subtle bug somewhere causing the i2c transfers to take
>>longer when run from a (threaded) irq handler. That would be weird
>>though, so I don't expect this to result in any useful findings.
>>
>
>I also doubted if it takes too much time for the i2c-hid handler to
>finish reading i2c transfer, processing data and delivering to the input
>system. After measuring the time internal between the starting of the
>GPIO irq's parent handler and when pin#130 is unmasked, we can exclude
>this possibility.
>
>I have been wondering if we let make pin#130 have low input thus to
>trigger a interrupt firing or assert the GPIO's common interrupt output
>line manually thus we can measure how long does it take for the kernel
>to receive the signal. But once GPIO's pin is programmed to be a
>interrupt line we can't write anything to it and it seems other
>interrupts can only be generated by the hardware. So this idea is not
>plausible
>

Btw, there are other users who have the same laptop model but with a
different touchpad (ELAN). Their touchpads show up in
/proc/bus/input/devices but are completely dead. hid-recorder, which
reads HID reports from /dev/hidraw, gets nothing when they put their
fingers on the touchpad, but polling mode could also rescue their
touchpads. It seems the GPIO controller's parent irq doesn't even fire
once for the ELAN touchpad. And unlike the GPIO controller, the IO-APIC
is also used by other devices like the keyboard. So maybe it's safe to
assert that the root cause lies in the GPIO controller.

>>Other then that I'm all out of ideas I'm afraid.
>>
>Thank you for taking time to investigate this issue anyway! Have a nice
>weekend:)
>>Regards,
>>
>>Hans
>>
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-04  5:16                     ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-04  5:16 UTC (permalink / raw)
  To: Hans de Goede
  Cc: open list:GPIO SUBSYSTEM, Linus Walleij, Shyam Sundar S K,
	Nehal Shah, linux-kernel-mentees

On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>Hi,
>>
>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>
>>>><snip>
>>>>
>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>
>>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>touchpad ?
>>>>>>>>
>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>
>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>
>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>
>>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>>chip is low most of the time and pin#130.
>>>>
>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>that is not triggering the child irq handler for the touchpad.
>>>>
>>>I'm not sure if I have some incorrect understandings about GPIO
>>>interrupt controller because I don't quite follow your reasoning.
>>>What I actually suspect is there's something wrong with amd-pinctrl
>>>which makes the GPIO chip fail to assert its common interrupt output
>>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>>I learn about this low-level triggered IRQ is that the i2c-hid
>>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>
>>>    <IRQ>
>>>    dump_stack+0x64/0x88
>>>    __irq_wake_thread.cold+0x9/0x12
>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>    handle_irq_event+0x58/0xb0
>>>    handle_level_irq+0xb7/0x1a0
>>>    generic_handle_irq+0x4a/0x60
>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>    handle_irq_event+0x58/0xb0
>>>    handle_fasteoi_irq+0xa2/0x210
>>>    do_IRQ+0x70/0x120
>>>    common_interrupt+0xf/0xf
>>>    </IRQ>
>>>
>>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>>irq-handler wouldn't be executed in the first place, let alonet
>>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>>simply iterate over all pins. If there is mapped irq found for this
>>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>>then it will call generic_handle_irq. So there's nothing wrong about
>>>this part of code.
>>
>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>like the GPIO controller's parent irq.
>>
>I'm not sure if it's correct to say if hi2c-hid irq fires or not and how
>frequently it fires since the i2c-hid irq is mapped to pin#130 of the
>GPIO interrupt controller and the touchpad has another interrupt line
>connected to pin#130 which fires to indicate new data. All we know is
>pin#130 of the GPIO chip has low input most of the time when the finger
>is on the touchpad so we can infer the touchpad has been trying to
>notify the kernel of new data but somehow GPIO's parent irq only fires 7
>times / second.
>
>>The only thing I can think of then is to add printk-s to check how
>>long the i2c-hid interrupt handler takes to complete. It could be
>>there is a subtle bug somewhere causing the i2c transfers to take
>>longer when run from a (threaded) irq handler. That would be weird
>>though, so I don't expect this to result in any useful findings.
>>
>
>I also doubted if it takes too much time for the i2c-hid handler to
>finish reading i2c transfer, processing data and delivering to the input
>system. After measuring the time internal between the starting of the
>GPIO irq's parent handler and when pin#130 is unmasked, we can exclude
>this possibility.
>
>I have been wondering if we let make pin#130 have low input thus to
>trigger a interrupt firing or assert the GPIO's common interrupt output
>line manually thus we can measure how long does it take for the kernel
>to receive the signal. But once GPIO's pin is programmed to be a
>interrupt line we can't write anything to it and it seems other
>interrupts can only be generated by the hardware. So this idea is not
>plausible
>

Btw, there are other users who have the same laptop model but with a
different touchpad (ELAN). Their touchpads show up in
/proc/bus/input/devices but are completely dead. hid-recorder, which
reads HID reports from /dev/hidraw, gets nothing when they put their
fingers on the touchpad, but polling mode could also rescue their
touchpads. It seems the GPIO controller's parent irq doesn't even fire
once for the ELAN touchpad. And unlike the GPIO controller, the IO-APIC
is also used by other devices like the keyboard. So maybe it's safe to
assert that the root cause lies in the GPIO controller.

>>Other then that I'm all out of ideas I'm afraid.
>>
>Thank you for taking time to investigate this issue anyway! Have a nice
>weekend:)
>>Regards,
>>
>>Hans
>>
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-04  5:16                     ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-06  4:49                       ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-06  4:49 UTC (permalink / raw)
  To: Hans de Goede, Linus Walleij
  Cc: open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Hi Hans and Linus,

I've found direct evidence that the GPIO interrupt controller is
malfunctioning.

While playing with the GPIO sysfs interface, I found by accident a way
to make the GPIO chip trigger an interrupt,

  - export pin#130, which is used by the touchpad
  - set the direction to "out"
  - `echo 0 > value` will trigger the GPIO controller's parent irq and
    `echo 1 > value` will make it stop firing

(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
manually trigger an interrupt now.)

I wrote a C program that makes the GPIO controller generate interrupts
and then stop firing them by toggling pin#130's value with a specified
time interval, i.e., it sets the value to 0 first and then, after some
time, sets the value back to 1. No interrupt fires unless the time
interval is > 120ms (~7Hz). This explains why we only see 7 interrupts
per second for the GPIO controller's parent irq.
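
For illustration, the toggling step can be sketched as below. This is
not the program used for the measurement above, only a minimal sketch
under stated assumptions: the pin has already been exported through the
legacy sysfs GPIO interface with its direction set to "out", and the
caller knows the path of its value file. The helper names sysfs_write(),
sleep_ms() and pulse_low() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Write a short string to a sysfs attribute; returns 0 on success. */
int sysfs_write(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)	/* the write may only fail on flush */
		ok = 0;
	return ok ? 0 : -1;
}

/* Sleep for ms milliseconds, resuming if interrupted by a signal. */
void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
		;
}

/*
 * Drive the pin low for low_ms milliseconds, then high again.
 * value_path is the pin's sysfs value file; the global GPIO number in
 * that path depends on the gpiochip base and is not known here.
 * Returns 0 on success, -1 if either write fails.
 */
int pulse_low(const char *value_path, long low_ms)
{
	if (sysfs_write(value_path, "0") != 0)
		return -1;
	sleep_ms(low_ms);
	return sysfs_write(value_path, "1");
}
```

Calling pulse_low() with different low_ms values while watching the
parent irq's counter in /proc/interrupts is one way to look for the
shortest low time that still produces an interrupt.
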

My hypothesis is that the GPIO chip doesn't have the proper power
setting, so it stays in an idle state, or that its clock frequency is
too low by default and thus not quick enough to read the interrupt
input. If so, pinctrl-amd must be missing some code to configure the
chip, and I need either a hardware reference manual for this GPIO chip
(HID: AMDI0030) or to reverse-engineer the Windows driver, since I
couldn't find a copy of the reference manual online. What would you
suggest?

Thank you!

On Sun, Oct 04, 2020 at 01:16:44PM +0800, Coiby Xu wrote:
>On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>>
>>>>><snip>
>>>>>
>>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>>
>>>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>>touchpad ?
>>>>>>>>>
>>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>>
>>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>>
>>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>>
>>>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>>>chip is low most of the time and pin#130.
>>>>>
>>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>>that is not triggering the child irq handler for the touchpad.
>>>>>
>>>>I'm not sure if I have some incorrect understandings about GPIO
>>>>interrupt controller because I don't quite follow your reasoning.
>>>>What I actually suspect is there's something wrong with amd-pinctrl
>>>>which makes the GPIO chip fail to assert its common interrupt output
>>>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>>>I learn about this low-level triggered IRQ is that the i2c-hid
>>>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>>
>>>>    <IRQ>
>>>>    dump_stack+0x64/0x88
>>>>    __irq_wake_thread.cold+0x9/0x12
>>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>>    handle_irq_event+0x58/0xb0
>>>>    handle_level_irq+0xb7/0x1a0
>>>>    generic_handle_irq+0x4a/0x60
>>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>>    handle_irq_event+0x58/0xb0
>>>>    handle_fasteoi_irq+0xa2/0x210
>>>>    do_IRQ+0x70/0x120
>>>>    common_interrupt+0xf/0xf
>>>>    </IRQ>
>>>>
>>>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>>>irq-handler wouldn't be executed in the first place, let alonet
>>>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>>>simply iterate over all pins. If there is mapped irq found for this
>>>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>>>then it will call generic_handle_irq. So there's nothing wrong about
>>>>this part of code.
>>>
>>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>>like the GPIO controller's parent irq.
>>>
>>I'm not sure if it's correct to say if hi2c-hid irq fires or not and how
>>frequently it fires since the i2c-hid irq is mapped to pin#130 of the
>>GPIO interrupt controller and the touchpad has another interrupt line
>>connected to pin#130 which fires to indicate new data. All we know is
>>pin#130 of the GPIO chip has low input most of the time when the finger
>>is on the touchpad so we can infer the touchpad has been trying to
>>notify the kernel of new data but somehow GPIO's parent irq only fires 7
>>times / second.
>>
>>>The only thing I can think of then is to add printk-s to check how
>>>long the i2c-hid interrupt handler takes to complete. It could be
>>>there is a subtle bug somewhere causing the i2c transfers to take
>>>longer when run from a (threaded) irq handler. That would be weird
>>>though, so I don't expect this to result in any useful findings.
>>>
>>
>>I also doubted if it takes too much time for the i2c-hid handler to
>>finish reading i2c transfer, processing data and delivering to the input
>>system. After measuring the time internal between the starting of the
>>GPIO irq's parent handler and when pin#130 is unmasked, we can exclude
>>this possibility.
>>
>>I have been wondering if we let make pin#130 have low input thus to
>>trigger a interrupt firing or assert the GPIO's common interrupt output
>>line manually thus we can measure how long does it take for the kernel
>>to receive the signal. But once GPIO's pin is programmed to be a
>>interrupt line we can't write anything to it and it seems other
>>interrupts can only be generated by the hardware. So this idea is not
>>plausible
>>
>
>Btw, there are other users who have the same laptop model but with a
>different touchpad (ELAN). Their touchpads would show in
>/proc/bus/input/devices but are completely dead. hid-recorder which
>will read HID reports from /dev/hidraw gets nothing if they put there
>fingers on the touchpad but the polling mode could also save their
>touchpads. It seems GPIO controller's parent irq for the ELAN touchpad
>doesn't even fire once. And unlike GPIO, IO-APIC has also be used by
>other devices like the keyboard. So maybe it's safe to assert the root
>cause is from the GPIO controller.
>
>>>Other then that I'm all out of ideas I'm afraid.
>>>
>>Thank you for taking time to investigate this issue anyway! Have a nice
>>weekend:)
>>>Regards,
>>>
>>>Hans
>>>
>>
>>--
>>Best regards,
>>Coiby
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  4:49                       ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-06  4:49 UTC (permalink / raw)
  To: Hans de Goede, Linus Walleij
  Cc: wang jun, open list:GPIO SUBSYSTEM, Shyam Sundar S K, Nehal Shah,
	linux-kernel-mentees

Hi Hans and Linus,

I've found direct evidence that the GPIO interrupt controller is
malfunctioning.

While playing with the GPIO sysfs interface, I found by accident a way
to make the GPIO chip trigger an interrupt,

  - export pin#130, which is used by the touchpad
  - set the direction to "out"
  - `echo 0 > value` will trigger the GPIO controller's parent irq and
    `echo 1 > value` will make it stop firing

(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
manually trigger an interrupt now.)

I wrote a C program that makes the GPIO controller generate interrupts
and then stop firing them by toggling pin#130's value with a specified
time interval, i.e., it sets the value to 0 first and then, after some
time, sets the value back to 1. No interrupt fires unless the time
interval is > 120ms (~7Hz). This explains why we only see 7 interrupts
per second for the GPIO controller's parent irq.

My hypothesis is that the GPIO chip doesn't have the proper power
setting, so it stays in an idle state, or that its clock frequency is
too low by default and thus not quick enough to read the interrupt
input. If so, pinctrl-amd must be missing some code to configure the
chip, and I need either a hardware reference manual for this GPIO chip
(HID: AMDI0030) or to reverse-engineer the Windows driver, since I
couldn't find a copy of the reference manual online. What would you
suggest?

Thank you!

On Sun, Oct 04, 2020 at 01:16:44PM +0800, Coiby Xu wrote:
>On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>>
>>>>><snip>
>>>>>
>>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>>
>>>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>>touchpad ?
>>>>>>>>>
>>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>>
>>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>>
>>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>>
>>>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>>>chip is low most of the time and pin#130.
>>>>>
>>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>>that is not triggering the child irq handler for the touchpad.
>>>>>
>>>>I'm not sure whether I misunderstand how a GPIO interrupt controller
>>>>works, because I don't quite follow your reasoning.
>>>>What I actually suspect is that something is wrong with amd-pinctrl
>>>>which makes the GPIO chip fail to assert its common interrupt output
>>>>line, connected to one IO-APIC's pin#7, so IRQ#7 fails to fire. What
>>>>I have learned about this low-level triggered IRQ is that the i2c-hid
>>>>interrupt-handler is woken up by amd-pinctrl's irq-handler, which
>>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>>
>>>>    <IRQ>
>>>>    dump_stack+0x64/0x88
>>>>    __irq_wake_thread.cold+0x9/0x12
>>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>>    handle_irq_event+0x58/0xb0
>>>>    handle_level_irq+0xb7/0x1a0
>>>>    generic_handle_irq+0x4a/0x60
>>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>>    handle_irq_event+0x58/0xb0
>>>>    handle_fasteoi_irq+0xa2/0x210
>>>>    do_IRQ+0x70/0x120
>>>>    common_interrupt+0xf/0xf
>>>>    </IRQ>
>>>>
>>>>But the problem is that somehow IRQ#7 doesn't even fire when the input of
>>>>pin#130 of the GPIO chip is low. Without IRQ#7 firing, amd-pinctrl's
>>>>irq-handler wouldn't be executed in the first place, let alone
>>>>trigger the child irq handler. Btw, amd-pinctrl's irq-handler
>>>>simply iterates over all pins. If a mapped irq is found for a
>>>>hwirq (yes, it won't even check whether this pin triggered the interrupt),
>>>>it calls generic_handle_irq. So there's nothing wrong with
>>>>this part of the code.
>>>
>>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>>like the GPIO controller's parent irq.
>>>
>>I'm not sure it's correct to say whether the i2c-hid irq fires or not, or
>>how frequently it fires, since the i2c-hid irq is mapped to pin#130 of the
>>GPIO interrupt controller and the touchpad has another interrupt line
>>connected to pin#130 which fires to indicate new data. All we know is that
>>pin#130 of the GPIO chip has low input most of the time when the finger
>>is on the touchpad, so we can infer the touchpad has been trying to
>>notify the kernel of new data but somehow the GPIO's parent irq only fires
>>7 times / second.
>>
>>>The only thing I can think of then is to add printk-s to check how
>>>long the i2c-hid interrupt handler takes to complete. It could be
>>>there is a subtle bug somewhere causing the i2c transfers to take
>>>longer when run from a (threaded) irq handler. That would be weird
>>>though, so I don't expect this to result in any useful findings.
>>>
>>
>>I also wondered whether it takes too much time for the i2c-hid handler to
>>finish reading the i2c transfer, processing the data and delivering it to
>>the input system. After measuring the time interval between the start of the
>>GPIO irq's parent handler and when pin#130 is unmasked, we can exclude
>>this possibility.
>>
>>I have been wondering whether we could make pin#130 have low input to
>>trigger an interrupt, or assert the GPIO's common interrupt output
>>line manually, so we can measure how long it takes for the kernel
>>to receive the signal. But once a GPIO pin is programmed to be an
>>interrupt line we can't write anything to it, and it seems other
>>interrupts can only be generated by the hardware. So this idea is not
>>feasible.
>>
>
>Btw, there are other users who have the same laptop model but with a
>different touchpad (ELAN). Their touchpads show up in
>/proc/bus/input/devices but are completely dead. hid-recorder, which
>reads HID reports from /dev/hidraw, gets nothing when they put their
>fingers on the touchpad, but polling mode also rescues their
>touchpads. It seems the GPIO controller's parent irq for the ELAN touchpad
>doesn't even fire once. And unlike the GPIO controller, the IO-APIC is also
>used by other devices like the keyboard. So maybe it's safe to assert that
>the root cause lies in the GPIO controller.
>
>>>Other than that I'm all out of ideas I'm afraid.
>>>
>>Thank you for taking time to investigate this issue anyway! Have a nice
>>weekend:)
>>>Regards,
>>>
>>>Hans
>>>
>>
>>--
>>Best regards,
>>Coiby
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  4:49                       ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-06  6:28                         ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  6:28 UTC (permalink / raw)
  To: Coiby Xu, Linus Walleij
  Cc: open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Hi,

On 10/6/20 6:49 AM, Coiby Xu wrote:
> Hi Hans and Linus,
> 
> I've found the direct evidence proving the GPIO interrupt controller is
> malfunctioning.
> 
> I've found a way to let the GPIO chip trigger an interrupt by accident
> when playing with the GPIO sysfs interface,
> 
>   - export pin130 which is used by the touchpad
>   - set the direction to be "out"
>   - `echo 0 > value` will trigger the GPIO controller's parent irq and
>     `echo 1 > value` will make it stop firing
> 
> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
> manually trigger an interrupt now.)
> 
> I wrote a C program that makes the GPIO controller quickly generate some
> interrupts and then stop firing, by toggling pin#130's value with a
> specified time interval, i.e., set the value to 0 first and then, after
> some time, set it back to 1. No interrupt fires unless the time interval
> is > 120ms (~7Hz). This explains why we only see 7 interrupts per second
> for the GPIO controller's parent irq.

That is a great find, well done.

> My hypothesis is that the GPIO chip doesn't have a proper power setting, so
> it stays in an idle state, or that its clock frequency is too low by default
> and thus not quick enough to read the interrupt input. In that case
> pinctrl-amd must be missing some code to configure the chip, and I need
> either a hardware reference manual for this GPIO chip (HID: AMDI0030) or to
> reverse-engineer the Windows driver, since I couldn't find a copy of the
> reference manual online. What would you suggest?

This sounds like it might have something to do with the glitch filter.
The code in pinctrl-amd.c to set up the trigger-type also configures
the glitch filter, you could try changing that code to disable the
glitch-filter. The defines for setting the glitch-filter bits to
disabled are already there.

Regards,

Hans





^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  6:28                         ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-06  8:31                           ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-06  8:31 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>Hi Hans and Linus,
>>
>>I've found the direct evidence proving the GPIO interrupt controller is
>>malfunctioning.
>>
>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>when playing with the GPIO sysfs interface,
>>
>>  - export pin130 which is used by the touchpad
>>  - set the direction to be "out"
>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>    `echo 1 > value` will make it stop firing
>>
>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>manually trigger an interrupt now.)
>>
>>I wrote a C program that makes the GPIO controller quickly generate some
>>interrupts and then stop firing, by toggling pin#130's value with a
>>specified time interval, i.e., set the value to 0 first and then, after
>>some time, set it back to 1. No interrupt fires unless the time interval
>>is > 120ms (~7Hz). This explains why we only see 7 interrupts per second
>>for the GPIO controller's parent irq.
>
>That is a great find, well done.
>
>>My hypothesis is that the GPIO chip doesn't have a proper power setting, so
>>it stays in an idle state, or that its clock frequency is too low by default
>>and thus not quick enough to read the interrupt input. In that case
>>pinctrl-amd must be missing some code to configure the chip, and I need
>>either a hardware reference manual for this GPIO chip (HID: AMDI0030) or to
>>reverse-engineer the Windows driver, since I couldn't find a copy of the
>>reference manual online. What would you suggest?
>
>This sounds like it might have something to do with the glitch filter.
>The code in pinctrl-amd.c to set up the trigger-type also configures
>the glitch filter, you could try changing that code to disable the
>glitch-filter. The defines for setting the glitch-filter bits to
>disabled are already there.
>

Disabling the glitch filter works like a charm! Other enthusiastic
Linux users who have been troubled by this issue for months will
also be glad to know that this small tweak can bring their
touchpads back to life:) Thank you!
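
One way to double-check the effect is to compare the parent irq's counter in
/proc/interrupts before and after moving a finger on the touchpad for a
second. The sketch below is only an illustration of that idea: the function
names are made up for this example, and it assumes the usual layout of
/proc/interrupts (IRQ number, a colon, one counter per CPU, then the chip
and handler names).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line for the given IRQ.
 * Returns -1 if the line belongs to a different IRQ or is not numeric.
 */
long irq_total_from_line(const char *line, int irq)
{
    char *end;
    long total = 0;

    assert(line != NULL);

    long n = strtol(line, &end, 10);    /* leading blanks are skipped */
    if (end == line || *end != ':' || n != irq)
        return -1;

    const char *p = end + 1;
    for (;;) {
        long v = strtol(p, &end, 10);
        if (end == p)                   /* first non-numeric column */
            break;
        total += v;
        p = end;
    }
    return total;
}

/* Scan /proc/interrupts (or a saved copy) and return the total for irq. */
long irq_total(const char *path, int irq)
{
    char buf[1024];
    long total = -1;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(buf, sizeof buf, f)) {
        long t = irq_total_from_line(buf, irq);
        if (t >= 0) {
            total = t;
            break;
        }
    }
    fclose(f);
    return total;
}
```

Calling irq_total("/proc/interrupts", 7) twice, one second apart, and
subtracting gives the interrupt rate for IRQ#7. The parser stops at the
first column that does not start with a digit, so it would miscount if the
chip name happened to begin with a number.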

$ git diff
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9a760f5cd7ed..e786d779d6c8 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
                 irq_set_handler_locked(d, handle_level_irq);
                 break;

I will learn more about the glitch filter and the implementation of
pinctrl and see if I can disable the glitch filter only for this touchpad.
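
To spell out why commenting out that one line disables the filter: the
preceding statement already clears the debounce-control field, so without
the OR the field simply stays at zero. Below is a small user-space sketch of
that bit manipulation. The numeric values of the macros and the name
DB_TYPE_NO_DEBOUNCE are assumptions recalled from pinctrl-amd.h and should
be checked against the tree in use; the two helper functions are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against drivers/pinctrl/pinctrl-amd.h. */
#define DB_CNTRL_OFF                  5
#define DB_CNTRl_MASK                 0x3UL
#define DB_TYPE_NO_DEBOUNCE           0x0UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL

/* Return pin_reg with the debounce-control field replaced by db_type. */
uint32_t set_debounce_type(uint32_t pin_reg, uint32_t db_type)
{
    assert(db_type <= DB_CNTRl_MASK);

    /* Clear the two-bit field first, as the driver does ... */
    pin_reg &= ~(uint32_t)(DB_CNTRl_MASK << DB_CNTRL_OFF);
    /* ... then OR in the new type; OR-ing in 0 leaves it cleared. */
    pin_reg |= db_type << DB_CNTRL_OFF;
    return pin_reg;
}

/* Read the debounce-control field back out of a register value. */
uint32_t get_debounce_type(uint32_t pin_reg)
{
    return (uint32_t)((pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK);
}
```

Under these assumptions, set_debounce_type(reg, DB_TYPE_NO_DEBOUNCE) gives
the same field value as the patched driver code, and all other bits of the
register are left untouched.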

>Regards,
>
>Hans
>
>
>
>

--
Best regards,
Coiby

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  8:31                           ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-06  8:31 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>Hi Hans and Linus,
>>
>>I've found the direct evidence proving the GPIO interrupt controller is
>>malfunctioning.
>>
>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>when playing with the GPIO sysfs interface,
>>
>>  - export pin130 which is used by the touchpad
>>  - set the direction to be "out"
>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>    `echo 1 > value` will make it stop firing
>>
>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>manually trigger an interrupt now.)
>>
>>I wrote a C program is to let GPIO controller quickly generate some
>>interrupts then disable the firing of interrupts by toggling pin#130's
>>value with an specified time interval, i.e., set the value to 0 first
>>and then after some time, re-set the value to 1. There is no interrupt
>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>only see 7 interrupts for the GPIO controller's parent irq.
>
>That is a great find, well done.
>
>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>in an idle state or its clock frequency is too low by default thus not
>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>code to configure the chip and I need a hardware reference manual of this
>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>since I couldn't find a copy of reference manual online? What would you
>>suggest?
>
>This sounds like it might have something to do with the glitch filter.
>The code in pinctrl-amd.c to setup the trigger-type also configures
>the glitch filter, you could try changing that code to disable the
>glitch-filter. The defines for setting the glitch-filter bits to
>disabled are already there.
>

Disabling the glitch filter works like a charm! Other enthusiastic
Linux users who have been troubled by this issue for months will
also be glad to know that this small tweak could bring their
touchpads back to life :) Thank you!

$ git diff
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9a760f5cd7ed..e786d779d6c8 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
                 irq_set_handler_locked(d, handle_level_irq);
                 break;

I will learn more about the glitch filter and the implementation of
pinctrl and see if I can disable the glitch filter only for this touchpad.
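
For reference, here is a minimal userspace sketch of the bit arithmetic in
that hunk. The DB_* values are assumptions based on my reading of
drivers/pinctrl/pinctrl-amd.h (please verify them against your tree), and
both helpers are hypothetical names, not driver functions. It only shows
that once the DB_CNTRl_MASK field has been cleared, skipping the OR leaves
the field at 0, which is the value DB_TYPE_NO_DEBOUNCE encodes:

```c
#include <stdint.h>

/* Assumed values mirroring pinctrl-amd.h; verify against your kernel tree. */
#define DB_CNTRL_OFF                  5
#define DB_CNTRl_MASK                 0x3UL
#define DB_TYPE_NO_DEBOUNCE           0x0UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL

/* Hypothetical helper: extract the 2-bit debounce control field. */
static inline uint32_t db_cntrl_field(uint32_t pin_reg)
{
	return (uint32_t)((pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK);
}

/* Hypothetical helper: what the patched hunk does to the debounce bits. */
static inline uint32_t db_disable_glitch_filter(uint32_t pin_reg)
{
	/* Clear the field, exactly as the unchanged line in the hunk does. */
	pin_reg &= ~(uint32_t)(DB_CNTRl_MASK << DB_CNTRL_OFF);
	/* OR-ing in DB_TYPE_NO_DEBOUNCE (0) changes nothing; kept for clarity. */
	pin_reg |= (uint32_t)(DB_TYPE_NO_DEBOUNCE << DB_CNTRL_OFF);
	return pin_reg;
}
```

All other bits of pin_reg pass through unchanged, so an explicit
DB_TYPE_NO_DEBOUNCE assignment would be equivalent to commenting the line out.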

>Regards,
>
>Hans
>
>
>
>
>>
>>Thank you!
>>
>>On Sun, Oct 04, 2020 at 01:16:44PM +0800, Coiby Xu wrote:
>>>On Sun, Oct 04, 2020 at 07:03:40AM +0800, Coiby Xu wrote:
>>>>On Sat, Oct 03, 2020 at 03:22:46PM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/3/20 12:45 AM, Coiby Xu wrote:
>>>>>>On Fri, Oct 02, 2020 at 09:44:54PM +0200, Hans de Goede wrote:
>>>>>>>Hi,
>>>>>>>
>>>>>>>On 10/2/20 4:51 PM, Coiby Xu wrote:
>>>>>>>>On Fri, Oct 02, 2020 at 03:36:29PM +0200, Hans de Goede wrote:
>>>>>>>
>>>>>>><snip>
>>>>>>>
>>>>>>>>>>>So are you seeing these 7 interrupts / second for the touchpad irq or for
>>>>>>>>>>>the GPIO controllers parent irq ?
>>>>>>>>>>>
>>>>>>>>>>>Also to these 7 interrupts/sec stop happening when you do not touch the
>>>>>>>>>>>touchpad ?
>>>>>>>>>>>
>>>>>>>>>>I see these 7 interrupts / second for the GPIO controller's parent irq.
>>>>>>>>>>And they stop happening when I don't touch the touchpad.
>>>>>>>>>
>>>>>>>>>Only from the parent irq, or also on the touchpad irq itself ?
>>>>>>>>>
>>>>>>>>>If this only happens on the parent irq, then I would start looking at the
>>>>>>>>>amd-pinctrl code which determines which of its "child" irqs to fire.
>>>>>>>>
>>>>>>>>This only happens on the parent irq. The input's pin#130 of the GIPO
>>>>>>>>chip is low most of the time and pin#130.
>>>>>>>
>>>>>>>Right, but it is a low-level triggered IRQ, so when it is low it should
>>>>>>>be executing the i2c-hid interrupt-handler. If it is not executing that
>>>>>>>then it is time to look at amd-pinctrl's irq-handler and figure out why
>>>>>>>that is not triggering the child irq handler for the touchpad.
>>>>>>>
>>>>>>I'm not sure if I have some incorrect understandings about GPIO
>>>>>>interrupt controller because I don't quite follow your reasoning.
>>>>>>What I actually suspect is there's something wrong with amd-pinctrl
>>>>>>which makes the GPIO chip fail to assert its common interrupt output
>>>>>>line connected to one IO-APIC's pin#7 thus IRQ#7 fails to fire. What
>>>>>>I learn about this low-level triggered IRQ is that the i2c-hid
>>>>>>interrupt-handler will be woken up by amd-pinctrl's irq-handler which
>>>>>>is executed when the parent IRQ#7 fires. The code path is as follows,
>>>>>>
>>>>>>    <IRQ>
>>>>>>    dump_stack+0x64/0x88
>>>>>>    __irq_wake_thread.cold+0x9/0x12
>>>>>>    __handle_irq_event_percpu+0x80/0x1c0
>>>>>>    handle_irq_event+0x58/0xb0
>>>>>>    handle_level_irq+0xb7/0x1a0
>>>>>>    generic_handle_irq+0x4a/0x60
>>>>>>    amd_gpio_irq_handler+0x15f/0x1b0 [pinctrl_amd]
>>>>>>    __handle_irq_event_percpu+0x45/0x1c0
>>>>>>    handle_irq_event+0x58/0xb0
>>>>>>    handle_fasteoi_irq+0xa2/0x210
>>>>>>    do_IRQ+0x70/0x120
>>>>>>    common_interrupt+0xf/0xf
>>>>>>    </IRQ>
>>>>>>
>>>>>>But the problem is somehow IRQ#7 doesn't even fire when the input's
>>>>>>pin#130 of the GIPO is low. Without IRQ#7 firing, amd-pinctrl's
>>>>>>irq-handler wouldn't be executed in the first place, let alonet
>>>>>>triggering the child irq handler. Btw, amd-pinctrl's irq-handler
>>>>>>simply iterate over all pins. If there is mapped irq found for this
>>>>>>hwirq (yes, it won't even check if this pin triggers the interrupt),
>>>>>>then it will call generic_handle_irq. So there's nothing wrong about
>>>>>>this part of code.
>>>>>
>>>>>Ok, so the i2c-hid irq does fire, but only 7 times a second just
>>>>>like the GPIO controller's parent irq.
>>>>>
>>>>I'm not sure if it's correct to say if hi2c-hid irq fires or not and how
>>>>frequently it fires since the i2c-hid irq is mapped to pin#130 of the
>>>>GPIO interrupt controller and the touchpad has another interrupt line
>>>>connected to pin#130 which fires to indicate new data. All we know is
>>>>pin#130 of the GPIO chip has low input most of the time when the finger
>>>>is on the touchpad so we can infer the touchpad has been trying to
>>>>notify the kernel of new data but somehow GPIO's parent irq only fires 7
>>>>times / second.
>>>>
>>>>>The only thing I can think of then is to add printk-s to check how
>>>>>long the i2c-hid interrupt handler takes to complete. It could be
>>>>>there is a subtle bug somewhere causing the i2c transfers to take
>>>>>longer when run from a (threaded) irq handler. That would be weird
>>>>>though, so I don't expect this to result in any useful findings.
>>>>>
>>>>
>>>>I also doubted if it takes too much time for the i2c-hid handler to
>>>>finish reading i2c transfer, processing data and delivering to the input
>>>>system. After measuring the time internal between the starting of the
>>>>GPIO irq's parent handler and when pin#130 is unmasked, we can exclude
>>>>this possibility.
>>>>
>>>>I have been wondering if we let make pin#130 have low input thus to
>>>>trigger a interrupt firing or assert the GPIO's common interrupt output
>>>>line manually thus we can measure how long does it take for the kernel
>>>>to receive the signal. But once GPIO's pin is programmed to be a
>>>>interrupt line we can't write anything to it and it seems other
>>>>interrupts can only be generated by the hardware. So this idea is not
>>>>plausible
>>>>
>>>
>>>Btw, there are other users who have the same laptop model but with a
>>>different touchpad (ELAN). Their touchpads would show in
>>>/proc/bus/input/devices but are completely dead. hid-recorder which
>>>will read HID reports from /dev/hidraw gets nothing if they put there
>>>fingers on the touchpad but the polling mode could also save their
>>>touchpads. It seems GPIO controller's parent irq for the ELAN touchpad
>>>doesn't even fire once. And unlike GPIO, IO-APIC has also be used by
>>>other devices like the keyboard. So maybe it's safe to assert the root
>>>cause is from the GPIO controller.
>>>
>>>>>Other then that I'm all out of ideas I'm afraid.
>>>>>
>>>>Thank you for taking time to investigate this issue anyway! Have a nice
>>>>weekend:)
>>>>>Regards,
>>>>>
>>>>>Hans
>>>>>
>>>>
>>>>--
>>>>Best regards,
>>>>Coiby
>>>
>>>--
>>>Best regards,
>>>Coiby
>>
>>--
>>Best regards,
>>Coiby
>>
>

--
Best regards,
Coiby
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply related	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  8:31                           ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-06  8:55                             ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  8:55 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/6/20 10:31 AM, Coiby Xu wrote:
> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>> Hi Hans and Linus,
>>>
>>> I've found the direct evidence proving the GPIO interrupt controller is
>>> malfunctioning.
>>>
>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>> when playing with the GPIO sysfs interface,
>>>
>>>  - export pin130 which is used by the touchad
>>>  - set the direction to be "out"
>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>    "echo 1 > value" will make it stop firing
>>>
>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>> manually trigger an interrupt now.)
>>>
>>> I wrote a C program is to let GPIO controller quickly generate some
>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>> value with an specified time interval, i.e., set the value to 0 first
>>> and then after some time, re-set the value to 1. There is no interrupt
>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>> only see 7 interrupts for the GPIO controller's parent irq.
>>
>> That is a great find, well done.
>>
>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>> in an idle state or its clock frequency is too low by default thus not
>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>> code to configure the chip and I need a hardware reference manual of this
>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>> since I couldn't find a copy of reference manual online? What would you
>>> suggest?
>>
>> This sounds like it might have something to do with the glitch filter.
>> The code in pinctrl-amd.c to setup the trigger-type also configures
>> the glitch filter, you could try changing that code to disable the
>> glitch-filter. The defines for setting the glitch-filter bits to
>> disabled are already there.
>>
> 
> Disabling the glitch filter works like a charm! Other enthusiastic
> Linux users who have been troubled by this issue for months would
> also feel great to know this small tweaking could bring their
> touchpad back to life:) Thank you!

That is good to hear, I'm glad that we have finally found a solution.

> $ git diff
> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
> index 9a760f5cd7ed..e786d779d6c8 100644
> --- a/drivers/pinctrl/pinctrl-amd.c
> +++ b/drivers/pinctrl/pinctrl-amd.c
> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>                  irq_set_handler_locked(d, handle_level_irq);
>                  break;
> 
> I will learn more about the glitch filter and the implementation of
> pinctrl and see if I can disable glitch filter only for this touchpad.

The glitch filter likely also has a setting for how long a glitch
lasts, which apparently goes all the way up to 120ms. If it would
only delay reporting by say 0.1ms and consider any pulse longer
than 0.1ms not a glitch, then having it enabled would be fine.
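
To make the timing argument concrete, here is a toy model. It is an
assumption about how such a filter behaves, not something taken from AMD
documentation, and both function names are made up for illustration. A pulse
is reported only if it outlasts the filter window, which also bounds how
many pulses per second can get through:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not the hardware: a pulse is reported only if it outlasts the window. */
static inline bool pulse_passes_filter(uint32_t pulse_us, uint32_t window_us)
{
	return pulse_us > window_us;
}

/* Upper bound on reported pulses per second when each must outlast the window. */
static inline uint32_t max_events_per_sec(uint32_t window_us)
{
	return window_us ? 1000000u / window_us : UINT32_MAX;
}
```

With a 120ms window this model drops a 100ms pulse, passes a 130ms one, and
caps the rate at 8 per second, which is in the same range as the roughly
7 interrupts/s that were observed. With a 0.1ms window the cap would be
10000 per second, far above anything a touchpad needs.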

I don't think we want some sort of quirk here to only disable the
glitch filter for some touchpads. One approach might be to simply
disable it completely for level type irqs.

What we really need here is some input from AMD engineers on how
this is all supposed to work.

E.g. maybe the glitch filter is set up by the BIOS and we should not
touch it at all?

Or maybe, instead of DB_TYPE_PRESERVE_HIGH_GLITCH, low-level interrupts
should use DB_TYPE_PRESERVE_LOW_GLITCH? Some docs for the hw
would really help here ...

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  8:55                             ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  8:55 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

Hi,

On 10/6/20 10:31 AM, Coiby Xu wrote:
> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>> Hi Hans and Linus,
>>>
>>> I've found the direct evidence proving the GPIO interrupt controller is
>>> malfunctioning.
>>>
>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>> when playing with the GPIO sysfs interface,
>>>
>>>  - export pin130 which is used by the touchad
>>>  - set the direction to be "out"
>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>    "echo 1 > value" will make it stop firing
>>>
>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>> manually trigger an interrupt now.)
>>>
>>> I wrote a C program is to let GPIO controller quickly generate some
>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>> value with an specified time interval, i.e., set the value to 0 first
>>> and then after some time, re-set the value to 1. There is no interrupt
>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>> only see 7 interrupts for the GPIO controller's parent irq.
>>
>> That is a great find, well done.
>>
>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>> in an idle state or its clock frequency is too low by default thus not
>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>> code to configure the chip and I need a hardware reference manual of this
>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>> since I couldn't find a copy of reference manual online? What would you
>>> suggest?
>>
>> This sounds like it might have something to do with the glitch filter.
>> The code in pinctrl-amd.c to setup the trigger-type also configures
>> the glitch filter, you could try changing that code to disable the
>> glitch-filter. The defines for setting the glitch-filter bits to
>> disabled are already there.
>>
> 
> Disabling the glitch filter works like a charm! Other enthusiastic
> Linux users who have been troubled by this issue for months would
> also feel great to know this small tweaking could bring their
> touchpad back to life:) Thank you!

That is good to hear, I'm glad that we have finally found a solution.

> $ git diff
> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
> index 9a760f5cd7ed..e786d779d6c8 100644
> --- a/drivers/pinctrl/pinctrl-amd.c
> +++ b/drivers/pinctrl/pinctrl-amd.c
> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>                  irq_set_handler_locked(d, handle_level_irq);
>                  break;
> 
> I will learn more about the glitch filter and the implementation of
> pinctrl and see if I can disable glitch filter only for this touchpad.

The glitch filter likely also has a setting for how long a glitch
lasts, which apparently goes all the way up to 120ms. If it would
only delay reporting by say 0.1ms and consider any pulse longer
than 0.1ms not a glitch, then having it enabled would be fine.

I don't think we want some sort of quirk here to only disable the
glitch filter for some touchpads. One approach might be to simply
disable it completely for level type irqs.

What we really need here is some input from AMD engineers on how
this is all supposed to work.

E.g. maybe the glitch filter is set up by the BIOS and we should not
touch it at all?

Or maybe, instead of DB_TYPE_PRESERVE_HIGH_GLITCH, low-level interrupts
should use DB_TYPE_PRESERVE_LOW_GLITCH? Some docs for the hw
would really help here ...

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  8:31                           ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-06  9:16                             ` Linus Walleij
  -1 siblings, 0 replies; 84+ messages in thread
From: Linus Walleij @ 2020-10-06  9:16 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Hans de Goede, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 6, 2020 at 10:32 AM Coiby Xu <coiby.xu@gmail.com> wrote:

> Disabling the glitch filter works like a charm! Other enthusiastic
> Linux users who have been troubled by this issue for months would
> also feel great to know this small tweaking could bring their
> touchpad back to life:) Thank you!

Oh you found the bug :D

> $ git diff
> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
> index 9a760f5cd7ed..e786d779d6c8 100644
> --- a/drivers/pinctrl/pinctrl-amd.c
> +++ b/drivers/pinctrl/pinctrl-amd.c
> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>                  irq_set_handler_locked(d, handle_level_irq);
>                  break;
>
> I will learn more about the glitch filter and the implementation of
> pinctrl and see if I can disable glitch filter only for this touchpad.

Yes, we certainly need a quirk of some kind for this. Examine the ACPI
quirk infrastructure in drivers/gpio/gpiolib-acpi.c to see if you can use
that to handle this.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  9:16                             ` Linus Walleij
  0 siblings, 0 replies; 84+ messages in thread
From: Linus Walleij @ 2020-10-06  9:16 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, open list:GPIO SUBSYSTEM, wang jun,
	Hans de Goede, linux-kernel-mentees, Nehal Shah

On Tue, Oct 6, 2020 at 10:32 AM Coiby Xu <coiby.xu@gmail.com> wrote:

> Disabling the glitch filter works like a charm! Other enthusiastic
> Linux users who have been troubled by this issue for months would
> also feel great to know this small tweaking could bring their
> touchpad back to life:) Thank you!

Oh you found the bug :D

> $ git diff
> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
> index 9a760f5cd7ed..e786d779d6c8 100644
> --- a/drivers/pinctrl/pinctrl-amd.c
> +++ b/drivers/pinctrl/pinctrl-amd.c
> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>                  irq_set_handler_locked(d, handle_level_irq);
>                  break;
>
> I will learn more about the glitch filter and the implementation of
> pinctrl and see if I can disable glitch filter only for this touchpad.

Yes, we certainly need a quirk of some kind for this. Examine the ACPI
quirk infrastructure in drivers/gpio/gpiolib-acpi.c to see if you can use
that to handle this.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  8:55                             ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-06  9:28                               ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  9:28 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/6/20 10:55 AM, Hans de Goede wrote:
> Hi,
> 
> On 10/6/20 10:31 AM, Coiby Xu wrote:
>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>> Hi Hans and Linus,
>>>>
>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>> malfunctioning.
>>>>
>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>> when playing with the GPIO sysfs interface,
>>>>
>>>>  - export pin130 which is used by the touchad
>>>>  - set the direction to be "out"
>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>    "echo 1 > value" will make it stop firing
>>>>
>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>> manually trigger an interrupt now.)
>>>>
>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>> value with an specified time interval, i.e., set the value to 0 first
>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>
>>> That is a great find, well done.
>>>
>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>> in an idle state or its clock frequency is too low by default thus not
>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>> code to configure the chip and I need a hardware reference manual of this
>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>> since I couldn't find a copy of reference manual online? What would you
>>>> suggest?
>>>
>>> This sounds like it might have something to do with the glitch filter.
>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>> the glitch filter, you could try changing that code to disable the
>>> glitch-filter. The defines for setting the glitch-filter bits to
>>> disabled are already there.
>>>
>>
>> Disabling the glitch filter works like a charm! Other enthusiastic
>> Linux users who have been troubled by this issue for months would
>> also feel great to know this small tweaking could bring their
>> touchpad back to life:) Thank you!
> 
> That is good to hear, I'm glad that we have finally found a solution.
> 
>> $ git diff
>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>> index 9a760f5cd7ed..e786d779d6c8 100644
>> --- a/drivers/pinctrl/pinctrl-amd.c
>> +++ b/drivers/pinctrl/pinctrl-amd.c
>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>                  irq_set_handler_locked(d, handle_level_irq);
>>                  break;
>>
>> I will learn more about the glitch filter and the implementation of
>> pinctrl and see if I can disable glitch filter only for this touchpad.
> 
> The glitch filter likely also has settings for how long a glitch
> lasts, which apparently goes all the way up to 120ms. If it would
> only delay reporting by say 0.1ms and consider any pulse longer
> then 0.1s not a glitch, then having it enabled would be fine.
> 
> I don't think we want some sort of quirk here to only disable the
> glitch filter for some touchpads. One approach might be to simply
> disable it completely for level type irqs.
> 
> What we really need here is some input from AMD engineers with how
> this is all supposed to work.
> 
> E.g. maybe the glitch-filter is setup by the BIOS and we should not
> touch it all ?
> 
> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
> would really help here ...

So I've been digging through the history of the pinctrl-amd.c driver
and once upon a time it used to set a default debounce time of
2.75 ms.

See the patch generated by doing:

git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126

In a linux kernel checkout.

So it would be interesting to add a debugging printk to see
what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
GPIO.

I guess that it might be all 1s (0xffffffff) or some such, which
might be a way to check whether we should disable the glitch filter
for this pin?
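
A rough sketch of what such a debug decode could look like, written as plain
userspace C so it can be tried outside the kernel. The offsets and masks are
assumptions based on my reading of pinctrl-amd.h (verify them against your
tree), and struct db_fields / decode_db_fields() are hypothetical names, not
part of the driver:

```c
#include <stdint.h>

/* Assumed field layout mirroring pinctrl-amd.h; verify against your kernel tree. */
#define DB_TMR_OUT_OFF       0
#define DB_TMR_OUT_MASK      0xFUL
#define DB_TMR_OUT_UNIT_OFF  4
#define DB_CNTRL_OFF         5
#define DB_CNTRl_MASK        0x3UL
#define DB_TMR_LARGE_OFF     7

/* Hypothetical container for the debounce-related bits of one pin register. */
struct db_fields {
	uint32_t tmr_out;      /* debounce timer count */
	uint32_t tmr_out_unit; /* timer unit select bit */
	uint32_t cntrl;        /* debounce/glitch filter type */
	uint32_t tmr_large;    /* large timer unit bit */
};

/* Split a raw pin register value into its debounce-related fields. */
static inline struct db_fields decode_db_fields(uint32_t pin_reg)
{
	struct db_fields f = {
		.tmr_out      = (uint32_t)((pin_reg >> DB_TMR_OUT_OFF) & DB_TMR_OUT_MASK),
		.tmr_out_unit = (pin_reg >> DB_TMR_OUT_UNIT_OFF) & 0x1u,
		.cntrl        = (uint32_t)((pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK),
		.tmr_large    = (pin_reg >> DB_TMR_LARGE_OFF) & 0x1u,
	};
	return f;
}
```

In the driver the same four values could be printed with pr_info() right
after pin_reg is read in amd_gpio_irq_set_type(). Note that under these
assumptions DB_TMR_OUT_MASK is only 4 bits wide, so "all 1s" for the timer
count alone would read as 0xf.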

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  9:28                               ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  9:28 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

Hi,

On 10/6/20 10:55 AM, Hans de Goede wrote:
> Hi,
> 
> On 10/6/20 10:31 AM, Coiby Xu wrote:
>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>> Hi Hans and Linus,
>>>>
>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>> malfunctioning.
>>>>
>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>> when playing with the GPIO sysfs interface,
>>>>
>>>>  - export pin130 which is used by the touchad
>>>>  - set the direction to be "out"
>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>    "echo 1 > value" will make it stop firing
>>>>
>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>> manually trigger an interrupt now.)
>>>>
>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>> value with an specified time interval, i.e., set the value to 0 first
>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>
>>> That is a great find, well done.
>>>
>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>> in an idle state or its clock frequency is too low by default thus not
>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>> code to configure the chip and I need a hardware reference manual of this
>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>> since I couldn't find a copy of reference manual online? What would you
>>>> suggest?
>>>
>>> This sounds like it might have something to do with the glitch filter.
>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>> the glitch filter, you could try changing that code to disable the
>>> glitch-filter. The defines for setting the glitch-filter bits to
>>> disabled are already there.
>>>
>>
>> Disabling the glitch filter works like a charm! Other enthusiastic
>> Linux users who have been troubled by this issue for months would
>> also feel great to know this small tweaking could bring their
>> touchpad back to life:) Thank you!
> 
> That is good to hear, I'm glad that we have finally found a solution.
> 
>> $ git diff
>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>> index 9a760f5cd7ed..e786d779d6c8 100644
>> --- a/drivers/pinctrl/pinctrl-amd.c
>> +++ b/drivers/pinctrl/pinctrl-amd.c
>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>                  irq_set_handler_locked(d, handle_level_irq);
>>                  break;
>>
>> I will learn more about the glitch filter and the implementation of
>> pinctrl and see if I can disable glitch filter only for this touchpad.
> 
> The glitch filter likely also has settings for how long a glitch
> lasts, which apparently goes all the way up to 120ms. If it would
> only delay reporting by say 0.1ms and consider any pulse longer
> then 0.1s not a glitch, then having it enabled would be fine.
> 
> I don't think we want some sort of quirk here to only disable the
> glitch filter for some touchpads. One approach might be to simply
> disable it completely for level type irqs.
> 
> What we really need here is some input from AMD engineers with how
> this is all supposed to work.
> 
> E.g. maybe the glitch-filter is setup by the BIOS and we should not
> touch it all ?
> 
> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
> would really help here ...

So I've been digging through the history of the pinctrl-amd.c driver
and once upon a time it used to set a default debounce time of
2.75 ms.

See the patch generated by running the following in a Linux kernel
checkout:

git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126

So it would be interesting to add a debugging printk to see
what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
GPIO.

I guess that it might be all 1s (0xfffffffff) or some such which
might be a way to check that we should disable the glitch-filter
for this pin?
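
The check suggested above can be sketched in plain userspace C so it can be
tried outside the kernel. This is only an illustration: the field offset and
mask width are assumed to match the DB_TMR_OUT_* defines in pinctrl-amd.h,
and db_tmr_out() and db_tmr_out_is_all_ones() are hypothetical helper names,
not functions in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-bit debounce timeout field starting at bit 0 (check pinctrl-amd.h). */
#define DB_TMR_OUT_OFF   0
#define DB_TMR_OUT_MASK  0xFUL

/* Hypothetical helper: extract the debounce timeout field from a pin register value. */
static inline uint32_t db_tmr_out(uint32_t pin_reg)
{
	return (pin_reg >> DB_TMR_OUT_OFF) & DB_TMR_OUT_MASK;
}

/* Hypothetical helper: true when every bit of the timeout field is set. */
static inline bool db_tmr_out_is_all_ones(uint32_t pin_reg)
{
	return db_tmr_out(pin_reg) == DB_TMR_OUT_MASK;
}
```

Inside the driver the same expression could simply be logged with a debug
print for the pin in question; the helpers above only make the masking
explicit.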

Regards,

Hans

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  9:28                               ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-06  9:29                                 ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  9:29 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees



On 10/6/20 11:28 AM, Hans de Goede wrote:
> Hi,
> 
> On 10/6/20 10:55 AM, Hans de Goede wrote:
>> Hi,
>>
>> On 10/6/20 10:31 AM, Coiby Xu wrote:
>>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>> Hi Hans and Linus,
>>>>>
>>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>>> malfunctioning.
>>>>>
>>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>> when playing with the GPIO sysfs interface,
>>>>>
>>>>>  - export pin130 which is used by the touchad
>>>>>  - set the direction to be "out"
>>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>    "echo 1 > value" will make it stop firing
>>>>>
>>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>> manually trigger an interrupt now.)
>>>>>
>>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>>> value with an specified time interval, i.e., set the value to 0 first
>>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>>
>>>> That is a great find, well done.
>>>>
>>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>> in an idle state or its clock frequency is too low by default thus not
>>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>> code to configure the chip and I need a hardware reference manual of this
>>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>> since I couldn't find a copy of reference manual online? What would you
>>>>> suggest?
>>>>
>>>> This sounds like it might have something to do with the glitch filter.
>>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>>> the glitch filter, you could try changing that code to disable the
>>>> glitch-filter. The defines for setting the glitch-filter bits to
>>>> disabled are already there.
>>>>
>>>
>>> Disabling the glitch filter works like a charm! Other enthusiastic
>>> Linux users who have been troubled by this issue for months would
>>> also feel great to know this small tweaking could bring their
>>> touchpad back to life:) Thank you!
>>
>> That is good to hear, I'm glad that we have finally found a solution.
>>
>>> $ git diff
>>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>> index 9a760f5cd7ed..e786d779d6c8 100644
>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>                  irq_set_handler_locked(d, handle_level_irq);
>>>                  break;
>>>
>>> I will learn more about the glitch filter and the implementation of
>>> pinctrl and see if I can disable glitch filter only for this touchpad.
>>
>> The glitch filter likely also has settings for how long a glitch
>> lasts, which apparently goes all the way up to 120ms. If it would
>> only delay reporting by say 0.1ms and consider any pulse longer
>> then 0.1s not a glitch, then having it enabled would be fine.
>>
>> I don't think we want some sort of quirk here to only disable the
>> glitch filter for some touchpads. One approach might be to simply
>> disable it completely for level type irqs.
>>
>> What we really need here is some input from AMD engineers with how
>> this is all supposed to work.
>>
>> E.g. maybe the glitch-filter is setup by the BIOS and we should not
>> touch it all ?
>>
>> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>> would really help here ...
> 
> So I've been digging through the history of the pinctrl-amd.c driver
> and once upon a time it used to set a default debounce time of
> 2.75 ms.
> 
> See the patch generated by doing:
> 
> git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
> 
> In a linux kernel checkout.
> 
> So it would be interesting to add a debugging printk to see
> what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
> GPIO.
> 
> I guess that it might be all 1s (0xfffffffff) or some such which
> might be a way to check that we should disable the glitch-filter
> for this pin?

p.s.

Or maybe we should simply stop touching all the glitch-filter
related bits, in the same way as that old commit has already
removed the code setting the timing of the filter ?

At least it seems that forcing the filter to be on without
sanitizing the de-bounce time is not a good idea.
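
To make the two options concrete, here is a small userspace model of the
read-modify-write in the level-low branch. It is a sketch under assumptions:
the bit offsets and values are taken to match pinctrl-amd.h, the function
names are hypothetical, and only the bits relevant to this discussion are
modelled.

```c
#include <stdint.h>

/* Assumed register layout (verify against pinctrl-amd.h in your tree). */
#define DB_CNTRL_OFF                  5      /* bits 6:5, debounce type */
#define DB_CNTRL_MASK                 0x3UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL
#define ACTIVE_LEVEL_OFF              9      /* bits 10:9, active level */
#define ACTIVE_LEVEL_MASK             0x3UL
#define ACTIVE_LOW                    0x1UL

/* Models the current behaviour: the debounce type is overwritten. */
static inline uint32_t level_low_forcing_filter(uint32_t pin_reg)
{
	pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
	pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
	pin_reg &= ~(DB_CNTRL_MASK << DB_CNTRL_OFF);
	pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
	return pin_reg;
}

/* Models the alternative: set the active level, leave debounce bits as found. */
static inline uint32_t level_low_preserving_filter(uint32_t pin_reg)
{
	pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
	pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
	return pin_reg;
}
```

The second variant keeps whatever debounce type and timeout the firmware
programmed, so the filter type and its timing stay consistent with each other.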

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-06  9:29                                 ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-06  9:29 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah



On 10/6/20 11:28 AM, Hans de Goede wrote:
> Hi,
> 
> On 10/6/20 10:55 AM, Hans de Goede wrote:
>> Hi,
>>
>> On 10/6/20 10:31 AM, Coiby Xu wrote:
>>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>> Hi Hans and Linus,
>>>>>
>>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>>> malfunctioning.
>>>>>
>>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>> when playing with the GPIO sysfs interface,
>>>>>
>>>>>  - export pin130 which is used by the touchad
>>>>>  - set the direction to be "out"
>>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>    "echo 1 > value" will make it stop firing
>>>>>
>>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>> manually trigger an interrupt now.)
>>>>>
>>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>>> value with an specified time interval, i.e., set the value to 0 first
>>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>>
>>>> That is a great find, well done.
>>>>
>>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>> in an idle state or its clock frequency is too low by default thus not
>>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>> code to configure the chip and I need a hardware reference manual of this
>>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>> since I couldn't find a copy of reference manual online? What would you
>>>>> suggest?
>>>>
>>>> This sounds like it might have something to do with the glitch filter.
>>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>>> the glitch filter, you could try changing that code to disable the
>>>> glitch-filter. The defines for setting the glitch-filter bits to
>>>> disabled are already there.
>>>>
>>>
>>> Disabling the glitch filter works like a charm! Other enthusiastic
>>> Linux users who have been troubled by this issue for months would
>>> also feel great to know this small tweaking could bring their
>>> touchpad back to life:) Thank you!
>>
>> That is good to hear, I'm glad that we have finally found a solution.
>>
>>> $ git diff
>>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>> index 9a760f5cd7ed..e786d779d6c8 100644
>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>                  irq_set_handler_locked(d, handle_level_irq);
>>>                  break;
>>>
>>> I will learn more about the glitch filter and the implementation of
>>> pinctrl and see if I can disable glitch filter only for this touchpad.
>>
>> The glitch filter likely also has settings for how long a glitch
>> lasts, which apparently goes all the way up to 120ms. If it would
>> only delay reporting by say 0.1ms and consider any pulse longer
>> then 0.1s not a glitch, then having it enabled would be fine.
>>
>> I don't think we want some sort of quirk here to only disable the
>> glitch filter for some touchpads. One approach might be to simply
>> disable it completely for level type irqs.
>>
>> What we really need here is some input from AMD engineers with how
>> this is all supposed to work.
>>
>> E.g. maybe the glitch-filter is setup by the BIOS and we should not
>> touch it all ?
>>
>> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>> would really help here ...
> 
> So I've been digging through the history of the pinctrl-amd.c driver
> and once upon a time it used to set a default debounce time of
> 2.75 ms.
> 
> See the patch generated by doing:
> 
> git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
> 
> In a linux kernel checkout.
> 
> So it would be interesting to add a debugging printk to see
> what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
> GPIO.
> 
> I guess that it might be all 1s (0xfffffffff) or some such which
> might be a way to check that we should disable the glitch-filter
> for this pin?

p.s.

Or maybe we should simply stop touching all the glitch-filter
related bits, in the same way as that old commit has already
removed the code setting the timing of the filter ?

At least it seems that forcing the filter to be on without
sanitizing the de-bounce time is not a good idea.

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  9:28                               ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-08 16:26                                 ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-08 16:26 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 06, 2020 at 11:28:06AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>Hi,
>>
>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>Hi Hans and Linus,
>>>>>
>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>malfunctioning.
>>>>>
>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>when playing with the GPIO sysfs interface,
>>>>>
>>>>> - export pin130 which is used by the touchad
>>>>> - set the direction to be "out"
>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>   "echo 1 > value" will make it stop firing
>>>>>
>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>manually trigger an interrupt now.)
>>>>>
>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>
>>>>That is a great find, well done.
>>>>
>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>suggest?
>>>>
>>>>This sounds like it might have something to do with the glitch filter.
>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>the glitch filter, you could try changing that code to disable the
>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>disabled are already there.
>>>>
>>>
>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>Linux users who have been troubled by this issue for months would
>>>also feel great to know this small tweaking could bring their
>>>touchpad back to life:) Thank you!
>>
>>That is good to hear, I'm glad that we have finally found a solution.
>>
>>>$ git diff
>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>                 break;
>>>
>>>I will learn more about the glitch filter and the implementation of
>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>
>>The glitch filter likely also has settings for how long a glitch
>>lasts, which apparently goes all the way up to 120ms. If it would
>>only delay reporting by say 0.1ms and consider any pulse longer
>>then 0.1s not a glitch, then having it enabled would be fine.
>>
>>I don't think we want some sort of quirk here to only disable the
>>glitch filter for some touchpads. One approach might be to simply
>>disable it completely for level type irqs.
>>
>>What we really need here is some input from AMD engineers with how
>>this is all supposed to work.
>>
>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>touch it all ?
>>
>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>would really help here ...
>
>So I've been digging through the history of the pinctrl-amd.c driver
>and once upon a time it used to set a default debounce time of
>2.75 ms.
>
>See the patch generated by doing:
>
>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>
>In a linux kernel checkout.
>
>So it would be interesting to add a debugging printk to see
>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>GPIO.
>
>I guess that it might be all 1s (0xfffffffff) or some such which
>might be a way to check that we should disable the glitch-filter
>for this pin?
>

I guess you mean the value of "pin_reg" (not
"pin_reg & DB_TMR_OUT_MASK"). In that case the value is 0x500e8, which
means the BIOS or bootloader has already set the debounce for us.
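
For reference, the debounce fields of that value can be decoded with a short
userspace C sketch. Everything here is an assumption to be checked against the
driver: the bit layout is taken to match pinctrl-amd.h, the timer units are
approximations of the table in the comment of amd_gpio_set_debounce(), and
struct db_config and decode_debounce() are hypothetical names.

```c
#include <stdint.h>

/* Assumed field layout; verify against pinctrl-amd.h. */
#define DB_TMR_OUT_MASK      0xFUL   /* bits 3:0, timeout count */
#define DB_TMR_OUT_UNIT_OFF  4       /* bit 4, timer unit select */
#define DB_CNTRL_OFF         5       /* bits 6:5, debounce type */
#define DB_CNTRL_MASK        0x3UL
#define DB_TMR_LARGE_OFF     7       /* bit 7, large timer select */

struct db_config {
	unsigned int tmr_out;       /* raw timeout count */
	unsigned int unit;          /* DebounceTmrOutUnit bit */
	unsigned int type;          /* 0 = no debounce, 3 = remove glitch */
	unsigned int large;         /* DebounceTmrLarge bit */
	unsigned int timeout_usec;  /* approximate filter time, 0 if disabled */
};

static inline struct db_config decode_debounce(uint32_t pin_reg)
{
	/* Approximate timer unit in microseconds, indexed by [large][unit]. */
	static const unsigned int unit_usec[2][2] = {
		{ 61, 244 },        /* 2 and 8 RTC clocks */
		{ 15625, 62500 },   /* 512 and 2048 RTC clocks */
	};
	struct db_config c;

	c.tmr_out = pin_reg & DB_TMR_OUT_MASK;
	c.unit = (pin_reg >> DB_TMR_OUT_UNIT_OFF) & 0x1;
	c.type = (pin_reg >> DB_CNTRL_OFF) & DB_CNTRL_MASK;
	c.large = (pin_reg >> DB_TMR_LARGE_OFF) & 0x1;
	c.timeout_usec = c.type ? c.tmr_out * unit_usec[c.large][c.unit] : 0;
	return c;
}
```

Under these assumptions 0x500e8 decodes to a timeout count of 8 with the
large timer selected and the "remove glitch" type, i.e. roughly 125 ms, which
would be consistent with the ~120 ms threshold measured earlier.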

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-08 16:26                                 ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-08 16:26 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

On Tue, Oct 06, 2020 at 11:28:06AM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>Hi,
>>
>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>Hi Hans and Linus,
>>>>>
>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>malfunctioning.
>>>>>
>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>when playing with the GPIO sysfs interface,
>>>>>
>>>>> - export pin130 which is used by the touchad
>>>>> - set the direction to be "out"
>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>   "echo 1 > value" will make it stop firing
>>>>>
>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>manually trigger an interrupt now.)
>>>>>
>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>
>>>>That is a great find, well done.
>>>>
>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>suggest?
>>>>
>>>>This sounds like it might have something to do with the glitch filter.
>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>the glitch filter, you could try changing that code to disable the
>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>disabled are already there.
>>>>
>>>
>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>Linux users who have been troubled by this issue for months would
>>>also feel great to know this small tweaking could bring their
>>>touchpad back to life:) Thank you!
>>
>>That is good to hear, I'm glad that we have finally found a solution.
>>
>>>$ git diff
>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>                 break;
>>>
>>>I will learn more about the glitch filter and the implementation of
>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>
>>The glitch filter likely also has settings for how long a glitch
>>lasts, which apparently goes all the way up to 120ms. If it would
>>only delay reporting by say 0.1ms and consider any pulse longer
>>then 0.1s not a glitch, then having it enabled would be fine.
>>
>>I don't think we want some sort of quirk here to only disable the
>>glitch filter for some touchpads. One approach might be to simply
>>disable it completely for level type irqs.
>>
>>What we really need here is some input from AMD engineers with how
>>this is all supposed to work.
>>
>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>touch it all ?
>>
>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>would really help here ...
>
>So I've been digging through the history of the pinctrl-amd.c driver
>and once upon a time it used to set a default debounce time of
>2.75 ms.
>
>See the patch generated by doing:
>
>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>
>In a linux kernel checkout.
>
>So it would be interesting to add a debugging printk to see
>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>GPIO.
>
>I guess that it might be all 1s (0xfffffffff) or some such which
>might be a way to check that we should disable the glitch-filter
>for this pin?
>

I guess you mean the value of "pin_reg" (not
"pin_reg & DB_TMR_OUT_MASK"). In that case the value is 0x500e8, which
means the BIOS or bootloader has already set the debounce for us.

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  9:29                                 ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-08 16:32                                   ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-08 16:32 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>
>
>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>Hi,
>>
>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>Hi Hans and Linus,
>>>>>>
>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>malfunctioning.
>>>>>>
>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>when playing with the GPIO sysfs interface,
>>>>>>
>>>>>> - export pin130 which is used by the touchad
>>>>>> - set the direction to be "out"
>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>
>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>manually trigger an interrupt now.)
>>>>>>
>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>
>>>>>That is a great find, well done.
>>>>>
>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>suggest?
>>>>>
>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>the glitch filter, you could try changing that code to disable the
>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>disabled are already there.
>>>>>
>>>>
>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>Linux users who have been troubled by this issue for months would
>>>>also feel great to know this small tweaking could bring their
>>>>touchpad back to life:) Thank you!
>>>
>>>That is good to hear, I'm glad that we have finally found a solution.
>>>
>>>>$ git diff
>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>                 break;
>>>>
>>>>I will learn more about the glitch filter and the implementation of
>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>
>>>The glitch filter likely also has settings for how long a glitch
>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>
>>>I don't think we want some sort of quirk here to only disable the
>>>glitch filter for some touchpads. One approach might be to simply
>>>disable it completely for level type irqs.
>>>
>>>What we really need here is some input from AMD engineers with how
>>>this is all supposed to work.
>>>
>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>touch it all ?
>>>
>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>would really help here ...
>>
>>So I've been digging through the history of the pinctrl-amd.c driver
>>and once upon a time it used to set a default debounce time of
>>2.75 ms.
>>
>>See the patch generated by doing:
>>
>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>
>>In a linux kernel checkout.
>>
>>So it would be interesting to add a debugging printk to see
>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>GPIO.
>>
>>I guess that it might be all 1s (0xfffffffff) or some such which
>>might be a way to check that we should disable the glitch-filter
>>for this pin?
>
>p.s.
>
>Or maybe we should simply stop touching all the glitch-filter
>related bits, in the same way as that old commit has already
>removed the code setting the timing of the filter ?
>
>At least is seems that forcing the filter to be on without
>sanitizing the de-bounce time is not a good idea.
>

One piece of evidence that supports this: I can only find "debounce"
in ACPI Spec 6.1, and searching for "glitch" returns nothing. A debounce
timeout can be specified when configuring a pin for interrupts:

GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
          ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData)
          {PinList}

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-08 16:32                                   ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-08 16:32 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>
>
>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>Hi,
>>
>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>Hi Hans and Linus,
>>>>>>
>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>malfunctioning.
>>>>>>
>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>when playing with the GPIO sysfs interface,
>>>>>>
>>>>>> - export pin130 which is used by the touchad
>>>>>> - set the direction to be "out"
>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>
>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>manually trigger an interrupt now.)
>>>>>>
>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>
>>>>>That is a great find, well done.
>>>>>
>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>suggest?
>>>>>
>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>the glitch filter, you could try changing that code to disable the
>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>disabled are already there.
>>>>>
>>>>
>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>Linux users who have been troubled by this issue for months would
>>>>also feel great to know this small tweaking could bring their
>>>>touchpad back to life:) Thank you!
>>>
>>>That is good to hear, I'm glad that we have finally found a solution.
>>>
>>>>$ git diff
>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>                 break;
>>>>
>>>>I will learn more about the glitch filter and the implementation of
>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>
>>>The glitch filter likely also has settings for how long a glitch
>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>
>>>I don't think we want some sort of quirk here to only disable the
>>>glitch filter for some touchpads. One approach might be to simply
>>>disable it completely for level type irqs.
>>>
>>>What we really need here is some input from AMD engineers with how
>>>this is all supposed to work.
>>>
>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>touch it all ?
>>>
>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>would really help here ...
>>
>>So I've been digging through the history of the pinctrl-amd.c driver
>>and once upon a time it used to set a default debounce time of
>>2.75 ms.
>>
>>See the patch generated by doing:
>>
>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>
>>In a linux kernel checkout.
>>
>>So it would be interesting to add a debugging printk to see
>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>GPIO.
>>
>>I guess that it might be all 1s (0xfffffffff) or some such which
>>might be a way to check that we should disable the glitch-filter
>>for this pin?
>
>p.s.
>
>Or maybe we should simply stop touching all the glitch-filter
>related bits, in the same way as that old commit has already
>removed the code setting the timing of the filter ?
>
>At least is seems that forcing the filter to be on without
>sanitizing the de-bounce time is not a good idea.
>

One piece of evidence supporting this is that I can only find "debounce"
in ACPI Spec 6.1; searching for "glitch" returns nothing. A debounce
timeout can be specified when configuring a pin for an interrupt,

GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
          ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData)
          {PinList}

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  9:16                             ` [Linux-kernel-mentees] " Linus Walleij
@ 2020-10-08 16:40                               ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-08 16:40 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans de Goede, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 06, 2020 at 11:16:50AM +0200, Linus Walleij wrote:
>On Tue, Oct 6, 2020 at 10:32 AM Coiby Xu <coiby.xu@gmail.com> wrote:
>
>> Disabling the glitch filter works like a charm! Other enthusiastic
>> Linux users who have been troubled by this issue for months would
>> also feel great to know this small tweaking could bring their
>> touchpad back to life:) Thank you!
>
>Oh you found the bug :D
>

The credit should go to Hans. Thanks to his expertise, only
one shot (disabling the glitch filter) was needed. Thank you for
introducing him to me :)

>> $ git diff
>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>> index 9a760f5cd7ed..e786d779d6c8 100644
>> --- a/drivers/pinctrl/pinctrl-amd.c
>> +++ b/drivers/pinctrl/pinctrl-amd.c
>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>                  irq_set_handler_locked(d, handle_level_irq);
>>                  break;
>>
>> I will learn more about the glitch filter and the implementation of
>> pinctrl and see if I can disable glitch filter only for this touchpad.
>
>Yes we certainly need a quirk for this of some kind, examine the ACPI
>quirk infrastructure in drivers/gpio/gpiolib-acpi.c to see if you can use
>that to handle this.
>

Thank you for pointing out where I should look! Until we can confirm the
other two suggestions given by Hans, a quirk is the only foolproof approach.

>Yours,
>Linus Walleij

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-06  9:29                                 ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-14  4:24                                   ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-14  4:24 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>
>
>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>Hi,
>>
>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>Hi Hans and Linus,
>>>>>>
>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>malfunctioning.
>>>>>>
>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>when playing with the GPIO sysfs interface,
>>>>>>
>>>>>> - export pin130 which is used by the touchad
>>>>>> - set the direction to be "out"
>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>
>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>manually trigger an interrupt now.)
>>>>>>
>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>
>>>>>That is a great find, well done.
>>>>>
>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>suggest?
>>>>>
>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>the glitch filter, you could try changing that code to disable the
>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>disabled are already there.
>>>>>
>>>>
>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>Linux users who have been troubled by this issue for months would
>>>>also feel great to know this small tweaking could bring their
>>>>touchpad back to life:) Thank you!
>>>
>>>That is good to hear, I'm glad that we have finally found a solution.
>>>
>>>>$ git diff
>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>                 break;
>>>>
>>>>I will learn more about the glitch filter and the implementation of
>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>
>>>The glitch filter likely also has settings for how long a glitch
>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>
>>>I don't think we want some sort of quirk here to only disable the
>>>glitch filter for some touchpads. One approach might be to simply
>>>disable it completely for level type irqs.
>>>
>>>What we really need here is some input from AMD engineers with how
>>>this is all supposed to work.
>>>
>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>touch it all ?
>>>
>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>would really help here ...
>>
>>So I've been digging through the history of the pinctrl-amd.c driver
>>and once upon a time it used to set a default debounce time of
>>2.75 ms.
>>
>>See the patch generated by doing:
>>
>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>
>>In a linux kernel checkout.
>>
>>So it would be interesting to add a debugging printk to see
>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>GPIO.
>>
>>I guess that it might be all 1s (0xfffffffff) or some such which
>>might be a way to check that we should disable the glitch-filter
>>for this pin?
>
>p.s.
>
>Or maybe we should simply stop touching all the glitch-filter
>related bits, in the same way as that old commit has already
>removed the code setting the timing of the filter ?
>
>At least is seems that forcing the filter to be on without
>sanitizing the de-bounce time is not a good idea.
>
Today I found an inconsistency in drivers/pinctrl/pinctrl-amd.c,
so there must be a bug. As far as I understand pinctrl-amd,
"pin_reg & ~DB_CNTRl_MASK" is used in amd_gpio_set_debounce to mask
out the debouncing feature,

static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
		unsigned debounce)
{
     ...
	if (debounce) {
         ...
		if (debounce < 61) {
			pin_reg |= 1;
			pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
			pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
		...
		} else if (debounce < 1000000) {
			time = debounce / 62500;
			pin_reg |= time & DB_TMR_OUT_MASK;
			pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
			pin_reg |= BIT(DB_TMR_LARGE_OFF);
		} else {
			pin_reg &= ~DB_CNTRl_MASK;
			ret = -EINVAL;
		}

	} else {
         ...
		pin_reg &= ~DB_CNTRl_MASK;
	}
     ...
}

However, in amd_gpio_irq_set_type, "pin_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
is used,

static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
{

     ...
	case IRQ_TYPE_LEVEL_LOW:
		pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
		pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
		pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
		pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
		pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
		irq_set_handler_locked(d, handle_level_irq);
		break;
     ...
}

If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad works
flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
to mask out the debouncing filter and the bug lies in amd_gpio_irq_set_type.
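
To make the difference between the two expressions concrete, here is a
minimal sketch of which register bits each one clears. The field layout
below (a 4-bit timer value at bit 0, a 2-bit control field at offset 5) is
an assumption recalled from drivers/pinctrl/pinctrl-amd.h and should be
checked against the header; the helper names are hypothetical and do not
exist in the driver.

```c
#include <stdint.h>

/* Assumed field layout; verify against drivers/pinctrl/pinctrl-amd.h. */
#define DB_TMR_OUT_MASK	0xFUL	/* debounce timer value, bits 0-3 (assumed) */
#define DB_CNTRL_OFF	5	/* offset of the debounce control field (assumed) */
#define DB_CNTRl_MASK	0x3UL	/* two-bit control field mask (assumed) */

/* The expression used in amd_gpio_set_debounce(). */
static uint32_t clear_unshifted(uint32_t pin_reg)
{
	return pin_reg & ~DB_CNTRl_MASK;
}

/* The expression used in amd_gpio_irq_set_type(). */
static uint32_t clear_shifted(uint32_t pin_reg)
{
	return pin_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
}

/* Bits that changed between two register values, i.e. the bits cleared. */
static uint32_t cleared_bits(uint32_t before, uint32_t after)
{
	return before ^ after;
}
```

Under these assumed values and starting from pin_reg = 0xFFFFFFFF, the
unshifted form clears bits 0-1 (mask 0x3), which fall inside the assumed
timer field, while the shifted form clears bits 5-6 (mask 0x60). So the
two expressions touch different fields of the register.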

Btw, can you explain the difference between a glitch filter and a
debouncing filter? Or can you point me to some references? I've gained
some experience in configuring the GPIO controller by studying the
code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
reference manual for baytrail either). I also tweaked the configuration
in pinctrl-amd; for example, setting the debounce timeout to 976 usec
or to 3.9 msec without disabling the glitch filter also rescues the
touchpad. But I still lack the knowledge to understand why this
touchpad [1], which also uses the buggy pinctrl-amd, isn't affected.

[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-14  4:24                                   ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-14 11:34                                     ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-14 11:34 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Wed, Oct 14, 2020 at 12:24:20PM +0800, Coiby Xu wrote:
>On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>
>>
>>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>>Hi,
>>>
>>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>>Hi,
>>>>>>
>>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>>Hi Hans and Linus,
>>>>>>>
>>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>>malfunctioning.
>>>>>>>
>>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>>when playing with the GPIO sysfs interface,
>>>>>>>
>>>>>>> - export pin130 which is used by the touchad
>>>>>>> - set the direction to be "out"
>>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>>
>>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>>manually trigger an interrupt now.)
>>>>>>>
>>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>
>>>>>>That is a great find, well done.
>>>>>>
>>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>>suggest?
>>>>>>
>>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>>the glitch filter, you could try changing that code to disable the
>>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>>disabled are already there.
>>>>>>
>>>>>
>>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>>Linux users who have been troubled by this issue for months would
>>>>>also feel great to know this small tweaking could bring their
>>>>>touchpad back to life:) Thank you!
>>>>
>>>>That is good to hear, I'm glad that we have finally found a solution.
>>>>
>>>>>$ git diff
>>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>>                 break;
>>>>>
>>>>>I will learn more about the glitch filter and the implementation of
>>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>
>>>>The glitch filter likely also has settings for how long a glitch
>>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>>
>>>>I don't think we want some sort of quirk here to only disable the
>>>>glitch filter for some touchpads. One approach might be to simply
>>>>disable it completely for level type irqs.
>>>>
>>>>What we really need here is some input from AMD engineers with how
>>>>this is all supposed to work.
>>>>
>>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>>touch it all ?
>>>>
>>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>>would really help here ...
>>>
>>>So I've been digging through the history of the pinctrl-amd.c driver
>>>and once upon a time it used to set a default debounce time of
>>>2.75 ms.
>>>
>>>See the patch generated by doing:
>>>
>>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>
>>>In a linux kernel checkout.
>>>
>>>So it would be interesting to add a debugging printk to see
>>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>>GPIO.
>>>
>>>I guess that it might be all 1s (0xfffffffff) or some such which
>>>might be a way to check that we should disable the glitch-filter
>>>for this pin?
>>
>>p.s.
>>
>>Or maybe we should simply stop touching all the glitch-filter
>>related bits, in the same way as that old commit has already
>>removed the code setting the timing of the filter ?
>>
>>At least is seems that forcing the filter to be on without
>>sanitizing the de-bounce time is not a good idea.
>>
>Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
>so there must be a bug. As far as I can understand pinctrl-amd,
>"pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
>feature,
>
>static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>		unsigned debounce)
>{
>    ...
>	if (debounce) {
>        ...
>		if (debounce < 61) {
>			pin_reg |= 1;
>			pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>			pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>		...
>		} else if (debounce < 1000000) {
>			time = debounce / 62500;
>			pin_reg |= time & DB_TMR_OUT_MASK;
>			pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>			pin_reg |= BIT(DB_TMR_LARGE_OFF);
>		} else {
>			pin_reg &= ~DB_CNTRl_MASK;
>			ret = -EINVAL;
>		}
>
>	} else {
>        ...
>		pin_reg &= ~DB_CNTRl_MASK;
>	}
>    ...
>}
>
>However in amd_gpio_irq_set_type, "ping_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
>is used,
>
>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>{
>
>    ...
>	case IRQ_TYPE_LEVEL_LOW:
>		pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>		pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>		pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>		pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>		pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>		irq_set_handler_locked(d, handle_level_irq);
>		break;
>    ...
>}
>
>If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
>flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
>to mask out the debouncing filter and the bug lies in amd_gpio_set_type.
>
Sorry, I notice the touchpad is not as sensitive as before when using
"pin_reg & ~DB_CNTRl_MASK". When I use hid-recorder to read the HID
reports of the touchpad, several duplicates are read. I interpret
this as spurious interrupts being fired because the debouncing filter
is disabled. So it seems there are two mistakes in pinctrl-amd. One
mistake is that it shouldn't disable the debouncing filter here, and
the other is that the way it disables the debouncing filter is
incorrect.

>Btw, can you explain what's the difference between glitch filter and
>debouncing filter? Or can you point to some references? I've gain some
>experience about how to configure the GPIO controller by studying the
>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>reference manual for baytrail either). I also tweaked the configuration
>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>and 3.9 msec without disabling the glitch filter could also save the
>touchpad. But I need some knowledge to understand why this touchpad [1]
>which also uses the buggy pinctrl-amd isn't affected.
>
>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>>Regards,
>>
>>Hans
>>
>
>--
>Best regards,
>Coiby

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-14  4:24                                   ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-14 11:46                                     ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-14 11:46 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/14/20 6:24 AM, Coiby Xu wrote:
> On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>
>>
>> On 10/6/20 11:28 AM, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>> Hi Hans and Linus,
>>>>>>>
>>>>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>> malfunctioning.
>>>>>>>
>>>>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>> when playing with the GPIO sysfs interface,
>>>>>>>
>>>>>>>  - export pin130 which is used by the touchad
>>>>>>>  - set the direction to be "out"
>>>>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>    "echo 1 > value" will make it stop firing
>>>>>>>
>>>>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>> manually trigger an interrupt now.)
>>>>>>>
>>>>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>> value with an specified time interval, i.e., set the value to 0 first
>>>>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>
>>>>>> That is a great find, well done.
>>>>>>
>>>>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>> in an idle state or its clock frequency is too low by default thus not
>>>>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>> code to configure the chip and I need a hardware reference manual of this
>>>>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>> since I couldn't find a copy of reference manual online? What would you
>>>>>>> suggest?
>>>>>>
>>>>>> This sounds like it might have something to do with the glitch filter.
>>>>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>> the glitch filter, you could try changing that code to disable the
>>>>>> glitch-filter. The defines for setting the glitch-filter bits to
>>>>>> disabled are already there.
>>>>>>
>>>>>
>>>>> Disabling the glitch filter works like a charm! Other enthusiastic
>>>>> Linux users who have been troubled by this issue for months would
>>>>> also feel great to know this small tweaking could bring their
>>>>> touchpad back to life:) Thank you!
>>>>
>>>> That is good to hear, I'm glad that we have finally found a solution.
>>>>
>>>>> $ git diff
>>>>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>> index 9a760f5cd7ed..e786d779d6c8 100644
>>>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>>>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>                  irq_set_handler_locked(d, handle_level_irq);
>>>>>                  break;
>>>>>
>>>>> I will learn more about the glitch filter and the implementation of
>>>>> pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>
>>>> The glitch filter likely also has settings for how long a glitch
>>>> lasts, which apparently goes all the way up to 120ms. If it would
>>>> only delay reporting by say 0.1ms and consider any pulse longer
>>>> then 0.1s not a glitch, then having it enabled would be fine.
>>>>
>>>> I don't think we want some sort of quirk here to only disable the
>>>> glitch filter for some touchpads. One approach might be to simply
>>>> disable it completely for level type irqs.
>>>>
>>>> What we really need here is some input from AMD engineers with how
>>>> this is all supposed to work.
>>>>
>>>> E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>> touch it all ?
>>>>
>>>> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>> would really help here ...
>>>
>>> So I've been digging through the history of the pinctrl-amd.c driver
>>> and once upon a time it used to set a default debounce time of
>>> 2.75 ms.
>>>
>>> See the patch generated by doing:
>>>
>>> git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>
>>> In a linux kernel checkout.
>>>
>>> So it would be interesting to add a debugging printk to see
>>> what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>> GPIO.
>>>
>>> I guess that it might be all 1s (0xfffffffff) or some such which
>>> might be a way to check that we should disable the glitch-filter
>>> for this pin?
>>
>> p.s.
>>
>> Or maybe we should simply stop touching all the glitch-filter
>> related bits, in the same way as that old commit has already
>> removed the code setting the timing of the filter ?
>>
>> At least is seems that forcing the filter to be on without
>> sanitizing the de-bounce time is not a good idea.
>>
> Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
> so there must be a bug. As far as I can understand pinctrl-amd,
> "pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
> feature,
> 
> static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>          unsigned debounce)
> {
>      ...
>      if (debounce) {
>          ...
>          if (debounce < 61) {
>              pin_reg |= 1;
>              pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>              pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>          ...
>          } else if (debounce < 1000000) {
>              time = debounce / 62500;
>              pin_reg |= time & DB_TMR_OUT_MASK;
>              pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>              pin_reg |= BIT(DB_TMR_LARGE_OFF);
>          } else {
>              pin_reg &= ~DB_CNTRl_MASK;
>              ret = -EINVAL;
>          }
> 
>      } else {
>          ...
>          pin_reg &= ~DB_CNTRl_MASK;
>      }
>      ...
> }
> 
> However in amd_gpio_irq_set_type, "ping_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
> is used,
> 
> static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
> {
> 
>      ...
>      case IRQ_TYPE_LEVEL_LOW:
>          pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>          pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>          pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>          pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>          pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>          irq_set_handler_locked(d, handle_level_irq);
>          break;
>      ...
> }
> 
> If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
> flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
> to mask out the debouncing filter and the bug lies in amd_gpio_set_type.

I'm afraid that is not the case. The current code is correct:
it clears bits 5 and 6 of the register, which are the bits that control
the debounce type.

You mentioned in an earlier mail that the value of the register is
0x500e8 before this function runs.

If you drop the "<< DB_CNTRL_OFF" part, you are instead masking out
bits 0 and 1, which are already 0, so the mask becomes a no-op.
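
The no-op can be checked with plain integer arithmetic. The sketch
below is standalone user-space C, not driver code. It assumes
DB_CNTRl_MASK is 0x3 and DB_CNTRL_OFF is 5 (which is what "bits 5
and 6" implies), and clear_shifted() / clear_unshifted() are
hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/* Assumed values, chosen to match "bits 5 and 6" in the text above. */
#define DB_CNTRL_OFF   5
#define DB_CNTRl_MASK  0x3UL

/* Mask as written in amd_gpio_irq_set_type(): clears bits 5 and 6. */
static uint32_t clear_shifted(uint32_t pin_reg)
{
	return pin_reg & ~(uint32_t)(DB_CNTRl_MASK << DB_CNTRL_OFF);
}

/* Mask with "<< DB_CNTRL_OFF" dropped: clears bits 0 and 1 instead. */
static uint32_t clear_unshifted(uint32_t pin_reg)
{
	return pin_reg & ~(uint32_t)DB_CNTRl_MASK;
}
```

With the 0x500e8 value from the earlier mail, clear_shifted() gives
0x50088 while clear_unshifted() returns 0x500e8 unchanged.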

> Btw, can you explain what's the difference between glitch filter and
> debouncing filter?

There is no difference. The driver mixes the terms, but they both refer
to the same thing. This is most clear in the defines for the DB_CNTRL
bits (bits 5 and 6 of the register):

#define DB_TYPE_NO_DEBOUNCE               0x0UL
#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
#define DB_TYPE_REMOVE_GLITCH             0x3UL

Which is interesting because bits 5 and 6 are both 1 as set by the BIOS,
so with your little hack to drop the "<< DB_CNTRL_OFF" you are in essence
keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.
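
Decoding the register value makes this concrete. The sketch below is
a standalone helper, not driver code. The field offsets are assumed
to match pinctrl-amd.h, and the timer units in unit_usec are
assumptions recalled from the comment in amd_gpio_set_debounce()
(only the 61 and 62500 usec steps appear in the code quoted above).
decode_debounce() and debounce_usec() are hypothetical names.

```c
#include <stdint.h>

/* Field layout assumed from pinctrl-amd.h; treat as assumptions. */
#define DB_TMR_OUT_MASK      0xFUL
#define DB_TMR_OUT_UNIT_OFF  4
#define DB_CNTRL_OFF         5
#define DB_CNTRl_MASK        0x3UL
#define DB_TMR_LARGE_OFF     7

struct db_fields {
	unsigned int type;   /* DB_TYPE_* value, bits 5 and 6 */
	unsigned int tmr;    /* timer count, bits 0..3 */
	unsigned int unit;   /* DB_TMR_OUT_UNIT bit */
	unsigned int large;  /* DB_TMR_LARGE bit */
};

/* Split the debounce-related bits out of a pin register value. */
static struct db_fields decode_debounce(uint32_t pin_reg)
{
	struct db_fields f;

	f.type  = (unsigned int)((pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK);
	f.tmr   = (unsigned int)(pin_reg & DB_TMR_OUT_MASK);
	f.unit  = (pin_reg >> DB_TMR_OUT_UNIT_OFF) & 0x1;
	f.large = (pin_reg >> DB_TMR_LARGE_OFF) & 0x1;
	return f;
}

/* Approximate debounce time in usec, using the assumed unit table. */
static unsigned int debounce_usec(const struct db_fields *f)
{
	static const unsigned int unit_usec[2][2] = {
		{ 61, 244 },       /* large = 0: unit = 0, unit = 1 */
		{ 15625, 62500 },  /* large = 1: unit = 0, unit = 1 */
	};

	return f->tmr * unit_usec[f->large][f->unit];
}
```

Under these assumptions 0x500e8 decodes to type 3
(DB_TYPE_REMOVE_GLITCH) with a timer count of 8 in the 15.6 msec
unit, so roughly 125 msec, which would be in line with the ~120ms
threshold reported earlier in this thread.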

So it seems that the problem is that the irq_set_type code changes
the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apparently breaks
things.

To test this you could replace the:

DB_TYPE_PRESERVE_HIGH_GLITCH

bit in the case IRQ_TYPE_LEVEL_LOW path with:

DB_TYPE_REMOVE_GLITCH

Which I would expect to also fix your touchpad.
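
As a rough illustration of that experiment, the sketch below redoes
the IRQ_TYPE_LEVEL_LOW register arithmetic in user space with the
filter type as a parameter. It is not a patch. The offsets and
values for LEVEL_TRIG_OFF, ACTIVE_LEVEL_OFF, LEVEL_TRIGGER and
ACTIVE_LOW are assumed to match pinctrl-amd.h, and
level_low_config() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed to match pinctrl-amd.h; treat as assumptions. */
#define DB_CNTRL_OFF        5
#define LEVEL_TRIG_OFF      8
#define ACTIVE_LEVEL_OFF    9

#define DB_CNTRl_MASK       0x3UL
#define ACTIVE_LEVEL_MASK   0x3UL

#define LEVEL_TRIGGER       0x1UL
#define ACTIVE_LOW          0x1UL

#define DB_TYPE_NO_DEBOUNCE           0x0UL
#define DB_TYPE_PRESERVE_LOW_GLITCH   0x1UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL
#define DB_TYPE_REMOVE_GLITCH         0x3UL

/*
 * Same bit operations as the IRQ_TYPE_LEVEL_LOW case quoted above,
 * but with the DB_TYPE_* value passed in so each variant can be tried.
 */
static uint32_t level_low_config(uint32_t pin_reg, unsigned long db_type)
{
	pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
	pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
	pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
	pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
	pin_reg |= db_type << DB_CNTRL_OFF;
	return pin_reg;
}
```

Starting from 0x500e8, DB_TYPE_REMOVE_GLITCH yields 0x503e8 (bits 5
and 6 stay as the BIOS set them), while the current
DB_TYPE_PRESERVE_HIGH_GLITCH yields 0x503c8.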

If that is the case an interesting experiment would be to
replace DB_TYPE_PRESERVE_HIGH_GLITCH with
DB_TYPE_PRESERVE_LOW_GLITCH instead.

I've never seen this kind of glitch/debounce filter before, where
you can filter out only one type of level, so I wonder if the code
maybe simply got it wrong. Also, for a level type irq I really see
no objection to just using DB_TYPE_REMOVE_GLITCH instead of the
weird "half" filters.

So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
has been there from the very first commit of this driver.
I wonder if it has been wrong all this time and should be
inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).

I think we may want to just play it safe though and simply
switch to DB_TYPE_REMOVE_GLITCH as we already do for all
edge types and when amd_gpio_set_config() gets called!

Linus, what do you think about just switching to
DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
with all the other modes) and not mucking with these weird,
undocumented "half" filter modes?

> Or can you point to some references? I've gain some
> experience about how to configure the GPIO controller by studying the
> code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
> reference manual for baytrail either). I also tweaked the configuration
> in pinctrl-amd, for example, setting the debounce timeout to 976 usec
> and 3.9 msec without disabling the glitch filter could also save the
> touchpad. But I need some knowledge to understand why this touchpad [1]
> which also uses the buggy pinctrl-amd isn't affected.
> 
> [1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095

My guess would be that it uses edge type interrupts instead?
I have seen that quite a few times, even though it is weird
to do that for i2c devices.

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-14 11:46                                     ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-14 11:46 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

Hi,

On 10/14/20 6:24 AM, Coiby Xu wrote:
> On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>
>>
>> On 10/6/20 11:28 AM, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>> On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>> Hi Hans and Linus,
>>>>>>>
>>>>>>> I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>> malfunctioning.
>>>>>>>
>>>>>>> I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>> when playing with the GPIO sysfs interface,
>>>>>>>
>>>>>>>  - export pin130 which is used by the touchad
>>>>>>>  - set the direction to be "out"
>>>>>>>  - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>    "echo 1 > value" will make it stop firing
>>>>>>>
>>>>>>> (I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>> manually trigger an interrupt now.)
>>>>>>>
>>>>>>> I wrote a C program is to let GPIO controller quickly generate some
>>>>>>> interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>> value with an specified time interval, i.e., set the value to 0 first
>>>>>>> and then after some time, re-set the value to 1. There is no interrupt
>>>>>>> firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>> only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>
>>>>>> That is a great find, well done.
>>>>>>
>>>>>>> My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>> in an idle state or its clock frequency is too low by default thus not
>>>>>>> quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>> code to configure the chip and I need a hardware reference manual of this
>>>>>>> GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>> since I couldn't find a copy of reference manual online? What would you
>>>>>>> suggest?
>>>>>>
>>>>>> This sounds like it might have something to do with the glitch filter.
>>>>>> The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>> the glitch filter, you could try changing that code to disable the
>>>>>> glitch-filter. The defines for setting the glitch-filter bits to
>>>>>> disabled are already there.
>>>>>>
>>>>>
>>>>> Disabling the glitch filter works like a charm! Other enthusiastic
>>>>> Linux users who have been troubled by this issue for months would
>>>>> also feel great to know this small tweaking could bring their
>>>>> touchpad back to life:) Thank you!
>>>>
>>>> That is good to hear, I'm glad that we have finally found a solution.
>>>>
>>>>> $ git diff
>>>>> diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>> index 9a760f5cd7ed..e786d779d6c8 100644
>>>>> --- a/drivers/pinctrl/pinctrl-amd.c
>>>>> +++ b/drivers/pinctrl/pinctrl-amd.c
>>>>> @@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>                  pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>                  pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>                  pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>> -               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>> +               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>                  irq_set_handler_locked(d, handle_level_irq);
>>>>>                  break;
>>>>>
>>>>> I will learn more about the glitch filter and the implementation of
>>>>> pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>
>>>> The glitch filter likely also has settings for how long a glitch
>>>> lasts, which apparently goes all the way up to 120ms. If it would
>>>> only delay reporting by say 0.1ms and consider any pulse longer
>>>> then 0.1s not a glitch, then having it enabled would be fine.
>>>>
>>>> I don't think we want some sort of quirk here to only disable the
>>>> glitch filter for some touchpads. One approach might be to simply
>>>> disable it completely for level type irqs.
>>>>
>>>> What we really need here is some input from AMD engineers with how
>>>> this is all supposed to work.
>>>>
>>>> E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>> touch it all ?
>>>>
>>>> Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>> should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>> would really help here ...
>>>
>>> So I've been digging through the history of the pinctrl-amd.c driver
>>> and once upon a time it used to set a default debounce time of
>>> 2.75 ms.
>>>
>>> See the patch generated by doing:
>>>
>>> git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>
>>> In a linux kernel checkout.
>>>
>>> So it would be interesting to add a debugging printk to see
>>> what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>> GPIO.
>>>
>>> I guess that it might be all 1s (0xfffffffff) or some such which
>>> might be a way to check that we should disable the glitch-filter
>>> for this pin?
>>
>> p.s.
>>
>> Or maybe we should simply stop touching all the glitch-filter
>> related bits, in the same way as that old commit has already
>> removed the code setting the timing of the filter ?
>>
>> At least is seems that forcing the filter to be on without
>> sanitizing the de-bounce time is not a good idea.
>>
> Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
> so there must be a bug. As far as I can understand pinctrl-amd,
> "pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
> feature,
> 
> static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>          unsigned debounce)
> {
>      ...
>      if (debounce) {
>          ...
>          if (debounce < 61) {
>              pin_reg |= 1;
>              pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>              pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>          ...
>          } else if (debounce < 1000000) {
>              time = debounce / 62500;
>              pin_reg |= time & DB_TMR_OUT_MASK;
>              pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>              pin_reg |= BIT(DB_TMR_LARGE_OFF);
>          } else {
>              pin_reg &= ~DB_CNTRl_MASK;
>              ret = -EINVAL;
>          }
> 
>      } else {
>          ...
>          pin_reg &= ~DB_CNTRl_MASK;
>      }
>      ...
> }
> 
> However in amd_gpio_irq_set_type, "ping_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
> is used,
> 
> static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
> {
> 
>      ...
>      case IRQ_TYPE_LEVEL_LOW:
>          pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>          pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>          pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>          pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>          pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>          irq_set_handler_locked(d, handle_level_irq);
>          break;
>      ...
> }
> 
> If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
> flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
> to mask out the debouncing filter and the bug lies in amd_gpio_set_type.

I'm afraid that that is not the case, the current code is correct,
it clears bit 5 and 6 of the register which are the bits which control
the debounce type.

You mentioned in an earlier mail that the value of the register is
0x500e8 before this function runs.

If you drop the "<< DB_CNTRL_OFF" part then instead you are masking out
bits 0 and 1 which are already 0, so the mask becomes a no-op.

> Btw, can you explain what's the difference between glitch filter and
> debouncing filter?

There is no difference the driver mixes the terms, but they both refer
to the same thing this is most clear in the defines for the DB_CNTRL bits
(bits 5 and 6 of the register):

#define DB_TYPE_NO_DEBOUNCE               0x0UL
#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
#define DB_TYPE_REMOVE_GLITCH             0x3UL

Which is interesting because bits 5 and 6 are both 1 as set by the BIOS,
so with your little hack to dro the "<< DB_CNTRL_OFF" you are in essence
keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.

So it seems that the problem is that the irq_set_type code changes
the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apperently breaks
things.

To test this you could replace the:

DB_TYPE_PRESERVE_HIGH_GLITCH

bit in the case IRQ_TYPE_LEVEL_LOW path with:

DB_TYPE_REMOVE_GLITCH

Which I would expect to also fix your touchpad.
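
As a rough sketch of what that experiment does to the register (again
user-space code, not the driver): the helper below replaces only the
debounce-type field. set_db_type() and get_db_type() are hypothetical
names, and DB_CNTRL_OFF = 5 / DB_CNTRl_MASK = 0x3 are assumed from the
"bits 5 and 6" description.

```c
#include <assert.h>
#include <stdint.h>

#define DB_CNTRL_OFF                  5      /* assumed: field starts at bit 5 */
#define DB_CNTRl_MASK                 0x3UL  /* assumed: two-bit field */
#define DB_TYPE_NO_DEBOUNCE           0x0UL
#define DB_TYPE_PRESERVE_LOW_GLITCH   0x1UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL
#define DB_TYPE_REMOVE_GLITCH         0x3UL

/* Hypothetical helper: replace the debounce-type field, keep other bits. */
static uint32_t set_db_type(uint32_t pin_reg, uint32_t type)
{
	assert(type <= DB_CNTRl_MASK);
	pin_reg &= ~(uint32_t)(DB_CNTRl_MASK << DB_CNTRL_OFF);
	pin_reg |= type << DB_CNTRL_OFF;
	return pin_reg;
}

/* Hypothetical helper: read the debounce-type field back. */
static uint32_t get_db_type(uint32_t pin_reg)
{
	return (pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK;
}
```

With the BIOS value 0x500e8, set_db_type(reg, DB_TYPE_PRESERVE_HIGH_GLITCH)
gives 0x500c8 (what the level-low path produces today), while
set_db_type(reg, DB_TYPE_REMOVE_GLITCH) leaves the register at 0x500e8.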

If that is the case an interesting experiment would be to
replace DB_TYPE_PRESERVE_HIGH_GLITCH with
DB_TYPE_PRESERVE_LOW_GLITCH instead.

I've never seen this kind of glitch/debounce filter, where
you can filter out only one type of level, before, so
I wonder if the code maybe simply got it wrong. Also, for
a level type irq I really see no objection to just
using DB_TYPE_REMOVE_GLITCH instead of the weird "half"
filters.

So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
has been there from the very first commit of this driver.
I wonder if it has been wrong all this time and should be
inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).

I think we may want to just play it safe though and simply
switch to DB_TYPE_REMOVE_GLITCH as we already do for all
edge types and when amd_gpio_set_config() gets called!

Linus, what do you think about just switching to
DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
with all the other modes) and not mucking with these weird,
undocumented "half" filter modes?

> Or can you point to some references? I've gain some
> experience about how to configure the GPIO controller by studying the
> code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
> reference manual for baytrail either). I also tweaked the configuration
> in pinctrl-amd, for example, setting the debounce timeout to 976 usec
> and 3.9 msec without disabling the glitch filter could also save the
> touchpad. But I need some knowledge to understand why this touchpad [1]
> which also uses the buggy pinctrl-amd isn't affected.
> 
> [1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095

My guess would be that it uses edge type interrupts instead?
I have seen that quite a few times, even though it is weird
to do that for i2c devices.

Regards,

Hans

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-14 11:46                                     ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-15  3:27                                       ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-15  3:27 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Wed, Oct 14, 2020 at 01:46:14PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/14/20 6:24 AM, Coiby Xu wrote:
>>On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>>
>>>
>>>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>>>Hi,
>>>>>>>
>>>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>>>Hi Hans and Linus,
>>>>>>>>
>>>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>>>malfunctioning.
>>>>>>>>
>>>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>>>when playing with the GPIO sysfs interface,
>>>>>>>>
>>>>>>>> - export pin130 which is used by the touchad
>>>>>>>> - set the direction to be "out"
>>>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>>>
>>>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>>>manually trigger an interrupt now.)
>>>>>>>>
>>>>>>>>I wrote a C program is to let GPIO controller quickly generate some
>>>>>>>>interrupts then disable the firing of interrupts by toggling pin#130's
>>>>>>>>value with an specified time interval, i.e., set the value to 0 first
>>>>>>>>and then after some time, re-set the value to 1. There is no interrupt
>>>>>>>>firing unless time internal > 120ms (~7Hz). This explains why we can
>>>>>>>>only see 7 interrupts for the GPIO controller's parent irq.
>>>>>>>
>>>>>>>That is a great find, well done.
>>>>>>>
>>>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>>>suggest?
>>>>>>>
>>>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>>>the glitch filter, you could try changing that code to disable the
>>>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>>>disabled are already there.
>>>>>>>
>>>>>>
>>>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>>>Linux users who have been troubled by this issue for months would
>>>>>>also feel great to know this small tweaking could bring their
>>>>>>touchpad back to life:) Thank you!
>>>>>
>>>>>That is good to hear, I'm glad that we have finally found a solution.
>>>>>
>>>>>>$ git diff
>>>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>>>                 break;
>>>>>>
>>>>>>I will learn more about the glitch filter and the implementation of
>>>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>>
>>>>>The glitch filter likely also has settings for how long a glitch
>>>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>>>then 0.1s not a glitch, then having it enabled would be fine.
>>>>>
>>>>>I don't think we want some sort of quirk here to only disable the
>>>>>glitch filter for some touchpads. One approach might be to simply
>>>>>disable it completely for level type irqs.
>>>>>
>>>>>What we really need here is some input from AMD engineers with how
>>>>>this is all supposed to work.
>>>>>
>>>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>>>touch it all ?
>>>>>
>>>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>>>would really help here ...
>>>>
>>>>So I've been digging through the history of the pinctrl-amd.c driver
>>>>and once upon a time it used to set a default debounce time of
>>>>2.75 ms.
>>>>
>>>>See the patch generated by doing:
>>>>
>>>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>>
>>>>In a linux kernel checkout.
>>>>
>>>>So it would be interesting to add a debugging printk to see
>>>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>>>GPIO.
>>>>
>>>>I guess that it might be all 1s (0xfffffffff) or some such which
>>>>might be a way to check that we should disable the glitch-filter
>>>>for this pin?
>>>
>>>p.s.
>>>
>>>Or maybe we should simply stop touching all the glitch-filter
>>>related bits, in the same way as that old commit has already
>>>removed the code setting the timing of the filter ?
>>>
>>>At least is seems that forcing the filter to be on without
>>>sanitizing the de-bounce time is not a good idea.
>>>
>>Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
>>so there must be a bug. As far as I can understand pinctrl-amd,
>>"pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
>>feature,
>>
>>static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>>         unsigned debounce)
>>{
>>     ...
>>     if (debounce) {
>>         ...
>>         if (debounce < 61) {
>>             pin_reg |= 1;
>>             pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>>         ...
>>         } else if (debounce < 1000000) {
>>             time = debounce / 62500;
>>             pin_reg |= time & DB_TMR_OUT_MASK;
>>             pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg |= BIT(DB_TMR_LARGE_OFF);
>>         } else {
>>             pin_reg &= ~DB_CNTRl_MASK;
>>             ret = -EINVAL;
>>         }
>>
>>     } else {
>>         ...
>>         pin_reg &= ~DB_CNTRl_MASK;
>>     }
>>     ...
>>}
>>
>>However in amd_gpio_irq_set_type, "ping_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
>>is used,
>>
>>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>{
>>
>>     ...
>>     case IRQ_TYPE_LEVEL_LOW:
>>         pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>>         pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>         pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>         pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>         pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>         irq_set_handler_locked(d, handle_level_irq);
>>         break;
>>     ...
>>}
>>
>>If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
>>flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
>>to mask out the debouncing filter and the bug lies in amd_gpio_set_type.
>
>I'm afraid that that is not the case, the current code is correct,
>it clears bit 5 and 6 of the register which are the bits which control
>the debounce type.
>
Thank you for the explanation. As mentioned in another email (that email
was supposed to be delivered yesterday, but I forgot to run
msmtp-runqueue.sh to send offline emails), this hack led to some issues.
So it must be amd_gpio_set_debounce that makes the mistake of incorrectly
masking out the bits controlling the debounce type. Btw,
amd_gpio_set_debounce seems to be never used because
"struct acpi_gpio_info" doesn't have the debounce_timeout field. So the
bug has never been exposed.
>You mentioned in an earlier mail that the value of the register is
>0x500e8 before this function runs.
>
>If you drop the "<< DB_CNTRL_OFF" part then instead you are masking out
>bits 0 and 1 which are already 0, so the mask becomes a no-op.
>
>>Btw, can you explain what's the difference between glitch filter and
>>debouncing filter?
>
>There is no difference the driver mixes the terms, but they both refer
>to the same thing this is most clear in the defines for the DB_CNTRL bits
>(bits 5 and 6 of the register):
>
>#define DB_TYPE_NO_DEBOUNCE               0x0UL
>#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
>#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
>#define DB_TYPE_REMOVE_GLITCH             0x3UL
>
Thank you for the clarification! This makes it much easier to
understand the behaviour of the GPIO controller.

>Which is interesting because bits 5 and 6 are both 1 as set by the BIOS,
>so with your little hack to dro the "<< DB_CNTRL_OFF" you are in essence
>keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.
>
But the line before the hacked line is,
                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);

which will mask out bits 5 and 6. So my little hack essentially disables
the glitch filter.
>So it seems that the problem is that the irq_set_type code changes
>the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
>glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apperently breaks
>things.
>
>To test this you could replace the:
>
>DB_TYPE_PRESERVE_HIGH_GLITCH
>
>bit in the case IRQ_TYPE_LEVEL_LOW path with:
>
>DB_TYPE_REMOVE_GLITCH
>
>Which I would expect to also fix your touchpad.
>
Changing to DB_TYPE_REMOVE_GLITCH doesn't completely fix the
touchpad. The touchpad is less responsive than with the hack of
disabling the glitch filter; for example, a two-finger touch often
triggers a right-click. hid-recorder shows that duplicate HID reports
are being received.

However, if I set the debounce timeout to 610us, the touchpad works
flawlessly and there are no duplicate HID reports.
>If that is the case an interesting experiment would be to
>replace DB_TYPE_PRESERVE_HIGH_GLITCH with
>DB_TYPE_PRESERVE_LOW_GLITCH instead.
>
Changing to DB_TYPE_PRESERVE_LOW_GLITCH could save the touchpad.
Although hid-recorder also shows duplicate HID reports, the touchpad
works flawlessly (at least I couldn't notice any problem).

I also did other experiments and found that using
DB_TYPE_PRESERVE_HIGH_GLITCH with the debounce timeout set to 610us
could also save the touchpad.

Btw, based on the code of set_debounce, I calculated the debounce
timeout set by the BIOS and found the value is 124.8ms. This may explain
why ~7 interrupts are fired when DB_TYPE_PRESERVE_HIGH_GLITCH is used.
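
The arithmetic behind 124.8ms, as a minimal sketch. The register value
0x500e8 is the one mentioned earlier in this thread; DB_TMR_OUT_MASK = 0xF
and the 15600us timer unit are assumptions on my side (it is the unit that
reproduces 124.8ms), and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DB_TMR_OUT_MASK 0xFUL  /* assumed: timer count in the low four bits */

/* Hypothetical helper: timer count times timer unit, in microseconds. */
static uint32_t debounce_timeout_us(uint32_t pin_reg, uint32_t unit_us)
{
	assert(unit_us > 0);
	return (uint32_t)(pin_reg & DB_TMR_OUT_MASK) * unit_us;
}

/*
 * Rough upper bound on pulses per second, assuming every pulse has to
 * outlast the filter before it is reported.
 */
static double max_rate_hz(uint32_t timeout_us)
{
	assert(timeout_us > 0);
	return 1e6 / (double)timeout_us;
}
```

debounce_timeout_us(0x500e8, 15600) gives 124800us, and
max_rate_hz(124800) is about 8, which is in the same ballpark as the ~7
interrupts per second observed.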

I tried to come up with a minimal set of rules to explain all observations
about this GPIO controller,
  - the value read from the register equals the value written to the
    register
  - when the touchpad signals through its interrupt line that new data
    has arrived, there are multiple cycles of signal bouncing, i.e.,
    spurious interrupts would be fired
  - when the debounce timeout is set, the GPIO chip waits for the
    specified time to collect enough data to judge whether this is a
    valid signal, thus eliminating spurious interrupts
  - DB_TYPE_PRESERVE_HIGH_GLITCH is for filtering high input while
    DB_TYPE_PRESERVE_LOW_GLITCH is for filtering low input

but obviously the above set of rules cannot explain,
  - when the debounce filter is disabled, hid-recorder reads no duplicate
    HID reports, which indicates no spurious interrupts
  - with DB_TYPE_REMOVE_GLITCH and the default debounce timeout of
    124.8ms, the interrupt fires at a much higher rate than 7Hz
>I've never seen this kinda glitch/debounce filter where
>you can filter out only one type of level before, so
>I wonder if the code maybe simply got it wrong, also for
>a level type irq I really see no objection to just
>use DB_TYPE_REMOVE_GLITCH instead of the weird "half"
>filters.
>
>So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
>has been there from the very first commit of this driver,
>I wonder if it has been wrong all this time and should be
>inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).
>
>I think we may want to just play it safe though and simply
>switch to DB_TYPE_REMOVE_GLITCH as we already do for all
>edge types and when amd_gpio_set_config() gets called!
>
>Linus, what do you think about just switching to
>DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
>with all the other modes) and not mucking with this weird,
>undocumented "half" filter modes ?
>
>>Or can you point to some references? I've gain some
>>experience about how to configure the GPIO controller by studying the
>>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>>reference manual for baytrail either). I also tweaked the configuration
>>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>>and 3.9 msec without disabling the glitch filter could also save the
>>touchpad. But I need some knowledge to understand why this touchpad [1]
>>which also uses the buggy pinctrl-amd isn't affected.
>>
>>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>My guess would be that it uses edge types interrupts instead ?
>I have seen that quite a few times, even though it is weird
>to do that for i2c devices.
>
>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-14 11:46                                     ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-15  4:06                                       ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-15  4:06 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, open list:GPIO SUBSYSTEM, wang jun, Nehal Shah,
	Shyam Sundar S K, linux-kernel-mentees

On Wed, Oct 14, 2020 at 01:46:14PM +0200, Hans de Goede wrote:
>Hi,
>
[...]
>
>I've never seen this kinda glitch/debounce filter where
>you can filter out only one type of level before, so
>I wonder if the code maybe simply got it wrong, also for
>a level type irq I really see no objection to just
>use DB_TYPE_REMOVE_GLITCH instead of the weird "half"
>filters.
>
>So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
>has been there from the very first commit of this driver,
>I wonder if it has been wrong all this time and should be
>inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).
>
>I think we may want to just play it safe though and simply
>switch to DB_TYPE_REMOVE_GLITCH as we already do for all
>edge types and when amd_gpio_set_config() gets called!
>
>Linus, what do you think about just switching to
>DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
>with all the other modes) and not mucking with this weird,
>undocumented "half" filter modes ?
>
>>Or can you point to some references? I've gained some
>>experience about how to configure the GPIO controller by studying the
>>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>>reference manual for baytrail either). I also tweaked the configuration
>>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>>and 3.9 msec without disabling the glitch filter could also save the
>>touchpad. But I need some knowledge to understand why this touchpad [1]
>>which also uses the buggy pinctrl-amd isn't affected.
>>
>>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>My guess would be that it uses edge types interrupts instead ?
>I have seen that quite a few times, even though it is weird
>to do that for i2c devices.
>
Actually it uses the level type interrupt according to the shared
DSDT.dsl,

         Device (TPDA)
         {
             Name (_HID, "SYNA2B3F")  // _HID: Hardware ID
             Name (_CID, "PNP0C50" /* HID Protocol Device (I2C bus) */)  // _CID: Compatible ID
             Name (_UID, 0x08)  // _UID: Unique ID
             Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
             {
                 Name (RBUF, ResourceTemplate ()
                 {
                     I2cSerialBusV2 (0x002C, ControllerInitiated, 0x00061A80,
                         AddressingMode7Bit, "\\_SB.I2CD",
                         0x00, ResourceConsumer, , Exclusive,
                         )
                     GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                         "\\_SB.GPIO", 0x00, ResourceConsumer, ,
                         )
                         {   // Pin list
                             0x0003
                         }
                 })
                 Return (RBUF) /* \_SB_.I2CD.TPDA._CRS.RBUF */
             }

>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-14 11:46                                     ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-26 22:54                                       ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-26 22:54 UTC (permalink / raw)
  To: Hans de Goede, Linus Walleij
  Cc: open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Hi Hans and Linus,

Would you interpret a debounce timeout of 0x0000 in the GPIO
Interrupt Connection Resource Descriptor as disabling the debounce
filter?

GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}

I'm not sure whether Windows' implementation is the de facto standard
here, as it is for i2c-hid. But if we conform to the ACPI spec and
treat a debounce timeout of 0x0000 as disabling the debounce filter,
then we can fix this touchpad issue, and potentially some related
issues, by adding support for configuring the debounce timeout in
drivers/gpio/gpiolib-acpi.c and removing all debounce filter
configuration from amd_gpio_irq_set_type in
drivers/pinctrl/pinctrl-amd.c. What do you think?

As supporting evidence, I collected five DSDT tables while
investigating this issue. Each of the five tables has a GpioInt that
specifies a non-zero debounce timeout for an edge type irq, and every
level type irq has its debounce timeout set to 0x0000.
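
To make the proposal concrete, here is a minimal user-space sketch (not
the gpiolib-acpi code) of how the DebounceTimeout field could be
interpreted. It assumes the unit the ACPI spec gives for GpioInt,
hundredths of a millisecond, so 0x0BB8 would be 30 ms and 0x2710 would
be 100 ms. The helper names are made up for illustration, and treating
0x0000 as "disable" is exactly the interpretation being asked about,
not something the sketch proves.

```c
#include <stdint.h>

/* Hypothetical names; only the unit (1/100 ms) comes from the ACPI spec. */
enum debounce_action {
	DEBOUNCE_DISABLE,	/* DebounceTimeout == 0: turn the filter off */
	DEBOUNCE_SET,		/* non-zero: program the requested timeout */
};

/* Convert a GpioInt DebounceTimeout (hundredths of a millisecond) to usec. */
static uint32_t acpi_debounce_to_usec(uint16_t debounce_timeout)
{
	return (uint32_t)debounce_timeout * 10;
}

/* Decide what a GPIO driver would be asked to do for a given descriptor. */
static enum debounce_action acpi_debounce_action(uint16_t debounce_timeout)
{
	return debounce_timeout ? DEBOUNCE_SET : DEBOUNCE_DISABLE;
}
```

With this reading, every level type GpioInt listed below would ask for
the filter to be disabled.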


GpioInt in the five collected DSDT tables
=========================================

$ grep -rIi -A1 gpioint 1/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
              GpioInt (Level, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, _Y24,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Edge, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Edge, ActiveHigh, SharedAndWake, PullNone, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,

$ grep -rIi -A1 gpioint 2/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,


$ grep -rIi -A1 gpioint 3/dsdt.dsl
                      GpioInt (Edge, ActiveBoth, SharedAndWake, PullNone, 0x2710,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y5B,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y5D,

$ grep -rIi -A1 gpioint 4/dsdt.dsl
                      GpioInt (Edge, ActiveBoth, SharedAndWake, PullNone, 0x2710,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                              "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y3F,
--
                          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                              "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y41,

$ grep -rIi -A1 gpioint 5/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
              GpioInt (Level, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, _Y24,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullNone, 0x0000,
                      "\\_SB.GPIO", 0x00, ResourceConsumer, ,

On Wed, Oct 14, 2020 at 01:46:14PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/14/20 6:24 AM, Coiby Xu wrote:
>>On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>>
>>>
>>>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>>>Hi,
>>>>>>>
>>>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>>>Hi Hans and Linus,
>>>>>>>>
>>>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>>>malfunctioning.
>>>>>>>>
>>>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>>>when playing with the GPIO sysfs interface,
>>>>>>>>
>>>>>>>> - export pin130 which is used by the touchpad
>>>>>>>> - set the direction to be "out"
>>>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>>>
>>>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>>>manually trigger an interrupt now.)
>>>>>>>>
>>>>>>>>I wrote a C program that makes the GPIO controller quickly generate
>>>>>>>>some interrupts and then stops them by toggling pin#130's value at
>>>>>>>>a specified time interval, i.e., it sets the value to 0 first and
>>>>>>>>then, after some time, sets it back to 1. No interrupt fires unless
>>>>>>>>the time interval is > 120ms (~7Hz). This explains why we only see
>>>>>>>>7 interrupts per second on the GPIO controller's parent irq.
>>>>>>>
>>>>>>>That is a great find, well done.
>>>>>>>
>>>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>>>suggest?
>>>>>>>
>>>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>>>the glitch filter, you could try changing that code to disable the
>>>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>>>disabled are already there.
>>>>>>>
>>>>>>
>>>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>>>Linux users who have been troubled by this issue for months would
>>>>>>also feel great to know this small tweaking could bring their
>>>>>>touchpad back to life:) Thank you!
>>>>>
>>>>>That is good to hear, I'm glad that we have finally found a solution.
>>>>>
>>>>>>$ git diff
>>>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>>>                 break;
>>>>>>
>>>>>>I will learn more about the glitch filter and the implementation of
>>>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>>
>>>>>The glitch filter likely also has settings for how long a glitch
>>>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>>>only delay reporting by say 0.1ms and consider any pulse longer
>>>>>than 0.1s not a glitch, then having it enabled would be fine.
>>>>>
>>>>>I don't think we want some sort of quirk here to only disable the
>>>>>glitch filter for some touchpads. One approach might be to simply
>>>>>disable it completely for level type irqs.
>>>>>
>>>>>What we really need here is some input from AMD engineers with how
>>>>>this is all supposed to work.
>>>>>
>>>>>E.g. maybe the glitch-filter is setup by the BIOS and we should not
>>>>>touch it at all?
>>>>>
>>>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>>>would really help here ...
>>>>
>>>>So I've been digging through the history of the pinctrl-amd.c driver
>>>>and once upon a time it used to set a default debounce time of
>>>>2.75 ms.
>>>>
>>>>See the patch generated by doing:
>>>>
>>>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>>
>>>>In a linux kernel checkout.
>>>>
>>>>So it would be interesting to add a debugging printk to see
>>>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>>>GPIO.
>>>>
>>>>I guess that it might be all 1s (0xfffffffff) or some such which
>>>>might be a way to check that we should disable the glitch-filter
>>>>for this pin?
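
A debugging helper for this could look like the sketch below. It is
user-space C, not driver code. The field positions (timer count in
bits 0-3, TmrOutUnit in bit 4, TmrLarge in bit 7) and the two middle
timer units (244 usec and 15.625 msec) are assumptions based on a
reading of pinctrl-amd; only the 61 usec and 62.5 msec units are
visible in the code quoted in this thread.

```c
#include <stdint.h>

/* Assumed field layout of the per-pin register (see the note above). */
#define DB_TMR_OUT_MASK		0xFUL	/* timer count, bits 0-3 */
#define DB_TMR_OUT_UNIT_OFF	4
#define DB_TMR_LARGE_OFF	7

/*
 * Timer unit in usec, indexed by (TmrLarge << 1) | TmrOutUnit.
 * Entries 1 and 2 are assumptions; 61 and 62500 match the quoted code.
 */
static const uint32_t db_unit_usec[4] = { 61, 244, 15625, 62500 };

/* Decode the debounce timeout programmed in pin_reg, in usec. */
static uint32_t db_timeout_usec(uint32_t pin_reg)
{
	uint32_t count = pin_reg & DB_TMR_OUT_MASK;
	uint32_t unit = (pin_reg >> DB_TMR_OUT_UNIT_OFF) & 1;
	uint32_t large = (pin_reg >> DB_TMR_LARGE_OFF) & 1;

	return count * db_unit_usec[(large << 1) | unit];
}
```

If those assumptions hold, the value 0x500e8 reported for this pin
elsewhere in the thread decodes to 8 * 15.625 msec = 125 msec, which
would be close to the ~120ms threshold measured above.
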
>>>
>>>p.s.
>>>
>>>Or maybe we should simply stop touching all the glitch-filter
>>>related bits, in the same way as that old commit has already
>>>removed the code setting the timing of the filter ?
>>>
>>>At least it seems that forcing the filter to be on without
>>>sanitizing the de-bounce time is not a good idea.
>>>
>>Today I found an inconsistency in drivers/pinctrl/pinctrl-amd.c,
>>so there must be a bug. As far as I understand pinctrl-amd,
>>"pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
>>feature,
>>
>>static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>>         unsigned debounce)
>>{
>>     ...
>>     if (debounce) {
>>         ...
>>         if (debounce < 61) {
>>             pin_reg |= 1;
>>             pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>>         ...
>>         } else if (debounce < 1000000) {
>>             time = debounce / 62500;
>>             pin_reg |= time & DB_TMR_OUT_MASK;
>>             pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg |= BIT(DB_TMR_LARGE_OFF);
>>         } else {
>>             pin_reg &= ~DB_CNTRl_MASK;
>>             ret = -EINVAL;
>>         }
>>
>>     } else {
>>         ...
>>         pin_reg &= ~DB_CNTRl_MASK;
>>     }
>>     ...
>>}
>>
>>However in amd_gpio_irq_set_type, "pin_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
>>is used,
>>
>>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>{
>>
>>     ...
>>     case IRQ_TYPE_LEVEL_LOW:
>>         pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>>         pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>         pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>         pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>         pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>         irq_set_handler_locked(d, handle_level_irq);
>>         break;
>>     ...
>>}
>>
>>If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
>>flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
>>to mask out the debouncing filter and the bug lies in amd_gpio_irq_set_type.
>
>I'm afraid that that is not the case, the current code is correct,
>it clears bit 5 and 6 of the register which are the bits which control
>the debounce type.
>
>You mentioned in an earlier mail that the value of the register is
>0x500e8 before this function runs.
>
>If you drop the "<< DB_CNTRL_OFF" part then instead you are masking out
>bits 0 and 1 which are already 0, so the mask becomes a no-op.
>
>>Btw, can you explain what's the difference between glitch filter and
>>debouncing filter?
>
>There is no difference. The driver mixes the terms, but they both refer
>to the same thing. This is clearest in the defines for the DB_CNTRL bits
>(bits 5 and 6 of the register):
>
>#define DB_TYPE_NO_DEBOUNCE               0x0UL
>#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
>#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
>#define DB_TYPE_REMOVE_GLITCH             0x3UL
>
>Which is interesting because bits 5 and 6 are both 1 as set by the BIOS,
>so with your little hack to drop the "<< DB_CNTRL_OFF" you are in essence
>keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.
>
>So it seems that the problem is that the irq_set_type code changes
>the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
>glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apparently breaks
>things.
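
The bit arithmetic above can be checked with a small user-space
sketch. It uses the defines as described in this thread (a 2-bit
field at bits 5 and 6) and the register value 0x500e8 mentioned
above. The function names are made up for illustration, and only the
two debounce lines of the IRQ_TYPE_LEVEL_LOW case are modelled.

```c
#include <stdint.h>

/* Values as described in this thread: a 2-bit field at bits 5 and 6. */
#define DB_CNTRL_OFF			5
#define DB_CNTRl_MASK			0x3UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH	0x2UL

/* The two debounce lines of the IRQ_TYPE_LEVEL_LOW case as quoted above. */
static uint32_t set_type_current(uint32_t pin_reg)
{
	pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
	pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
	return pin_reg;
}

/* The experiment with "<< DB_CNTRL_OFF" dropped from the mask. */
static uint32_t set_type_unshifted_mask(uint32_t pin_reg)
{
	pin_reg &= ~DB_CNTRl_MASK;	/* clears bits 0 and 1 only */
	pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
	return pin_reg;
}

/* Extract the debounce type field (bits 5 and 6). */
static unsigned int db_type(uint32_t pin_reg)
{
	return (pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK;
}
```

For pin_reg = 0x500e8, db_type() gives 0x3 (DB_TYPE_REMOVE_GLITCH)
before and after set_type_unshifted_mask(), while set_type_current()
returns 0x500c8, i.e. DB_TYPE_PRESERVE_HIGH_GLITCH.
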
>
>To test this you could replace the:
>
>DB_TYPE_PRESERVE_HIGH_GLITCH
>
>bit in the case IRQ_TYPE_LEVEL_LOW path with:
>
>DB_TYPE_REMOVE_GLITCH
>
>Which I would expect to also fix your touchpad.
>
>If that is the case an interesting experiment would be to
>replace DB_TYPE_PRESERVE_HIGH_GLITCH with
>DB_TYPE_PRESERVE_LOW_GLITCH instead.
>
>I've never seen this kinda glitch/debounce filter where
>you can filter out only one type of level before, so
>I wonder if the code maybe simply got it wrong, also for
>a level type irq I really see no objection to just
>use DB_TYPE_REMOVE_GLITCH instead of the weird "half"
>filters.
>
>So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
>has been there from the very first commit of this driver,
>I wonder if it has been wrong all this time and should be
>inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).
>
>I think we may want to just play it safe though and simply
>switch to DB_TYPE_REMOVE_GLITCH as we already do for all
>edge types and when amd_gpio_set_config() gets called!
>
>Linus, what do you think about just switching to
>DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
>with all the other modes) and not mucking with this weird,
>undocumented "half" filter modes ?
>
>>Or can you point to some references? I've gained some
>>experience about how to configure the GPIO controller by studying the
>>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>>reference manual for baytrail either). I also tweaked the configuration
>>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>>and 3.9 msec without disabling the glitch filter could also save the
>>touchpad. But I need some knowledge to understand why this touchpad [1]
>>which also uses the buggy pinctrl-amd isn't affected.
>>
>>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>My guess would be that it uses edge types interrupts instead ?
>I have seen that quite a few times, even though it is weird
>to do that for i2c devices.
>
>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-10-26 22:54                                       ` Coiby Xu
  0 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-26 22:54 UTC (permalink / raw)
  To: Hans de Goede, Linus Walleij
  Cc: wang jun, open list:GPIO SUBSYSTEM, Shyam Sundar S K, Nehal Shah,
	linux-kernel-mentees

Hi Hans and Linus,

Would you interpret a debounce timeout of 0x0000 in the GPIO
Interrupt Connection Resource Descriptor as disabling the debounce
filter?

GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}

I'm not sure whether Windows' implementation is the de facto standard
here, as it is for i2c-hid. But if we conform to the ACPI spec and
treat a debounce timeout of 0x0000 as disabling the debounce filter,
then we can fix this touchpad issue, and potentially some related
issues, by adding support for configuring the debounce timeout in
drivers/gpio/gpiolib-acpi.c and removing all debounce filter
configuration from amd_gpio_irq_set_type in
drivers/pinctrl/pinctrl-amd.c. What do you think?

As supporting evidence, I collected five DSDT tables while
investigating this issue. Each of the five tables has a GpioInt that
specifies a non-zero debounce timeout for an edge type irq, and every
level type irq has its debounce timeout set to 0x0000.


GpioInt in the five collected DSDT tables
=========================================

$ grep -rIi -A1 gpioint 1/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
              GpioInt (Level, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, _Y24,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Edge, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Edge, ActiveHigh, SharedAndWake, PullNone, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,

$ grep -rIi -A1 gpioint 2/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullUp, 0x0000,
                          "\\_SB.GPIO", 0x00, ResourceConsumer, ,


$ grep -rIi -A1 gpioint 3/dsdt.dsl
                      GpioInt (Edge, ActiveBoth, SharedAndWake, PullNone, 0x2710,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y5B,
--
                      GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y5D,

$ grep -rIi -A1 gpioint 4/dsdt.dsl
                      GpioInt (Edge, ActiveBoth, SharedAndWake, PullNone, 0x2710,
                          "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                  GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                      "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, ,
--
                          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                              "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y3F,
--
                          GpioInt (Level, ActiveLow, ExclusiveAndWake, PullDefault, 0x0000,
                              "\\_SB.PCI0.GPI0", 0x00, ResourceConsumer, _Y41,

$ grep -rIi -A1 gpioint 5/dsdt.dsl
              GpioInt (Edge, ActiveBoth, SharedAndWake, PullUp, 0x0BB8,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, ,
--
              GpioInt (Level, ActiveLow, ExclusiveAndWake, PullNone, 0x0000,
                  "\\_SB.GPIO", 0x00, ResourceConsumer, _Y24,
--
                  GpioInt (Level, ActiveLow, Exclusive, PullNone, 0x0000,
                      "\\_SB.GPIO", 0x00, ResourceConsumer, ,

On Wed, Oct 14, 2020 at 01:46:14PM +0200, Hans de Goede wrote:
>Hi,
>
>On 10/14/20 6:24 AM, Coiby Xu wrote:
>>On Tue, Oct 06, 2020 at 11:29:40AM +0200, Hans de Goede wrote:
>>>
>>>
>>>On 10/6/20 11:28 AM, Hans de Goede wrote:
>>>>Hi,
>>>>
>>>>On 10/6/20 10:55 AM, Hans de Goede wrote:
>>>>>Hi,
>>>>>
>>>>>On 10/6/20 10:31 AM, Coiby Xu wrote:
>>>>>>On Tue, Oct 06, 2020 at 08:28:40AM +0200, Hans de Goede wrote:
>>>>>>>Hi,
>>>>>>>
>>>>>>>On 10/6/20 6:49 AM, Coiby Xu wrote:
>>>>>>>>Hi Hans and Linus,
>>>>>>>>
>>>>>>>>I've found the direct evidence proving the GPIO interrupt controller is
>>>>>>>>malfunctioning.
>>>>>>>>
>>>>>>>>I've found a way to let the GPIO chip trigger an interrupt by accident
>>>>>>>>when playing with the GPIO sysfs interface,
>>>>>>>>
>>>>>>>> - export pin130 which is used by the touchpad
>>>>>>>> - set the direction to be "out"
>>>>>>>> - `echo 0 > value` will trigger the GPIO controller's parent irq and
>>>>>>>>   "echo 1 > value" will make it stop firing
>>>>>>>>
>>>>>>>>(I'm not sure if this is yet another bug of the GPIO chip. Anyway I can
>>>>>>>>manually trigger an interrupt now.)
>>>>>>>>
>>>>>>>>I wrote a C program that makes the GPIO controller quickly generate
>>>>>>>>some interrupts and then stops them by toggling pin#130's value at
>>>>>>>>a specified time interval, i.e., it sets the value to 0 first and
>>>>>>>>then, after some time, sets it back to 1. No interrupt fires unless
>>>>>>>>the time interval is > 120ms (~7Hz). This explains why we only see
>>>>>>>>7 interrupts per second on the GPIO controller's parent irq.
>>>>>>>
>>>>>>>That is a great find, well done.
>>>>>>>
>>>>>>>>My hypothesis is the GPIO doesn't have proper power setting so it stays
>>>>>>>>in an idle state or its clock frequency is too low by default thus not
>>>>>>>>quick enough to read interrupt input. Then pinctrl-amd must miss some
>>>>>>>>code to configure the chip and I need a hardware reference manual of this
>>>>>>>>GPIO chip (HID: AMDI0030) or reverse-engineer the driver for Windows
>>>>>>>>since I couldn't find a copy of reference manual online? What would you
>>>>>>>>suggest?
>>>>>>>
>>>>>>>This sounds like it might have something to do with the glitch filter.
>>>>>>>The code in pinctrl-amd.c to setup the trigger-type also configures
>>>>>>>the glitch filter, you could try changing that code to disable the
>>>>>>>glitch-filter. The defines for setting the glitch-filter bits to
>>>>>>>disabled are already there.
>>>>>>>
>>>>>>
>>>>>>Disabling the glitch filter works like a charm! Other enthusiastic
>>>>>>Linux users who have been troubled by this issue for months would
>>>>>>also feel great to know this small tweaking could bring their
>>>>>>touchpad back to life:) Thank you!
>>>>>
>>>>>That is good to hear, I'm glad that we have finally found a solution.
>>>>>
>>>>>>$ git diff
>>>>>>diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
>>>>>>index 9a760f5cd7ed..e786d779d6c8 100644
>>>>>>--- a/drivers/pinctrl/pinctrl-amd.c
>>>>>>+++ b/drivers/pinctrl/pinctrl-amd.c
>>>>>>@@ -463,7 +463,7 @@ static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>>>>>                 pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>>>>>                 pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>>>>>                 pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>>>>>-               pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>>>>>+               /** pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF; */
>>>>>>                 irq_set_handler_locked(d, handle_level_irq);
>>>>>>                 break;
>>>>>>
>>>>>>I will learn more about the glitch filter and the implementation of
>>>>>>pinctrl and see if I can disable glitch filter only for this touchpad.
>>>>>
>>>>>The glitch filter likely also has settings for how long a glitch
>>>>>lasts, which apparently goes all the way up to 120ms. If it would
>>>>>only delay reporting by, say, 0.1ms and consider any pulse longer
>>>>>than 0.1ms not a glitch, then having it enabled would be fine.
>>>>>
>>>>>I don't think we want some sort of quirk here to only disable the
>>>>>glitch filter for some touchpads. One approach might be to simply
>>>>>disable it completely for level type irqs.
>>>>>
>>>>>What we really need here is some input from AMD engineers with how
>>>>>this is all supposed to work.
>>>>>
>>>>>E.g. maybe the glitch-filter is set up by the BIOS and we should not
>>>>>touch it at all?
>>>>>
>>>>>Or maybe instead of DB_TYPE_PRESERVE_HIGH_GLITCH low level interrupts
>>>>>should use DB_TYPE_PRESERVE_LOW_GLITCH ?   Some docs for the hw
>>>>>would really help here ...
>>>>
>>>>So I've been digging through the history of the pinctrl-amd.c driver
>>>>and once upon a time it used to set a default debounce time of
>>>>2.75 ms.
>>>>
>>>>See the patch generated by doing:
>>>>
>>>>git format-patch 8cf4345575a416e6856a6856ac6eaa31ad883126~..8cf4345575a416e6856a6856ac6eaa31ad883126
>>>>
>>>>In a linux kernel checkout.
>>>>
>>>>So it would be interesting to add a debugging printk to see
>>>>what the value of pin_reg & DB_TMR_OUT_MASK is for the troublesome
>>>>GPIO.
>>>>
>>>>I guess that it might be all 1s (0xfffffffff) or some such which
>>>>might be a way to check that we should disable the glitch-filter
>>>>for this pin?
>>>
>>>p.s.
>>>
>>>Or maybe we should simply stop touching all the glitch-filter
>>>related bits, in the same way as that old commit has already
>>>removed the code setting the timing of the filter ?
>>>
>>>At least it seems that forcing the filter to be on without
>>>sanitizing the de-bounce time is not a good idea.
>>>
>>Today I find an inconsistency in drivers/pinctrl/pinctrl-amd.c
>>so there must be a bug. As far as I can understand pinctrl-amd,
>>"pin_reg & ~DB_CNTRl_MASK" is used to mask out the debouncing
>>feature,
>>
>>static int amd_gpio_set_debounce(struct gpio_chip *gc, unsigned offset,
>>         unsigned debounce)
>>{
>>     ...
>>     if (debounce) {
>>         ...
>>         if (debounce < 61) {
>>             pin_reg |= 1;
>>             pin_reg &= ~BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg &= ~BIT(DB_TMR_LARGE_OFF);
>>         ...
>>         } else if (debounce < 1000000) {
>>             time = debounce / 62500;
>>             pin_reg |= time & DB_TMR_OUT_MASK;
>>             pin_reg |= BIT(DB_TMR_OUT_UNIT_OFF);
>>             pin_reg |= BIT(DB_TMR_LARGE_OFF);
>>         } else {
>>             pin_reg &= ~DB_CNTRl_MASK;
>>             ret = -EINVAL;
>>         }
>>
>>     } else {
>>         ...
>>         pin_reg &= ~DB_CNTRl_MASK;
>>     }
>>     ...
>>}
>>
>>However in amd_gpio_irq_set_type, "pin_reg & ~(DB_CNTRl_MASK << DB_CNTRL_OFF)"
>>is used,
>>
>>static int amd_gpio_irq_set_type(struct irq_data *d, unsigned int type)
>>{
>>
>>     ...
>>     case IRQ_TYPE_LEVEL_LOW:
>>         pin_reg |= LEVEL_TRIGGER << LEVEL_TRIG_OFF;
>>         pin_reg &= ~(ACTIVE_LEVEL_MASK << ACTIVE_LEVEL_OFF);
>>         pin_reg |= ACTIVE_LOW << ACTIVE_LEVEL_OFF;
>>         pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);
>>         pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
>>         irq_set_handler_locked(d, handle_level_irq);
>>         break;
>>     ...
>>}
>>
>>If "pin_reg & ~DB_CNTRl_MASK" is used instead, the touchpad will work
>>flawlessly. So I believe "pin_reg & ~DB_CNTRl_MASK" is the correct way
>>to mask out the debouncing filter and the bug lies in amd_gpio_irq_set_type.
>
>I'm afraid that that is not the case, the current code is correct,
>it clears bit 5 and 6 of the register which are the bits which control
>the debounce type.
>
>You mentioned in an earlier mail that the value of the register is
>0x500e8 before this function runs.
>
>If you drop the "<< DB_CNTRL_OFF" part then instead you are masking out
>bits 0 and 1 which are already 0, so the mask becomes a no-op.
>
>>Btw, can you explain what's the difference between glitch filter and
>>debouncing filter?
>
>There is no difference. The driver mixes the terms, but they both refer
>to the same thing. This is most clear in the defines for the DB_CNTRL bits
>(bits 5 and 6 of the register):
>
>#define DB_TYPE_NO_DEBOUNCE               0x0UL
>#define DB_TYPE_PRESERVE_LOW_GLITCH       0x1UL
>#define DB_TYPE_PRESERVE_HIGH_GLITCH      0x2UL
>#define DB_TYPE_REMOVE_GLITCH             0x3UL
>
>Which is interesting because bits 5 and 6 are both 1 as set by the BIOS,
>so with your little hack to drop the "<< DB_CNTRL_OFF" you are in essence
>keeping bits 5 and 6 as DB_TYPE_REMOVE_GLITCH.
>
>So it seems that the problem is that the irq_set_type code changes
>the glitch filter type from DB_TYPE_REMOVE_GLITCH (filter out all
>glitches) to DB_TYPE_PRESERVE_HIGH_GLITCH, which apparently breaks
>things.
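
The mask arithmetic described above can be checked outside the kernel.
What follows is only a minimal sketch in plain C: DB_CNTRL_OFF = 5 and
DB_CNTRl_MASK = 0x3 are assumed from the "bits 5 and 6" statement rather
than copied from the driver header, and the function names are
hypothetical. It applies both variants of the masking to the 0x500e8
register value mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from "bits 5 and 6" in the thread, not copied from the driver. */
#define DB_CNTRL_OFF                  5
#define DB_CNTRl_MASK                 0x3UL
#define DB_TYPE_PRESERVE_HIGH_GLITCH  0x2UL
#define DB_TYPE_REMOVE_GLITCH         0x3UL

/* Masking as quoted from amd_gpio_irq_set_type for IRQ_TYPE_LEVEL_LOW. */
uint32_t set_type_shifted(uint32_t pin_reg)
{
    pin_reg &= ~(DB_CNTRl_MASK << DB_CNTRL_OFF);   /* clears bits 5 and 6 */
    pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
    return pin_reg;
}

/* The experiment: the same code with "<< DB_CNTRL_OFF" dropped from the mask. */
uint32_t set_type_unshifted(uint32_t pin_reg)
{
    pin_reg &= ~DB_CNTRl_MASK;                     /* clears bits 0 and 1 only */
    pin_reg |= DB_TYPE_PRESERVE_HIGH_GLITCH << DB_CNTRL_OFF;
    return pin_reg;
}

/* Extract the two-bit debounce type field (bits 5 and 6). */
uint32_t debounce_type(uint32_t pin_reg)
{
    return (pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK;
}

/* Self-check against the register value quoted in the thread. */
void check_thread_value(void)
{
    /* Shifted mask: the field really changes to PRESERVE_HIGH_GLITCH. */
    assert(debounce_type(set_type_shifted(0x500e8)) == DB_TYPE_PRESERVE_HIGH_GLITCH);
    /* Unshifted mask: the BIOS value REMOVE_GLITCH survives untouched. */
    assert(debounce_type(set_type_unshifted(0x500e8)) == DB_TYPE_REMOVE_GLITCH);
}
```

Under these assumptions the shifted variant turns 0x500e8 into 0x500c8,
while the unshifted variant leaves 0x500e8 as it is, which is why the
hack behaves as if the filter type had never been changed.
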
>
>To test this you could replace the:
>
>DB_TYPE_PRESERVE_HIGH_GLITCH
>
>bit in the case IRQ_TYPE_LEVEL_LOW path with:
>
>DB_TYPE_REMOVE_GLITCH
>
>Which I would expect to also fix your touchpad.
>
>If that is the case an interesting experiment would be to
>replace DB_TYPE_PRESERVE_HIGH_GLITCH with
>DB_TYPE_PRESERVE_LOW_GLITCH instead.
>
>I've never seen this kinda glitch/debounce filter where
>you can filter out only one type of level before, so
>I wonder if the code maybe simply got it wrong, also for
>a level type irq I really see no objection to just
>use DB_TYPE_REMOVE_GLITCH instead of the weird "half"
>filters.
>
>So I just ran a git blame and the DB_TYPE_PRESERVE_HIGH_GLITCH
>has been there from the very first commit of this driver,
>I wonder if it has been wrong all this time and should be
>inverted (so DB_TYPE_PRESERVE_LOW_GLITCH instead).
>
>I think we may want to just play it safe though and simply
>switch to DB_TYPE_REMOVE_GLITCH as we already do for all
>edge types and when amd_gpio_set_config() gets called!
>
>Linus, what do you think about just switching to
>DB_TYPE_REMOVE_GLITCH for level type irqs (unifying them
>with all the other modes) and not mucking with this weird,
>undocumented "half" filter modes ?
>
>>Or can you point to some references? I've gained some
>>experience about how to configure the GPIO controller by studying the
>>code of pinctrl-amd and pinctrl-baytrail (I can't find the hardware
>>reference manual for baytrail either). I also tweaked the configuration
>>in pinctrl-amd, for example, setting the debounce timeout to 976 usec
>>and 3.9 msec without disabling the glitch filter could also save the
>>touchpad. But I need some knowledge to understand why this touchpad [1]
>>which also uses the buggy pinctrl-amd isn't affected.
>>
>>[1] https://github.com/Syniurge/i2c-amd-mp2/issues/11#issuecomment-707427095
>
>My guess would be that it uses edge types interrupts instead ?
>I have seen that quite a few times, even though it is weird
>to do that for i2c devices.
>
>Regards,
>
>Hans
>

--
Best regards,
Coiby
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-26 22:54                                       ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-27  9:52                                         ` Andy Shevchenko
  -1 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-10-27  9:52 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Hans de Goede, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 27, 2020 at 2:07 AM Coiby Xu <coiby.xu@gmail.com> wrote:
>
> Hi Hans and Linus,
>
> Will you interpret the 0x0000 value for debounce timeout in GPIO
> Interrupt Connection Resource Descriptor as disabling debouncing
> filter?
>
> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}

According to the spec

DebounceTimeout is an optional argument specifying the debounce wait
time, in hundredths of
milliseconds. The bit field name _DBT is automatically created to
refer to this portion of the resource
descriptor.

I interpret this as 0 == no debounce (or a minimum that hardware has
if there is no possibility to disable).

> I'm not sure if Windows' implementation is the de facto standard like
> i2c-hid. But if we are going to conform to the ACPI specs and we would
> regard 0x0000 debounce timeout as disabling debouncing filter, then we
> can fix this touchpad issue and potentially some related issues by
> implementing the feature of supporting configuring debounce timeout in
> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
> What do you think?
>
> One piece of supporting evidence: I've collected five DSDT tables while
> investigating this issue. All 5 DSDT tables have a GpioInt specifying
> a non-zero debounce timeout value for the edge type irq, and for all
> the level type irqs the debounce timeout is set to 0x0000.

For future mails: please do not top-post.
And please trim the large amount of unrelated quoted lines from the reply.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-26 22:54                                       ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-10-27 10:09                                         ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-27 10:09 UTC (permalink / raw)
  To: Coiby Xu, Linus Walleij
  Cc: open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

Hi,

On 10/26/20 11:54 PM, Coiby Xu wrote:
> Hi Hans and Linus,
> 
> Will you interpret the 0x0000 value for debounce timeout in GPIO
> Interrupt Connection Resource Descriptor as disabling debouncing
> filter?
> 
> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
> 
> I'm not sure if Windows' implementation is the de facto standard like
> i2c-hid. But if we are going to conform to the ACPI specs and we would
> regard 0x0000 debounce timeout as disabling debouncing filter, then we
> can fix this touchpad issue and potentially some related issues by
> implementing the feature of supporting configuring debounce timeout in
> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
> What do you think?
> 
> One piece of supporting evidence: I've collected five DSDT tables while
> investigating this issue. All 5 DSDT tables have a GpioInt specifying
> a non-zero debounce timeout value for the edge type irq, and for all
> the level type irqs the debounce timeout is set to 0x0000.

That is a very interesting observation and this matches with my
instincts which say that we should just disable the debounce filter
for level triggered interrupts in pinctrl-amd.c

Yes that is a bit of a shortcut vs reading the value from the ACPI
table, but I'm not sure that 0 always means disabled.

Specifically the ACPI 6.2 spec also has a notion of pinconf settings
and the docs on "PinConfig()"  say:

Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
property is specified by multiple descriptors for the same pins, the order in which these properties
are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
timeout, 0 should be used for GpioIo/GpioInt.*

So that suggests that a value of 0 does not necessarily mean "disabled" but
it means use a default, or possibly get the value from somewhere else such
as from a ACPI PinConfig description (if present).

So I see 2 ways to move forward with this:

1. Just disable the debounce filter for level type IRQs; or
2. Add a helper to sanitize the debounce pulse-duration setting and
   call that when setting the IRQ type.
   This helper would read the setting, check that it is not crazy long for
   an IRQ-line (let's say anything above 1 ms is crazy long) and, if it
   is crazy long, overwrite it with a saner value.

2. is a bit tricky, because if the IRQ line comes from a chip then
obviously max 1ms debouncing to catch electrical interference should be
fine. But sometimes cheap buttons for things like volume up/down on tablets
are directly connected to GPIOs and then we may want longer debouncing...

So if we do 2. we may want to limit it to only level type IRQs too.
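
To make option 2 more concrete, here is a rough userspace sketch in C of
what such a helper could look like. It is not driver code. The bit
positions (timer value in bits 0-3, TmrOutUnit in bit 4, filter type in
bits 5-6, TmrLarge in bit 7) and the timer units of 61 us, 244 us,
15.625 ms and 62.5 ms are assumptions based on my reading of the driver,
and the function names as well as the 915 us replacement value are
purely hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed register layout; treat these values as illustrative only. */
#define DB_TMR_OUT_MASK       0xFu
#define DB_TMR_OUT_UNIT_OFF   4
#define DB_CNTRL_OFF          5
#define DB_CNTRl_MASK         0x3u
#define DB_TMR_LARGE_OFF      7

#define MAX_IRQ_DEBOUNCE_USEC 1000u   /* "anything above 1 ms is crazy long" */

/* Decode the programmed debounce time in microseconds (0 if the filter is off). */
uint32_t debounce_usec(uint32_t pin_reg)
{
    /* Assumed timer unit in us, indexed by (TmrLarge << 1) | TmrOutUnit. */
    static const uint32_t unit_usec[4] = { 61, 244, 15625, 62500 };
    uint32_t large = (pin_reg >> DB_TMR_LARGE_OFF) & 1u;
    uint32_t unit = (pin_reg >> DB_TMR_OUT_UNIT_OFF) & 1u;

    if (((pin_reg >> DB_CNTRL_OFF) & DB_CNTRl_MASK) == 0)
        return 0;
    return (pin_reg & DB_TMR_OUT_MASK) * unit_usec[(large << 1) | unit];
}

/* Hypothetical sanitizer, limited to level type IRQs as suggested above. */
uint32_t sanitize_level_irq_debounce(uint32_t pin_reg, bool level_irq)
{
    if (!level_irq || debounce_usec(pin_reg) <= MAX_IRQ_DEBOUNCE_USEC)
        return pin_reg;

    /* Keep the filter type, but reprogram the timer to 15 * 61 us = 915 us. */
    pin_reg &= ~(DB_TMR_OUT_MASK | (1u << DB_TMR_OUT_UNIT_OFF) |
                 (1u << DB_TMR_LARGE_OFF));
    pin_reg |= 0xFu;
    return pin_reg;
}

/* Self-check with the 0x500e8 value reported earlier in the thread. */
void check_sanitizer(void)
{
    assert(debounce_usec(0x500e8) == 125000);
    assert(debounce_usec(sanitize_level_irq_debounce(0x500e8, true)) == 915);
}
```

If the assumed layout is right, 0x500e8 decodes to 8 * 15.625 ms = 125 ms,
which is close to the ~120 ms threshold Coiby measured, and the helper
would rewrite the register to 0x5006f for a level type IRQ while leaving
edge type IRQs alone.
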

Note I have contacted AMD about this and asked them for some input on this,
ideally they can tell us how exactly we should program the debounce filter
and based on which data we should do that.

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 10:09                                         ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-27 15:13                                           ` Andy Shevchenko
  -1 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-10-27 15:13 UTC (permalink / raw)
  To: Hans de Goede, Mika Westerberg
  Cc: Coiby Xu, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 27, 2020 at 4:31 PM Hans de Goede <hdegoede@redhat.com> wrote:
> On 10/26/20 11:54 PM, Coiby Xu wrote:
> > Hi Hans and Linus,
> >
> > Will you interpret the 0x0000 value for debounce timeout in GPIO
> > Interrupt Connection Resource Descriptor as disabling debouncing
> > filter?
> >
> > GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
> > ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
> >
> > I'm not sure if Windows' implementation is the de facto standard like
> > i2c-hid. But if we are going to conform to the ACPI specs and we would
> > regard 0x0000 debounce timeout as disabling debouncing filter, then we
> > can fix this touchpad issue and potentially some related issues by
> > implementing the feature of supporting configuring debounce timeout in
> > drivers/gpio/gpiolib-acpi.c and removing all debounce filter
> > configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
> > What do you think?
> >
> > One piece of supporting evidence: I've collected five DSDT tables while
> > investigating this issue. All 5 DSDT tables have a GpioInt specifying
> > a non-zero debounce timeout value for the edge type irq, and for all
> > the level type irqs the debounce timeout is set to 0x0000.
>
> That is a very interesting observation and this matches with my
> instincts which say that we should just disable the debounce filter
> for level triggered interrupts in pinctrl-amd.c
>
> Yes that is a bit of a shortcut vs reading the value from the ACPI
> table, but I'm not sure that 0 always means disabled.
>
> Specifically the ACPI 6.2 spec also has a notion of pinconf settings
> and the docs on "PinConfig()"  say:
>
> Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
> PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
> property is specified by multiple descriptors for the same pins, the order in which these properties
> are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
> default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
> PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
> timeout, 0 should be used for GpioIo/GpioInt.*
>
> So that suggests that a value of 0 does not necessarily mean "disabled" but
> it means use a default, or possibly get the value from somewhere else such
> as from a ACPI PinConfig description (if present).

Nope, it was added to get rid of the ambiguity when both Gpio*() and
PinConfig() are given.
So, 0 means default *if and only if* PinConfig() is present.

I.o.w. the OS layers should do this:

 - if Gpio*() provides Debounce != 0, we use it, otherwise
 - if PinConfig() is present for this pin with a debounce set, use it, otherwise
 - debounce is disabled.

The Linux kernel currently lacks an implementation of the middle entry,
so we fall through to the last one, i.e. disable debounce.
But it should be rather done in gpiolib-acpi.c.
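
The order above could be sketched roughly as follows. This is plain C for
illustration, not gpiolib-acpi.c code: the structure and function names
are hypothetical, and the PinConfig() value is assumed to have been
converted to microseconds already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what the firmware says about one pin. */
struct pin_debounce_info {
    uint16_t gpio_dbt;        /* DebounceTimeout from Gpio*(), in 1/100 ms */
    bool     has_pinconfig;   /* PinConfig() with a debounce entry exists */
    uint32_t pinconfig_usec;  /* that entry, assumed to be in microseconds */
};

/* Returns the debounce time in microseconds; 0 means debounce is disabled. */
uint32_t resolve_debounce_usec(const struct pin_debounce_info *info)
{
    if (info->gpio_dbt != 0)
        return (uint32_t)info->gpio_dbt * 10u;    /* 1/100 ms == 10 us */
    if (info->has_pinconfig && info->pinconfig_usec != 0)
        return info->pinconfig_usec;
    return 0;
}

/* Self-check: the touchpad's GpioInt has 0x0000 and no PinConfig(). */
void check_touchpad_case(void)
{
    struct pin_debounce_info touchpad = { 0x0000, false, 0 };

    assert(resolve_debounce_usec(&touchpad) == 0);
}
```

With the touchpad's GpioInt (DebounceTimeout 0x0000, no PinConfig()) this
resolves to 0, i.e. debounce disabled.
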

Hope this helps.

I Cc'ed this to Mika as co-author of that part of specification, he
may correct me if I'm wrong.

P.S. Does RedHat have a representative in ASWG? If any ambiguity is
still present, feel free to propose an ECR (if I recall the abbreviation correctly)
to ASWG.

> So I see 2 ways to move forward with this:
>
> 1. Just disable the debounce filter for level type IRQs; or
> 2. Add a helper to sanitize the debounce pulse-duration setting and
>    call that when setting the IRQ type.
>    This helper would read the setting, check that it is not crazy long for
>    an IRQ-line (let's say anything above 1 ms is crazy long) and, if it
>    is crazy long, overwrite it with a saner value.
>
> 2. is a bit tricky, because if the IRQ line comes from a chip then
> obviously max 1ms debouncing to catch electrical interference should be
> fine. But sometimes cheap buttons for things like volume up/down on tablets
> are directly connected to GPIOs and then we may want longer debouncing...
>
> So if we do 2. we may want to limit it to only level type IRQs too.
>
> Note I have contacted AMD about this and asked them for some input on this,
> ideally they can tell us how exactly we should program the debounce filter
> and based on which data we should do that.


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 15:13                                           ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-10-27 16:00                                             ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-10-27 16:00 UTC (permalink / raw)
  To: Andy Shevchenko, Mika Westerberg
  Cc: Coiby Xu, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

Hi,

On 10/27/20 4:13 PM, Andy Shevchenko wrote:
> On Tue, Oct 27, 2020 at 4:31 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> On 10/26/20 11:54 PM, Coiby Xu wrote:
>>> Hi Hans and Linus,
>>>
>>> Will you interpret the 0x0000 value for debounce timeout in GPIO
>>> Interrupt Connection Resource Descriptor as disabling debouncing
>>> filter?
>>>
>>> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
>>> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
>>>
>>> I'm not sure if Windows' implementation is the de facto standard like
>>> i2c-hid. But if we are going to conform to the ACPI specs and we would
>>> regard 0x0000 debounce timeout as disabling debouncing filter, then we
>>> can fix this touchpad issue and potentially some related issues by
>>> implementing the feature of supporting configuring debounce timeout in
>>> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
>>> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
>>> What do you think?
>>>
>>> A favorable piece of evidence: I've collected five DSDT tables when
>>> investigating this issue. All 5 DSDT tables have a GpioInt specifying
>>> a non-zero debounce timeout value for the edge type irq and for all
>>> the level type irq, the debounce timeout is set to 0x0000.
>>
>> That is a very interesting observation and this matches with my
>> instincts which say that we should just disable the debounce filter
>> for level triggered interrupts in pinctrl-amd.c
>>
>> Yes that is a bit of a shortcut vs reading the value from the ACPI
>> table, but I'm not sure that 0 always means disabled.
>>
>> Specifically the ACPI 6.2 spec also has a notion of pinconf settings
>> and the docs on "PinConfig()"  say:
>>
>> Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
>> PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
>> property is specified by multiple descriptors for the same pins, the order in which these properties
>> are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
>> default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
>> PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
>> timeout, 0 should be used for GpioIo/GpioInt.*
>>
>> So that suggests that a value of 0 does not necessarily mean "disabled" but
>> it means use a default, or possibly get the value from somewhere else such
>> as from an ACPI PinConfig description (if present).
> 
> Nope, it was added to get rid of the ambiguity when both Gpio*() and
> PinConfig() are given.
> So, 0 means default *if and only if* PinConfig() is present.
> 
> I.o.w. the OS layers should do this:
> 
>  - if Gpio*() provides Debounce != 0, we use it, otherwise
>  - if PinConfig() is present for this pin with a debounce set, use it, otherwise
>  - debounce is disabled.
> 
> Now, the Linux kernel is missing an implementation of the middle entry,
> hence we fall through to the last one, i.e. disable debounce.
> But that should rather be done in gpiolib-acpi.c.
> 
> Hope this helps.
> 
> I Cc'ed this to Mika as co-author of that part of specification, he
> may correct me if I'm wrong.

I see, so then the right thing to do for the bug which we are seeing
on some AMD platforms would be to honor the debounce setting, I guess?

Can you and/or Mika write a patch(set) for this?

> P.S. Does Red Hat have a representative in ASWG?

I think so yes, but mainly focussed on server related things I guess...

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 16:00                                             ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-10-27 16:09                                               ` Andy Shevchenko
  -1 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-10-27 16:09 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Mika Westerberg, Coiby Xu, Linus Walleij,
	open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

On Tue, Oct 27, 2020 at 6:01 PM Hans de Goede <hdegoede@redhat.com> wrote:
> On 10/27/20 4:13 PM, Andy Shevchenko wrote:
> > On Tue, Oct 27, 2020 at 4:31 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >> On 10/26/20 11:54 PM, Coiby Xu wrote:
> >>> Hi Hans and Linus,
> >>>
> >>> Will you interpret the 0x0000 value for debounce timeout in GPIO
> >>> Interrupt Connection Resource Descriptor as disabling debouncing
> >>> filter?
> >>>
> >>> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
> >>> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
> >>>
> >>> I'm not sure if Windows' implementation is the de facto standard like
> >>> i2c-hid. But if we are going to conform to the ACPI specs and we would
> >>> regard 0x0000 debounce timeout as disabling debouncing filter, then we
> >>> can fix this touchpad issue and potentially some related issues by
> >>> implementing the feature of supporting configuring debounce timeout in
> >>> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
> >>> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
> >>> What do you think?
> >>>
> >>> A favorable piece of evidence: I've collected five DSDT tables when
> >>> investigating this issue. All 5 DSDT tables have a GpioInt specifying
> >>> a non-zero debounce timeout value for the edge type irq and for all
> >>> the level type irq, the debounce timeout is set to 0x0000.
> >>
> >> That is a very interesting observation and this matches with my
> >> instincts which say that we should just disable the debounce filter
> >> for level triggered interrupts in pinctrl-amd.c
> >>
> >> Yes that is a bit of a shortcut vs reading the value from the ACPI
> >> table, but I'm not sure that 0 always means disabled.
> >>
> >> Specifically the ACPI 6.2 spec also has a notion of pinconf settings
> >> and the docs on "PinConfig()"  say:
> >>
> >> Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
> >> PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
> >> property is specified by multiple descriptors for the same pins, the order in which these properties
> >> are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
> >> default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
> >> PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
> >> timeout, 0 should be used for GpioIo/GpioInt.*
> >>
> >> So that suggests that a value of 0 does not necessarily mean "disabled" but
> >> it means use a default, or possibly get the value from somewhere else such
> >> as from an ACPI PinConfig description (if present).
> >
> > Nope, it was added to get rid of the ambiguity when both Gpio*() and
> > PinConfig() are given.
> > So, 0 means default *if and only if* PinConfig() is present.
> >
> > I.o.w. the OS layers should do this:
> >
> >  - if Gpio*() provides Debounce != 0, we use it, otherwise
> >  - if PinConfig() is present for this pin with a debounce set, use it, otherwise
> >  - debounce is disabled.
> >
> > Now, the Linux kernel is missing an implementation of the middle entry,
> > hence we fall through to the last one, i.e. disable debounce.
> > But that should rather be done in gpiolib-acpi.c.
> >
> > Hope this helps.
> >
> > I Cc'ed this to Mika as co-author of that part of specification, he
> > may correct me if I'm wrong.
>
> I see, so then the right thing to do for the bug which we are seeing
> on some AMD platforms would be to honor the debounce setting, I guess?
>
> Can you and/or Mika write a patch(set) for this?

I will look at it, but meanwhile I would postpone it until I have
Mika's Ack that my understanding and the proposed course of action
are correct.

> > P.S. Does Red Hat have a representative in ASWG?
>
> I think so yes, but mainly focussed on server related things I guess...


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 16:09                                               ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-10-29  8:04                                                 ` Mika Westerberg
  -1 siblings, 0 replies; 84+ messages in thread
From: Mika Westerberg @ 2020-10-29  8:04 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Coiby Xu, Linus Walleij, open list:GPIO SUBSYSTEM,
	wang jun, Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 27, 2020 at 06:09:49PM +0200, Andy Shevchenko wrote:
> On Tue, Oct 27, 2020 at 6:01 PM Hans de Goede <hdegoede@redhat.com> wrote:
> > On 10/27/20 4:13 PM, Andy Shevchenko wrote:
> > > On Tue, Oct 27, 2020 at 4:31 PM Hans de Goede <hdegoede@redhat.com> wrote:
> > >> On 10/26/20 11:54 PM, Coiby Xu wrote:
> > >>> Hi Hans and Linus,
> > >>>
> > >>> Will you interpret the 0x0000 value for debounce timeout in GPIO
> > >>> Interrupt Connection Resource Descriptor as disabling debouncing
> > >>> filter?
> > >>>
> > >>> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
> > >>> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
> > >>>
> > >>> I'm not sure if Windows' implementation is the de facto standard like
> > >>> i2c-hid. But if we are going to conform to the ACPI specs and we would
> > >>> regard 0x0000 debounce timeout as disabling debouncing filter, then we
> > >>> can fix this touchpad issue and potentially some related issues by
> > >>> implementing the feature of supporting configuring debounce timeout in
> > >>> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
> > >>> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
> > >>> What do you think?
> > >>>
> > >>> A favorable piece of evidence: I've collected five DSDT tables when
> > >>> investigating this issue. All 5 DSDT tables have a GpioInt specifying
> > >>> a non-zero debounce timeout value for the edge type irq and for all
> > >>> the level type irq, the debounce timeout is set to 0x0000.
> > >>
> > >> That is a very interesting observation and this matches with my
> > >> instincts which say that we should just disable the debounce filter
> > >> for level triggered interrupts in pinctrl-amd.c
> > >>
> > >> Yes that is a bit of a shortcut vs reading the value from the ACPI
> > >> table, but I'm not sure that 0 always means disabled.
> > >>
> > >> Specifically the ACPI 6.2 spec also has a notion of pinconf settings
> > >> and the docs on "PinConfig()"  say:
> > >>
> > >> Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
> > >> PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
> > >> property is specified by multiple descriptors for the same pins, the order in which these properties
> > >> are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
> > >> default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
> > >> PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
> > >> timeout, 0 should be used for GpioIo/GpioInt.*
> > >>
> > >> So that suggests that a value of 0 does not necessarily mean "disabled" but
> > >> it means use a default, or possibly get the value from somewhere else such
> > >> as from an ACPI PinConfig description (if present).
> > >
> > > Nope, it was added to get rid of the ambiguity when both Gpio*() and
> > > PinConfig() are given.
> > > So, 0 means default *if and only if* PinConfig() is present.
> > >
> > > I.o.w. the OS layers should do this:
> > >
> > >  - if Gpio*() provides Debounce != 0, we use it, otherwise
> > >  - if PinConfig() is present for this pin with a debounce set, use it, otherwise
> > >  - debounce is disabled.
> > >
> > > Now, the Linux kernel is missing an implementation of the middle entry,
> > > hence we fall through to the last one, i.e. disable debounce.
> > > But that should rather be done in gpiolib-acpi.c.
> > >
> > > Hope this helps.
> > >
> > > I Cc'ed this to Mika as co-author of that part of specification, he
> > > may correct me if I'm wrong.
> >
> > I see, so then the right thing to do for the bug which we are seeing
> > on some AMD platforms would be to honor the debounce setting, I guess?
> >
> > Can you and/or Mika write a patch(set) for this?
> 
> I will look at it, but meanwhile I would postpone it until I have
> Mika's Ack that my understanding and the proposed course of action
> are correct.

From what I recall this sounds correct :)

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 16:09                                               ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-10-30  4:54                                                 ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-30  4:54 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Mika Westerberg, Linus Walleij,
	open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

On Tue, Oct 27, 2020 at 06:09:49PM +0200, Andy Shevchenko wrote:
>On Tue, Oct 27, 2020 at 6:01 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> On 10/27/20 4:13 PM, Andy Shevchenko wrote:
>> > On Tue, Oct 27, 2020 at 4:31 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> >> On 10/26/20 11:54 PM, Coiby Xu wrote:
>> >>> Hi Hans and Linus,
>> >>>
>> >>> Will you interpret the 0x0000 value for debounce timeout in GPIO
>> >>> Interrupt Connection Resource Descriptor as disabling debouncing
>> >>> filter?
>> >>>
>> >>> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
>> >>> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
>> >>>
>> >>> I'm not sure if Windows' implementation is the de facto standard like
>> >>> i2c-hid. But if we are going to conform to the ACPI specs and we would
>> >>> regard 0x0000 debounce timeout as disabling debouncing filter, then we
>> >>> can fix this touchpad issue and potentially some related issues by
>> >>> implementing the feature of supporting configuring debounce timeout in
>> >>> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
>> >>> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
>> >>> What do you think?
>> >>>
>> >>> A favorable piece of evidence: I've collected five DSDT tables when
>> >>> investigating this issue. All 5 DSDT tables have a GpioInt specifying
>> >>> a non-zero debounce timeout value for the edge type irq and for all
>> >>> the level type irq, the debounce timeout is set to 0x0000.
>> >>
>> >> That is a very interesting observation and this matches with my
>> >> instincts which say that we should just disable the debounce filter
>> >> for level triggered interrupts in pinctrl-amd.c
>> >>
>> >> Yes that is a bit of a shortcut vs reading the value from the ACPI
>> >> table, but I'm not sure that 0 always means disabled.
>> >>
>> >> Specifically the ACPI 6.2 spec also has a notion of pinconf settings
>> >> and the docs on "PinConfig()"  say:
>> >>
>> >> Note: There is some overlap between the properties set by GpioIo/GpioInt/ PinFunction and
>> >> PinConfig descriptors. For example, both are setting properties such as pull-ups. If the same
>> >> property is specified by multiple descriptors for the same pins, the order in which these properties
>> >> are applied is undetermined. To avoid any conflicts, GpioInt/GpioIo/PinFunction should provide a
>> >> default value for these properties when PinConfig is used. If PinConfig is used to set pin bias,
>> >> PullDefault should be used for GpioIo/GpioInt/ PinFunction. *If PinConfig is used to set debounce
>> >> timeout, 0 should be used for GpioIo/GpioInt.*
>> >>
>> >> So that suggests that a value of 0 does not necessarily mean "disabled" but
>> >> it means use a default, or possibly get the value from somewhere else such
>> >> as from an ACPI PinConfig description (if present).
>> >
>> > Nope, it was added to get rid of the ambiguity when both Gpio*() and
>> > PinConfig() are given.
>> > So, 0 means default *if and only if* PinConfig() is present.
>> >
>> > I.o.w. the OS layers should do this:
>> >
>> >  - if Gpio*() provides Debounce != 0, we use it, otherwise
>> >  - if PinConfig() is present for this pin with a debounce set, use it, otherwise
>> >  - debounce is disabled.
>> >
>> > Now, the Linux kernel is missing an implementation of the middle entry,
>> > hence we fall through to the last one, i.e. disable debounce.
>> > But that should rather be done in gpiolib-acpi.c.
>> >
>> > Hope this helps.
>> >
>> > I Cc'ed this to Mika as co-author of that part of specification, he
>> > may correct me if I'm wrong.
>>
>> I see, so then the right thing to do for the bug which we are seeing
>> on some AMD platforms would be to honor the debounce setting, I guess?
>>
>> Can you and/or Mika write a patch(set) for this?
>
>I will look at it, but meanwhile I would postpone it until I have
>Mika's Ack that my understanding and the proposed course of action
>are correct.
>
If you don't mind, let me write this patch(set) instead :) I'm itching
to fix this touchpad issue myself after spending about a month of
my internship at the Linux Foundation investigating it.
There are many enthusiastic Linux users waiting to get their touchpads
fixed and I could prioritize this task since I don't have other
obligations. I have provided a fallback solution [1] to save their
touchpads, but it seems patches to gpiolib-acpi.c and pinctrl-amd could
reach the mainline kernel much earlier.

[1] https://lore.kernel.org/patchwork/patch/1323245/
>> > P.S. Does RedHat have a representative in ASWG?
>>
>> I think so yes, but mainly focussed on server related things I guess...
>
>
>--
>With Best Regards,
>Andy Shevchenko

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27  9:52                                         ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-10-30  4:58                                           ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-10-30  4:58 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 27, 2020 at 11:52:14AM +0200, Andy Shevchenko wrote:
>On Tue, Oct 27, 2020 at 2:07 AM Coiby Xu <coiby.xu@gmail.com> wrote:
>>
>> Hi Hans and Linus,
>>
>> Will you interpret the 0x0000 value for debounce timeout in GPIO
>> Interrupt Connection Resource Descriptor as disabling debouncing
>> filter?
>>
>> GpioInt (EdgeLevel, ActiveLevel, Shared, PinConfig, DebounceTimeout, ResourceSource,
>> ResourceSourceIndex, ResourceUsage, DescriptorName, VendorData) {PinList}
>
>According to the spec
>
>DebounceTimeout is an optional argument specifying the debounce wait
>time, in hundredths of
>milliseconds. The bit field name _DBT is automatically created to
>refer to this portion of the resource
>descriptor.
>
>I interpret this as 0 == no debounce (or a minimum that hardware has
>if there is no possibility to disable).

Thanks for the explanation!
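
Since the quoted spec text gives DebounceTimeout in hundredths of a
millisecond, one count is 10 usec. A minimal sketch of that conversion
follows; acpi_dbt_to_usecs() is a made-up helper name for illustration,
not an existing kernel function.

```c
#include <stdint.h>

/*
 * Convert an ACPI DebounceTimeout (_DBT) value, expressed in hundredths
 * of a millisecond, to microseconds. A result of 0 is read here as
 * "no debounce", following the interpretation given above.
 */
static uint32_t acpi_dbt_to_usecs(uint16_t dbt)
{
	return (uint32_t)dbt * 10u;
}
```

For example, a descriptor value of 100 corresponds to 1000 usec (1 ms),
and the 0x0000 seen for the level type irqs stays 0.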
>
>> I'm not sure if Windows' implementation is the de facto standard like
>> i2c-hid. But if we are going to conform to the ACPI specs and we would
>> regard 0x0000 debounce timeout as disabling debouncing filter, then we
>> can fix this touchpad issue and potentially some related issues by
>> implementing the feature of supporting configuring debounce timeout in
>> drivers/gpio/gpiolib-acpi.c and removing all debounce filter
>> configuration in amd_gpio_irq_set_type of drivers/pinctrl/pinctrl-amd.c.
>> What do you think?
>>
>> One piece of supporting evidence: I've collected five DSDT tables while
>> investigating this issue. All 5 DSDT tables have a GpioInt specifying
>> a non-zero debounce timeout value for the edge type irq, and for all
>> the level type irqs the debounce timeout is set to 0x0000.
>
>For future mails: please do not top-post.
>And please remove the huge amount of unrelated lines in the reply.
>
Thank you for the suggestion!
>--
>With Best Regards,
>Andy Shevchenko

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-30  4:54                                                 ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-11-02 19:06                                                   ` Andy Shevchenko
  -1 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-11-02 19:06 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Hans de Goede, Mika Westerberg, Linus Walleij,
	open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

On Fri, Oct 30, 2020 at 6:54 AM Coiby Xu <coiby.xu@gmail.com> wrote:
> On Tue, Oct 27, 2020 at 06:09:49PM +0200, Andy Shevchenko wrote:
> >On Tue, Oct 27, 2020 at 6:01 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >> On 10/27/20 4:13 PM, Andy Shevchenko wrote:

...

> >> I see, so then the right thing to do for the bug which we are seeing
> >> on some AMD platforms would be to honor the debounce setting I guess ?
> >>
> >> Can you and/or Mika write a patch(set) for this ?
> >
> >I will look at it, but meanwhile I would postpone until having a
> >Mika's Ack on the action that my understanding and course of actions
> >is correct.

I will soon send a support patch against ACPI GPIO library code.

> If you don't mind, let me write this patch(set) instead:)

I leave whatever AMD code to you. It will suit both our interests :-)

> I feel itchy
> to fix this touchpad issue by myself after spending about a month of
> my internship at Linux Foundation investigating this touchpad issue.
> There are many enthusiastic Linux users waiting to get their touchpads
> fixed and I could prioritize this task since I don't have other
> obligations. I have provided a fallback solution [1] to save their
> touchpads but it seems patches on gpiolib-acpi.c and pinctrl-amd could
> reach mainline kernel much earlier.
>
> [1] https://lore.kernel.org/patchwork/patch/1323245/


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-11-02 19:06                                                   ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-11-02 22:56                                                     ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-11-02 22:56 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Hans de Goede, Mika Westerberg, Linus Walleij,
	open list:GPIO SUBSYSTEM, wang jun, Nehal Shah, Shyam Sundar S K,
	linux-kernel-mentees

On Mon, Nov 02, 2020 at 09:06:23PM +0200, Andy Shevchenko wrote:
>On Fri, Oct 30, 2020 at 6:54 AM Coiby Xu <coiby.xu@gmail.com> wrote:
>> On Tue, Oct 27, 2020 at 06:09:49PM +0200, Andy Shevchenko wrote:
>> >On Tue, Oct 27, 2020 at 6:01 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> >> On 10/27/20 4:13 PM, Andy Shevchenko wrote:
>
>...
>
>> >> I see, so then the right thing to do for the bug which we are seeing
>> >> on some AMD platforms would be to honor the debounce setting I guess ?
>> >>
>> >> Can you and/or Mika write a patch(set) for this ?
>> >
>> >I will look at it, but meanwhile I would postpone until having a
>> >Mika's Ack on the action that my understanding and course of actions
>> >is correct.
>
>I will soon send a support patch against ACPI GPIO library code.
>
>> If you don't mind, let me write this patch(set) instead:)
>
>I leave whatever AMD code to you. It will suit both our interests :-)

Excellent! Thank you!
>
>> I feel itchy
>> to fix this touchpad issue by myself after spending about a month of
>> my internship at Linux Foundation investigating this touchpad issue.
>> There are many enthusiastic Linux users waiting to get their touchpads
>> fixed and I could prioritize this task since I don't have other
>> obligations. I have provided a fallback solution [1] to save their
>> touchpads but it seems patches on gpiolib-acpi.c and pinctrl-amd could
>> reach mainline kernel much earlier.
>>
>> [1] https://lore.kernel.org/patchwork/patch/1323245/
>
>
>--
>With Best Regards,
>Andy Shevchenko

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-10-27 10:09                                         ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-11-03  0:05                                           ` Coiby Xu
  -1 siblings, 0 replies; 84+ messages in thread
From: Coiby Xu @ 2020-11-03  0:05 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Linus Walleij, Andy Shevchenko, open list:GPIO SUBSYSTEM,
	wang jun, Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:
>Hi,
>
...
>
>So I see 2 ways to move forward with this:
>
>1. Just disable the debounce filter for level type IRQs; or
>2. Add a helper to sanitize the debounce pulse-duration setting and
>   call that when setting the IRQ type.
>   This helper would read the setting, check that it is not crazy long for
>   an IRQ-line (let's say anything above 1 ms is crazy long) and if it
>   is crazy long then overwrite it with a saner value.
>
>2. is a bit tricky, because if the IRQ line comes from a chip then
>obviously max 1ms debouncing to catch electrical interference should be
>fine. But sometimes cheap buttons for things like volume up/down on tablets
>are directly connected to GPIOs and then we may want longer debouncing...
>
>So if we do 2. we may want to limit it to only level type IRQs too.
>
>Note I have contacted AMD about this and asked them for some input on this,
>ideally they can tell us how exactly we should program the debounce filter
>and based on which data we should do that.
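
For reference, option 2 above could look roughly like the sketch below.
It is an assumption-laden illustration, not pinctrl-amd code: the
function name and the macro are hypothetical, the 1 ms limit is the
"crazy long" threshold from the mail, and using that same limit as the
"saner value" is my own choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Anything above 1 ms is crazy long" for an IRQ line, per the mail. */
#define MAX_LEVEL_IRQ_DEBOUNCE_US 1000u

/*
 * Clamp an already-programmed debounce time. Only level type IRQs are
 * touched, so buttons wired straight to a GPIO (typically edge type)
 * keep their longer debounce.
 */
static uint32_t sanitize_debounce_us(uint32_t debounce_us, bool level_triggered)
{
	if (level_triggered && debounce_us > MAX_LEVEL_IRQ_DEBOUNCE_US)
		return MAX_LEVEL_IRQ_DEBOUNCE_US;
	return debounce_us;
}
```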

Is there any update from AMD? Based on the discussion, I'm going to
submit a patch to disable the debounce filter for both level and edge
type IRQs, i.e. to remove the relevant code in amd_gpio_irq_set_type of
drivers/pinctrl/pinctrl-amd.c, since setting the debounce filter is
orthogonal to setting the irq type and Andy has submitted the patch to
support the debounce setting supplied by ACPI in gpiolib-acpi.c.

Btw, did you contact AMD through a representative? Obviously CC'ing them
didn't get their attention. There is an inconsistency in configuring the
debounce timeout in pinctrl-amd, as was spotted by Andy [1]. I also need
their feedback on this matter.

[1] https://lore.kernel.org/patchwork/comment/1522675/
>
>Regards,
>
>Hans
>

--
Best regards,
Coiby

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-11-03  0:05                                           ` [Linux-kernel-mentees] " Coiby Xu
@ 2020-11-03 10:12                                             ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-11-03 10:12 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Linus Walleij, Andy Shevchenko, open list:GPIO SUBSYSTEM,
	wang jun, Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

Hi,

On 11/3/20 1:05 AM, Coiby Xu wrote:
> On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:
>> Hi,
>>
> ...
>>
>> So I see 2 ways to move forward with this:
>>
>> 1. Just disable the debounce filter for level type IRQs; or
>> 2. Add a helper to sanitize the debounce pulse-duration setting and
>>   call that when setting the IRQ type.
>>   This helper would read the setting, check that it is not crazy long for
>>   an IRQ-line (let's say anything above 1 ms is crazy long) and if it
>>   is crazy long then overwrite it with a saner value.
>>
>> 2. is a bit tricky, because if the IRQ line comes from a chip then
>> obviously max 1ms debouncing to catch electrical interference should be
>> fine. But sometimes cheap buttons for things like volume up/down on tablets
>> are directly connected to GPIOs and then we may want longer debouncing...
>>
>> So if we do 2. we may want to limit it to only level type IRQs too.
>>
>> Note I have contacted AMD about this and asked them for some input on this,
>> ideally they can tell us how exactly we should program the debounce filter
>> and based on which data we should do that.
> 
> Is there any update from AMD?

I'm afraid not.

> Based on the discussion, I'm going to
> submit a patch to disable debounce filter for both level and edge
> type IRQs, i.e. to remove relevant code in amd_gpio_irq_set_type of
> drivers/pinctrl/pinctrl-amd.c since setting debounce filter is
> orthogonal to setting irq type and Andy has submitted the patch to
> support setting debounce setting supplied by ACPI in gpiolib-acpi.c

Ok.

> Btw, did you contact AMD through a representative?

Yes, I'm using Red Hat's contacts in AMD's server department,
who are putting me in contact with AMD's client department.

> Obviously CC them
> didn't get their attention. There is an inconsistency for configuring
> debounce timeout in pinctrl-amd as was spotted by Andy [1]. I also need
> their feedback for this matter.
> 
> [1] https://lore.kernel.org/patchwork/comment/1522675/

This is a case where Andy is obviously right and you should just use the
higher precision "unit = 15625" value (except probably that is wrong too,
see below).

We have had similar issues with the docs for getting the TSC frequency
on some Intel chips, where the docs said 16.6 MHz for a certain register
value, where what they meant was 100/6 MHz which really is significantly
different. This was leading to a time drift of 5 minutes / day on non
networked (so no NTP) Linux systems.

I think this is what Andy was referring to when he wrote:
"What the heck with HW companies! (Just an emotion based on the experience)"

So the lesson learned there is: when you can be reasonably certain that
the value really is a/b, and the digits of the value in the hw doc
match that once the lousy precision is taken into account, then you
should probably assume the value really is a/b and not the lousy
precision value given in the docs (or the code comment in this case).

I mean 15.6 msec has 3 significant digits, which gives an imprecision /
error of approx. 1000 ppm, whereas a decent clock crystal is in the order
of 50 ppm. So the hardware has a drift / error of approx. 50 ppm, which
makes using a value with an error of 1000 ppm in the code really really
bad.
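
To put rough numbers on the precision argument, here is a small
userspace sketch. The helper names are made up for illustration; the
inputs are the figures mentioned in this mail (15.6 vs 15.625 msec, and
16.6 vs 100/6 MHz).

```c
/* Relative error of a documented value against the exact one, in ppm. */
static double rel_error_ppm(double documented, double exact)
{
	double diff = documented - exact;

	if (diff < 0)
		diff = -diff;
	return diff / exact * 1e6;
}

/* Time drift per day, in seconds, caused by a clock error given in ppm. */
static double drift_seconds_per_day(double ppm)
{
	return ppm * 1e-6 * 86400.0;
}
```

rel_error_ppm(15.6, 15.625) comes out at about 1600 ppm, the same order
of magnitude as the estimate above. rel_error_ppm(16.6, 100.0 / 6.0) is
about 4000 ppm, which drift_seconds_per_day() turns into roughly 346
seconds, in line with the several minutes per day seen in the TSC case.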

Actually all the values look somewhat suspect. The comment:

>                 Debounce        Debounce        Timer   Max
>                 TmrLarge        TmrOutUnit      Unit    Debounce
>                                                         Time
>                 0       0       61 usec (2 RtcClk)      976 usec
>                 0       1       244 usec (8 RtcClk)     3.9 msec
>                 1       0       15.6 msec (512 RtcClk)  250 msec
>                 1       1       62.5 msec (2048 RtcClk) 1 sec

Helpfully gives the values in RtcClks. A typical RTC clock crystal
is 32 kHz, which gives us 31.25 usec per tick, so I would expect the
values to be:

                 0       0       62.500 usec (2 RtcClk)      
                 0       1       250.00 usec (8 RtcClk)    
                 1       0       16.000 msec (512 RtcClk) 
                 1       1       64.000 msec (2048 RtcClk)

And the max multiplier seems to be 15, not 16 as is used for the
Max Debounce Times in the comment, so those are wrong too. I have
a feeling the table was built the wrong way around (minus the
RtcClk parts). They started with a Max Debounce Time of 1 sec, then
divided that by 16, giving them 62.5 msec, whereas in reality
we have 2048 ticks of a 32 kHz clock giving us 64 millisec, etc.
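
The expected values can be recomputed from the RtcClk tick counts with
a sketch like the one below. The function names are hypothetical. The
clock frequency is a parameter because it is not confirmed anywhere in
this thread: 32000 Hz is the figure assumed above, and 32768 Hz is a
further assumption of mine (a common watch-crystal frequency).

```c
#include <stdint.h>

/* Duration of `ticks` RtcClk cycles in nanoseconds, rounded to nearest. */
static uint64_t ticks_to_ns(uint32_t ticks, uint32_t rtc_hz)
{
	return ((uint64_t)ticks * 1000000000ull + rtc_hz / 2) / rtc_hz;
}

/*
 * Longest debounce time for one timer unit, given the largest multiplier
 * the register field allows (15 or 16, which is the open question above).
 */
static uint64_t max_debounce_ns(uint32_t unit_ticks, uint32_t rtc_hz,
				uint32_t max_mult)
{
	return ticks_to_ns(unit_ticks * max_mult, rtc_hz);
}
```

At 32000 Hz the units for 2, 8, 512 and 2048 ticks come out as 62.5 usec,
250 usec, 16 msec and 64 msec, as listed above, and
max_debounce_ns(2048, 32000, 15) gives 960 msec. At 32768 Hz the same
tick counts give about 61.04 usec, 244.14 usec, 15.625 msec and 62.5 msec,
and a multiplier of 16 on the largest unit gives exactly 1 sec, so which
clock the hardware really uses matters for deciding whether the comment
table is off.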

I also wonder if the 0-15 divider really is a 0-15 divider or a
1-16 divider... This suggests:

                if (debounce < 61) {
                        pin_reg |= 1;

It really is a 0-15 divider, so without docs we should just assume
that it is 0-15 for now, which makes the divide 1 second by 16 thing
which got them 62.5 msec (or so I believe) a bit suspect. Either the
divide by 16 is wrong, or the divider really is a 1-16 divider...

Regards,

Hans


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-11-03 10:12                                             ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-11-03 10:12 UTC (permalink / raw)
  To: Coiby Xu
  Cc: Shyam Sundar S K, Linus Walleij, wang jun,
	open list:GPIO SUBSYSTEM, Andy Shevchenko, linux-kernel-mentees,
	Nehal Shah

Hi,

On 11/3/20 1:05 AM, Coiby Xu wrote:
> On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:
>> Hi,
>>
> ...
>>
>> So I see 2 ways to move forward with this:
>>
>> 1. Just disable the debounce filter for level type IRQs; or
>> 2. Add a helper to sanitize the debounce pulse-duration setting and
>>   call that when setting the IRQ type.
>>   This helper would read the setting, check that it is not crazy long for
>>   an IRQ-line (let's say anything above 1 ms is crazy long) and if it
>>   is crazy long then overwrite it with a saner value.
>>
>> 2. is a bit tricky, because if the IRQ line comes from a chip then
>> obviously max 1ms debouncing to catch electrical interference should be
>> fine. But sometimes cheap buttons for things like volume up/down on tablets
>> are directly connected to GPIOs and then we may want longer debouncing...
>>
>> So if we do 2. we may want to limit it to only level type IRQs too.
>>
>> Note I have contacted AMD about this and asked them for some input on this,
>> ideally they can tell us how exactly we should program the debounce filter
>> and based on which data we should do that.
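
A minimal sketch of what the helper from option 2 could look like, working
purely on a debounce time in microseconds. The function name, the fallback
value and the idea of passing the IRQ type as a flag are all hypothetical and
are not taken from pinctrl-amd; only the 1 ms limit and the restriction to
level type IRQs come from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound from the discussion: anything above 1 ms is "crazy long". */
#define MAX_IRQ_DEBOUNCE_USEC	1000u
/* Hypothetical "saner" fallback; the real value would have to come from AMD. */
#define SANE_IRQ_DEBOUNCE_USEC	61u

static_assert(SANE_IRQ_DEBOUNCE_USEC <= MAX_IRQ_DEBOUNCE_USEC,
	      "the fallback must itself be within the limit");

/*
 * Return the debounce time that should be programmed for an IRQ line.
 * Edge type IRQs are left alone, because cheap buttons wired directly
 * to a GPIO may really need a long debounce time.
 */
unsigned int sanitize_irq_debounce_usec(unsigned int current_usec,
					bool level_irq)
{
	if (!level_irq)
		return current_usec;

	if (current_usec > MAX_IRQ_DEBOUNCE_USEC)
		return SANE_IRQ_DEBOUNCE_USEC;

	return current_usec;
}
```

With these assumptions a level type IRQ that firmware left at 250 msec would
be reduced to the fallback, while the same setting on an edge type IRQ would
be kept.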
> 
> Is there any update from AMD?

I'm afraid not.

> Based on the discussion, I'm going to
> submit a patch to disable debounce filter for both level and edge
> type IRQs, i.e. to remove relevant code in amd_gpio_irq_set_type of
> drivers/pinctrl/pinctrl-amd.c since setting debounce filter is
> orthogonal to setting irq type and Andy has submitted the patch to
> support setting debounce setting supplied by ACPI in gpiolib-acpi.c

Ok.

> Btw, did you contact AMD through a representative?

Yes, I'm using Red Hat's contacts in AMD's server department,
who are putting me in contact with AMD's client department.

> Obviously CCing them
> didn't get their attention. There is an inconsistency for configuring
> debounce timeout in pinctrl-amd as was spotted by Andy [1]. I also need
> their feedback for this matter.
> 
> [1] https://lore.kernel.org/patchwork/comment/1522675/

This is a case where Andy is obviously right and you should just use the
higher precision "unit = 15625" value (except probably that is wrong too,
see below).

We have had similar issues with the docs for getting the TSC frequency
on some Intel chips, where the docs said 16.6 MHz for a certain register
value, where what they meant was 100/6 MHz which really is significantly
different. This was leading to a time drift of 5 minutes / day on non
networked (so no NTP) Linux systems.

I think this is what Andy was referring to when he wrote:
"What the heck with HW companies! (Just an emotion based on the experience)"

So the lesson learned there is: when you can be reasonably certain that
the value really is a/b, and the resulting digits of the value in the
hw doc match that (taking the lousy precision into account), then you
should probably assume the value really is a/b and not the lousy
precision value given in the docs (or the code comment in this case).

I mean 15.6 msec has 3 significant digits, which gives an imprecision /
error of approx. 1000 ppm, whereas a decent clock crystal is in the order
of 50 ppm. So the hardware has a drift / error of approx. 50 ppm, which
makes using a value with an error of 1000 ppm in the code really really
bad.
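
The arithmetic behind both examples can be checked with a small sketch like
the one below. The function names are made up for illustration; the input
values (16.6 vs 100/6 MHz, 15.6 vs 15.625 msec) are the ones mentioned above.

```c
#include <assert.h>

/* Relative error of `assumed` versus `actual`, in parts per million. */
double error_ppm(double assumed, double actual)
{
	assert(actual != 0.0);
	return (assumed - actual) / actual * 1e6;
}

/* Seconds gained or lost per day by a clock that is off by `ppm`. */
double drift_seconds_per_day(double ppm)
{
	return ppm * 1e-6 * 86400.0;
}
```

error_ppm(16.6, 100.0 / 6.0) comes out at about -4000 ppm, which
drift_seconds_per_day() turns into roughly 346 seconds, in line with the
5 minutes / day figure. error_ppm(15.6, 15.625) is about -1600 ppm, so the
same order of magnitude as the approx. 1000 ppm estimate.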

Actually all the values look somewhat suspect. The comment:

>                 Debounce        Debounce        Timer   Max
>                 TmrLarge        TmrOutUnit      Unit    Debounce
>                                                         Time
>                 0       0       61 usec (2 RtcClk)      976 usec
>                 0       1       244 usec (8 RtcClk)     3.9 msec
>                 1       0       15.6 msec (512 RtcClk)  250 msec
>                 1       1       62.5 msec (2048 RtcClk) 1 sec

Helpfully gives the values in RtcClks. A typical RTC clock crystal
is 32 KHz which gives us 31.25 usec per tick, so I would expect the
values to be:

                 0       0       62.500 usec (2 RtcClk)      
                 0       1       250.00 usec (8 RtcClk)    
                 1       0       16.000 msec (512 RtcClk) 
                 1       1       64.000 msec (2048 RtcClk)

And the max multiplier seems to be 15, not 16 as is used for the
Max Debounce Times in the comment, so those are wrong too. I have
a feeling the table was built the wrong way around (minus the
RtcClk parts). They started with a Max Debounce Time of 1 sec, then
divided that by 16, giving them 62.5 msec, whereas in reality
we have 2048 ticks of a 32 KHz clock giving us 64 millisec, etc.

I also wonder if the 0-15 divider really is a 0-15 divider or a
1-16 divider... This code:

                if (debounce < 61) {
                        pin_reg |= 1;

suggests that it really is a 0-15 divider, so without docs we should just
assume that it is 0-15 for now. That makes the "divide 1 second by 16"
step which got them 62.5 msec (or so I believe) a bit suspect. Either the
divide by 16 is wrong, or the divider really is a 1-16 divider...
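
The multiplier question can be checked numerically with a sketch like this.
The function names are hypothetical, and the RTC frequency is a parameter
because its exact value is an assumption at this point.

```c
#include <assert.h>

/* Duration of `ticks` RTC clock ticks in usec for a crystal of `rtc_hz` Hz. */
double rtc_ticks_to_usec(unsigned int ticks, unsigned int rtc_hz)
{
	assert(rtc_hz > 0);
	return (double)ticks * 1e6 / (double)rtc_hz;
}

/*
 * Longest debounce time in usec for a timer unit of `unit_ticks` RTC
 * ticks when the largest multiplier is `max_mult` (15 or 16).
 */
double max_debounce_usec(unsigned int unit_ticks, unsigned int max_mult,
			 unsigned int rtc_hz)
{
	return rtc_ticks_to_usec(unit_ticks, rtc_hz) * (double)max_mult;
}
```

For the 2048 RtcClk unit, max_debounce_usec(2048, 16, 32768) is exactly
1000000 usec and max_debounce_usec(2048, 15, 32768) is 937500 usec. With
rtc_hz = 32000 the two results are 1024000 and 960000 usec.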

Regards,

Hans

_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-11-03 10:12                                             ` [Linux-kernel-mentees] " Hans de Goede
@ 2020-11-03 10:49                                               ` Andy Shevchenko
  -1 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-11-03 10:49 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Coiby Xu, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

On Tue, Nov 3, 2020 at 12:12 PM Hans de Goede <hdegoede@redhat.com> wrote:
> On 11/3/20 1:05 AM, Coiby Xu wrote:
> > On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:

...

> > [1] https://lore.kernel.org/patchwork/comment/1522675/
>
> This is a case where Andy is obviously right and you should just use the
> higher precision "unit = 15625" value (except probably that is wrong too,
> see below).
>
> We have had similar issues with the docs for getting the TSC frequency
> on some Intel chips, where the docs said 16.6 MHz for a certain register
> value, where what they meant was 100/6 MHz which really is significantly
> different. This was leading to a time drift of 5 minutes / day on non
> networked (so no NTP) Linux systems.
>
> I think this is what Andy was referring to when he wrote:
> "What the heck with HW companies! (Just an emotion based on the experience)"

Exactly!

...

> Actually all the values look somewhat suspect. The comment:
>
> >                 Debounce        Debounce        Timer   Max
> >                 TmrLarge        TmrOutUnit      Unit    Debounce
> >                                                         Time
> >                 0       0       61 usec (2 RtcClk)      976 usec
> >                 0       1       244 usec (8 RtcClk)     3.9 msec
> >                 1       0       15.6 msec (512 RtcClk)  250 msec
> >                 1       1       62.5 msec (2048 RtcClk) 1 sec
>
> Helpfully gives the values in RtcClks. A typical RTC clock crystal
> is 32 KHz which gives us 31.25 usec per tick, so I would expect the
> values to be:

I guess you are mistaken here. Usual frequency for RTC is 32.768kHz
[1], which gives more or less above values

30.51757
61.03515
244.14062
15625
62500

[1]: https://en.wikipedia.org/wiki/Real-time_clock
(just google: rtc clock frequency)
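
The values above can be reproduced with integer arithmetic, kernel style.
This is only a sketch: the function name is made up, and the result is
truncated to whole nanoseconds.

```c
#include <assert.h>

/* Nominal RTC crystal frequency: 2^15 Hz. */
#define RTC_HZ_NOMINAL	32768u

/* Duration of `ticks` RTC clock ticks in nanoseconds, rounded down. */
unsigned long long rtc_ticks_to_nsec(unsigned int ticks, unsigned int rtc_hz)
{
	assert(rtc_hz > 0);
	return (unsigned long long)ticks * 1000000000ULL / rtc_hz;
}
```

At RTC_HZ_NOMINAL, 1, 2, 8, 512 and 2048 ticks give 30517, 61035, 244140,
15625000 and 62500000 nsec. At 32000 Hz, 2 ticks would give 62500 nsec
instead.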

-- 
With Best Regards,
Andy Shevchenko


* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-11-03 10:49                                               ` Andy Shevchenko
  0 siblings, 0 replies; 84+ messages in thread
From: Andy Shevchenko @ 2020-11-03 10:49 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Shyam Sundar S K, Linus Walleij, Coiby Xu, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

On Tue, Nov 3, 2020 at 12:12 PM Hans de Goede <hdegoede@redhat.com> wrote:
> On 11/3/20 1:05 AM, Coiby Xu wrote:
> > On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:

...

> > [1] https://lore.kernel.org/patchwork/comment/1522675/
>
> This is a case where Andy is obviously right and you should just use the
> higher precision "unit = 15625" value (except probably that is wrong too,
> see below).
>
> We have had similar issues with the docs for getting the TSC frequency
> on some Intel chips, where the docs said 16.6 MHz for a certain register
> value, where what they meant was 100/6 MHz which really is significantly
> different. This was leading to a time drift of 5 minutes / day on non
> networked (so no NTP) Linux systems.
>
> I think this is what Andy was referring to when he wrote:
> "What the heck with HW companies! (Just an emotion based on the experience)"

Exactly!

...

> Actually all the values look somewhat suspect. The comment:
>
> >                 Debounce        Debounce        Timer   Max
> >                 TmrLarge        TmrOutUnit      Unit    Debounce
> >                                                         Time
> >                 0       0       61 usec (2 RtcClk)      976 usec
> >                 0       1       244 usec (8 RtcClk)     3.9 msec
> >                 1       0       15.6 msec (512 RtcClk)  250 msec
> >                 1       1       62.5 msec (2048 RtcClk) 1 sec
>
> Helpfully gives the values in RtcClks. A typical RTC clock crystal
> is 32 KHz which gives us 31.25 usec per tick, so I would expect the
> values to be:

I guess you are mistaken here. Usual frequency for RTC is 32.768kHz
[1], which gives more or less above values

30.51757
61.03515
244.14062
15625
62500

[1]: https://en.wikipedia.org/wiki/Real-time_clock
(just google: rtc clock frequency)

-- 
With Best Regards,
Andy Shevchenko


* Re: Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
  2020-11-03 10:49                                               ` [Linux-kernel-mentees] " Andy Shevchenko
@ 2020-11-03 11:00                                                 ` Hans de Goede
  -1 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-11-03 11:00 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Coiby Xu, Linus Walleij, open list:GPIO SUBSYSTEM, wang jun,
	Nehal Shah, Shyam Sundar S K, linux-kernel-mentees

Hi,

On 11/3/20 11:49 AM, Andy Shevchenko wrote:
> On Tue, Nov 3, 2020 at 12:12 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> On 11/3/20 1:05 AM, Coiby Xu wrote:
>>> On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:
> 
> ...
> 
>>> [1] https://lore.kernel.org/patchwork/comment/1522675/
>>
>> This is a case where Andy is obviously right and you should just use the
>> higher precision "unit = 15625" value (except probably that is wrong too,
>> see below).
>>
>> We have had similar issues with the docs for getting the TSC frequency
>> on some Intel chips, where the docs said 16.6 MHz for a certain register
>> value, where what they meant was 100/6 MHz which really is significantly
>> different. This was leading to a time drift of 5 minutes / day on non
>> networked (so no NTP) Linux systems.
>>
>> I think this is what Andy was referring to when he wrote:
>> "What the heck with HW companies! (Just an emotion based on the experience)"
> 
> Exactly!
> 
> ...
> 
>> Actually all the values look somewhat suspect. The comment:
>>
>>>                 Debounce        Debounce        Timer   Max
>>>                 TmrLarge        TmrOutUnit      Unit    Debounce
>>>                                                         Time
>>>                 0       0       61 usec (2 RtcClk)      976 usec
>>>                 0       1       244 usec (8 RtcClk)     3.9 msec
>>>                 1       0       15.6 msec (512 RtcClk)  250 msec
>>>                 1       1       62.5 msec (2048 RtcClk) 1 sec
>>
>> Helpfully gives the values in RtcClks. A typical RTC clock crystal
>> is 32 KHz which gives us 31.25 usec per tick, so I would expect the
>> values to be:
> 
> I guess you are mistaken here. Usual frequency for RTC is 32.768kHz
> [1], which gives more or less above values
> 
> 30.51757
> 61.03515
> 244.14062
> 15625
> 62500

You are completely right, my bad.

> [1]: https://en.wikipedia.org/wiki/Real-time_clock
> (just google: rtc clock frequency)

I did duckduckgo, but one of the first hits said 32KHz crystal and
I assumed that meant 32.000 KHz falling into the exact precision
trap I was complaining about in my previous email, oops.

Regards,

Hans




* Re: [Linux-kernel-mentees] Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model?
@ 2020-11-03 11:00                                                 ` Hans de Goede
  0 siblings, 0 replies; 84+ messages in thread
From: Hans de Goede @ 2020-11-03 11:00 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Shyam Sundar S K, Linus Walleij, Coiby Xu, wang jun,
	open list:GPIO SUBSYSTEM, linux-kernel-mentees, Nehal Shah

Hi,

On 11/3/20 11:49 AM, Andy Shevchenko wrote:
> On Tue, Nov 3, 2020 at 12:12 PM Hans de Goede <hdegoede@redhat.com> wrote:
>> On 11/3/20 1:05 AM, Coiby Xu wrote:
>>> On Tue, Oct 27, 2020 at 11:09:11AM +0100, Hans de Goede wrote:
> 
> ...
> 
>>> [1] https://lore.kernel.org/patchwork/comment/1522675/
>>
>> This is a case where Andy is obviously right and you should just use the
>> higher precision "unit = 15625" value (except probably that is wrong too,
>> see below).
>>
>> We have had similar issues with the docs for getting the TSC frequency
>> on some Intel chips, where the docs said 16.6 MHz for a certain register
>> value, where what they meant was 100/6 MHz which really is significantly
>> different. This was leading to a time drift of 5 minutes / day on non
>> networked (so no NTP) Linux systems.
>>
>> I think this is what Andy was referring to when he wrote:
>> "What the heck with HW companies! (Just an emotion based on the experience)"
> 
> Exactly!
> 
> ...
> 
>> Actually all the values look somewhat suspect. The comment:
>>
>>>                 Debounce        Debounce        Timer   Max
>>>                 TmrLarge        TmrOutUnit      Unit    Debounce
>>>                                                         Time
>>>                 0       0       61 usec (2 RtcClk)      976 usec
>>>                 0       1       244 usec (8 RtcClk)     3.9 msec
>>>                 1       0       15.6 msec (512 RtcClk)  250 msec
>>>                 1       1       62.5 msec (2048 RtcClk) 1 sec
>>
>> Helpfully gives the values in RtcClks. A typical RTC clock crystal
>> is 32 KHz which gives us 31.25 usec per tick, so I would expect the
>> values to be:
> 
> I guess you are mistaken here. Usual frequency for RTC is 32.768kHz
> [1], which gives more or less above values
> 
> 30.51757
> 61.03515
> 244.14062
> 15625
> 62500

You are completely right, my bad.

> [1]: https://en.wikipedia.org/wiki/Real-time_clock
> (just google: rtc clock frequency)

I did duckduckgo, but one of the first hits said 32KHz crystal and
I assumed that meant 32.000 KHz falling into the exact precision
trap I was complaining about in my previous email, oops.

Regards,

Hans




Thread overview: 84+ messages
2020-10-01 13:22 Any other ways to debug GPIO interrupt controller (pinctrl-amd) for broken touchpads of a new laptop model? Coiby Xu
2020-10-01 13:22 ` [Linux-kernel-mentees] " Coiby Xu
2020-10-01 20:57 ` Linus Walleij
2020-10-01 20:57   ` [Linux-kernel-mentees] " Linus Walleij
2020-10-02  9:40   ` Hans de Goede
2020-10-02  9:40     ` [Linux-kernel-mentees] " Hans de Goede
2020-10-02 12:42     ` Coiby Xu
2020-10-02 12:42       ` [Linux-kernel-mentees] " Coiby Xu
2020-10-02 13:36       ` Hans de Goede
2020-10-02 13:36         ` [Linux-kernel-mentees] " Hans de Goede
2020-10-02 14:51         ` Coiby Xu
2020-10-02 14:51           ` [Linux-kernel-mentees] " Coiby Xu
2020-10-02 19:44           ` Hans de Goede
2020-10-02 19:44             ` [Linux-kernel-mentees] " Hans de Goede
2020-10-02 22:45             ` Coiby Xu
2020-10-02 22:45               ` [Linux-kernel-mentees] " Coiby Xu
2020-10-03 13:22               ` Hans de Goede
2020-10-03 13:22                 ` [Linux-kernel-mentees] " Hans de Goede
2020-10-03 23:03                 ` Coiby Xu
2020-10-03 23:03                   ` [Linux-kernel-mentees] " Coiby Xu
2020-10-04  5:16                   ` Coiby Xu
2020-10-04  5:16                     ` [Linux-kernel-mentees] " Coiby Xu
2020-10-06  4:49                     ` Coiby Xu
2020-10-06  4:49                       ` [Linux-kernel-mentees] " Coiby Xu
2020-10-06  6:28                       ` Hans de Goede
2020-10-06  6:28                         ` [Linux-kernel-mentees] " Hans de Goede
2020-10-06  8:31                         ` Coiby Xu
2020-10-06  8:31                           ` [Linux-kernel-mentees] " Coiby Xu
2020-10-06  8:55                           ` Hans de Goede
2020-10-06  8:55                             ` [Linux-kernel-mentees] " Hans de Goede
2020-10-06  9:28                             ` Hans de Goede
2020-10-06  9:28                               ` [Linux-kernel-mentees] " Hans de Goede
2020-10-06  9:29                               ` Hans de Goede
2020-10-06  9:29                                 ` [Linux-kernel-mentees] " Hans de Goede
2020-10-08 16:32                                 ` Coiby Xu
2020-10-08 16:32                                   ` [Linux-kernel-mentees] " Coiby Xu
2020-10-14  4:24                                 ` Coiby Xu
2020-10-14  4:24                                   ` [Linux-kernel-mentees] " Coiby Xu
2020-10-14 11:34                                   ` Coiby Xu
2020-10-14 11:34                                     ` [Linux-kernel-mentees] " Coiby Xu
2020-10-14 11:46                                   ` Hans de Goede
2020-10-14 11:46                                     ` [Linux-kernel-mentees] " Hans de Goede
2020-10-15  3:27                                     ` Coiby Xu
2020-10-15  3:27                                       ` [Linux-kernel-mentees] " Coiby Xu
2020-10-15  4:06                                     ` Coiby Xu
2020-10-15  4:06                                       ` [Linux-kernel-mentees] " Coiby Xu
2020-10-26 22:54                                     ` Coiby Xu
2020-10-26 22:54                                       ` [Linux-kernel-mentees] " Coiby Xu
2020-10-27  9:52                                       ` Andy Shevchenko
2020-10-27  9:52                                         ` [Linux-kernel-mentees] " Andy Shevchenko
2020-10-30  4:58                                         ` Coiby Xu
2020-10-30  4:58                                           ` [Linux-kernel-mentees] " Coiby Xu
2020-10-27 10:09                                       ` Hans de Goede
2020-10-27 10:09                                         ` [Linux-kernel-mentees] " Hans de Goede
2020-10-27 15:13                                         ` Andy Shevchenko
2020-10-27 15:13                                           ` [Linux-kernel-mentees] " Andy Shevchenko
2020-10-27 16:00                                           ` Hans de Goede
2020-10-27 16:00                                             ` [Linux-kernel-mentees] " Hans de Goede
2020-10-27 16:09                                             ` Andy Shevchenko
2020-10-27 16:09                                               ` [Linux-kernel-mentees] " Andy Shevchenko
2020-10-29  8:04                                               ` Mika Westerberg
2020-10-29  8:04                                                 ` [Linux-kernel-mentees] " Mika Westerberg
2020-10-30  4:54                                               ` Coiby Xu
2020-10-30  4:54                                                 ` [Linux-kernel-mentees] " Coiby Xu
2020-11-02 19:06                                                 ` Andy Shevchenko
2020-11-02 19:06                                                   ` [Linux-kernel-mentees] " Andy Shevchenko
2020-11-02 22:56                                                   ` Coiby Xu
2020-11-02 22:56                                                     ` [Linux-kernel-mentees] " Coiby Xu
2020-11-03  0:05                                         ` Coiby Xu
2020-11-03  0:05                                           ` [Linux-kernel-mentees] " Coiby Xu
2020-11-03 10:12                                           ` Hans de Goede
2020-11-03 10:12                                             ` [Linux-kernel-mentees] " Hans de Goede
2020-11-03 10:49                                             ` Andy Shevchenko
2020-11-03 10:49                                               ` [Linux-kernel-mentees] " Andy Shevchenko
2020-11-03 11:00                                               ` Hans de Goede
2020-11-03 11:00                                                 ` [Linux-kernel-mentees] " Hans de Goede
2020-10-08 16:26                               ` Coiby Xu
2020-10-08 16:26                                 ` [Linux-kernel-mentees] " Coiby Xu
2020-10-06  9:16                           ` Linus Walleij
2020-10-06  9:16                             ` [Linux-kernel-mentees] " Linus Walleij
2020-10-08 16:40                             ` Coiby Xu
2020-10-08 16:40                               ` [Linux-kernel-mentees] " Coiby Xu
2020-10-02 10:59   ` Coiby Xu
2020-10-02 10:59     ` [Linux-kernel-mentees] " Coiby Xu
