linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Designware UART bug
@ 2017-05-03 10:17 Olliver Schinagl
  2017-05-03 10:40 ` [linux-sunxi] " Chen-Yu Tsai
  0 siblings, 1 reply; 7+ messages in thread
From: Olliver Schinagl @ 2017-05-03 10:17 UTC (permalink / raw)
  To: jamie, tim.kryger; +Cc: linux-kernel, dev, Maxime Ripard

Hey Jamie,

Several years ago you wrote the glue-code [0] for the DW 8250 IP. Over 
the years various 'fixes' have been applied to resolve certain 'weird' 
problems that Tim tried to fix with [1].

After going over the datasheets and code with a comb several times now, 
I think I may have found one (of a few others) reasons and would like 
both your and Tim's thoughts here.

The current (and original) code [2] uses the register offset 0x1f for 
the UART_USR register.

I searched far and wide, various datasheet of physical uarts (8250 - 
16950) and some IP cores and none seem to have the USR (and specifically 
the USR[0] bit) register, so it seems to be specific to the DW_apb_uart. 
However looking at the only databook available to me [3] of the UART IP, 
I cannot seem to find anything at register offset 0x1f.

The platform I'm using uses the Allwinner A20 SoC, which also features 
the DW uart IP and also here, there is nothing at offset 0x1f.

The intended register IS however actually at 0x7c.

My question is thus twofold.

Why was 0x1f used? Is this specific to a certain (version) UART or is 
this a long uncaught typo.

Tim, assuming it is a typo, could this the cause which made you 
implement [1]? From what I understand, you keep writing the LCR until it 
takes.

Initially, the UART_IIR_BUSY check looked like this:
	if (serial8250_handle_irq(p, iir)) {
                 return 1;
         } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
                 /* Clear the USR and write the LCR again. */
                 (void)p->serial_in(p, d->usr_reg);
                 p->serial_out(p, UART_LCR, d->last_lcr);

                 return 1;
         }

what I'm missing here is, that if UART_IIR_BUSY is set, we have:
* check the d->usr_reg (UART_USR) bit 0
* wait until it becomes cleared (do not allow new data to be pushed out, 
optionally force the data to be pushed out)
* write LCR register (and check if it took (and no longer loop over the 
LCR to see if the values stuck, in theory).
* if we never get un-busy, something is wrong?

All of this btw is currently moot anyway, as the only way to get into 
the (else) if, is if serial8250_handle_irq returns false. And from what 
I can see, this is only if iir == UART_IIR_NO_IRQ, in which case we 
never ever clear the USR and thus never ever clear the UART_IIR_BUSY flag.

Olliver


[0] 
https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760

[1] 
https://github.com/torvalds/linux/commit/c49436b657d0a56a6ad90d14a7c3041add7cf64d

[2] 
https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760#diff-d6e619fc4c51febf7880632fde5d0208R63

[3] http://linux-sunxi.org/images/d/d2/Dw_apb_uart_db.pdf

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-03 10:17 Designware UART bug Olliver Schinagl
@ 2017-05-03 10:40 ` Chen-Yu Tsai
  2017-05-03 11:26   ` Olliver Schinagl
  0 siblings, 1 reply; 7+ messages in thread
From: Chen-Yu Tsai @ 2017-05-03 10:40 UTC (permalink / raw)
  To: Oliver Schinagl; +Cc: jamie, tim.kryger, linux-kernel, dev, Maxime Ripard

On Wed, May 3, 2017 at 6:17 PM, Olliver Schinagl <oliver@schinagl.nl> wrote:
> Hey Jamie,
>
> Several years ago you wrote the glue-code [0] for the DW 8250 IP. Over the
> years various 'fixes' have been applied to resolve certain 'weird' problems
> that Tim tried to fix with [1].
>
> After going over the datasheets and code with a comb several times now, I
> think I may have found one (of a few others) reasons and would like both
> your and Tim's thoughts here.
>
> The current (and original) code [2] uses the register offset 0x1f for the
> UART_USR register.
>
> I searched far and wide, various datasheet of physical uarts (8250 - 16950)
> and some IP cores and none seem to have the USR (and specifically the USR[0]
> bit) register, so it seems to be specific to the DW_apb_uart. However
> looking at the only databook available to me [3] of the UART IP, I cannot
> seem to find anything at register offset 0x1f.
>
> The platform I'm using uses the Allwinner A20 SoC, which also features the
> DW uart IP and also here, there is nothing at offset 0x1f.
>
> The intended register IS however actually at 0x7c.
>
> My question is thus twofold.
>
> Why was 0x1f used? Is this specific to a certain (version) UART or is this a
> long uncaught typo.

The original 8250 datasheets have registers at byte offsets. Note that the
registers are only 8 bits wide.

On Allwinner, and many other platforms, the registers are still 8 bits
wide, but are 32bit-aligned, likely for aligned bus access. 0x7c = 0x1f << 2.
The 8250 core accessor functions are supposed to handle this for you.

ChenYu

> Tim, assuming it is a typo, could this the cause which made you implement
> [1]? From what I understand, you keep writing the LCR until it takes.
>
> Initially, the UART_IIR_BUSY check looked like this:
>         if (serial8250_handle_irq(p, iir)) {
>                 return 1;
>         } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
>                 /* Clear the USR and write the LCR again. */
>                 (void)p->serial_in(p, d->usr_reg);
>                 p->serial_out(p, UART_LCR, d->last_lcr);
>
>                 return 1;
>         }
>
> what I'm missing here is, that if UART_IIR_BUSY is set, we have:
> * check the d->usr_reg (UART_USR) bit 0
> * wait until it becomes cleared (do not allow new data to be pushed out,
> optionally force the data to be pushed out)
> * write LCR register (and check if it took (and no longer loop over the LCR
> to see if the values stuck, in theory).
> * if we never get un-busy, something is wrong?
>
> All of this btw is currently moot anyway, as the only way to get into the
> (else) if, is if serial8250_handle_irq returns false. And from what I can
> see, this is only if iir == UART_IIR_NO_IRQ, in which case we never ever
> clear the USR and thus never ever clear the UART_IIR_BUSY flag.
>
> Olliver
>
>
> [0]
> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760
>
> [1]
> https://github.com/torvalds/linux/commit/c49436b657d0a56a6ad90d14a7c3041add7cf64d
>
> [2]
> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760#diff-d6e619fc4c51febf7880632fde5d0208R63
>
> [3] http://linux-sunxi.org/images/d/d2/Dw_apb_uart_db.pdf
>
> --
> You received this message because you are subscribed to the Google Groups
> "linux-sunxi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to linux-sunxi+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-03 10:40 ` [linux-sunxi] " Chen-Yu Tsai
@ 2017-05-03 11:26   ` Olliver Schinagl
  2017-05-03 14:22     ` Tim Kryger
  0 siblings, 1 reply; 7+ messages in thread
From: Olliver Schinagl @ 2017-05-03 11:26 UTC (permalink / raw)
  To: Chen-Yu Tsai; +Cc: jamie, tim.kryger, linux-kernel, dev, Maxime Ripard

Hey Chen-Yu

On 03-05-17 12:40, Chen-Yu Tsai wrote:
> On Wed, May 3, 2017 at 6:17 PM, Olliver Schinagl <oliver@schinagl.nl> wrote:
>> Hey Jamie,
>>
>> Several years ago you wrote the glue-code [0] for the DW 8250 IP. Over the
>> years various 'fixes' have been applied to resolve certain 'weird' problems
>> that Tim tried to fix with [1].
>>
>> After going over the datasheets and code with a comb several times now, I
>> think I may have found one (of a few others) reasons and would like both
>> your and Tim's thoughts here.
>>
>> The current (and original) code [2] uses the register offset 0x1f for the
>> UART_USR register.
>>
>> I searched far and wide, various datasheet of physical uarts (8250 - 16950)
>> and some IP cores and none seem to have the USR (and specifically the USR[0]
>> bit) register, so it seems to be specific to the DW_apb_uart. However
>> looking at the only databook available to me [3] of the UART IP, I cannot
>> seem to find anything at register offset 0x1f.
>>
>> The platform I'm using uses the Allwinner A20 SoC, which also features the
>> DW uart IP and also here, there is nothing at offset 0x1f.
>>
>> The intended register IS however actually at 0x7c.
>>
>> My question is thus twofold.
>>
>> Why was 0x1f used? Is this specific to a certain (version) UART or is this a
>> long uncaught typo.
>
> The original 8250 datasheets have registers at byte offsets. Note that the
> registers are only 8 bits wide.
Yes, I did account for that difference :) Or rather, it should be the 
first register after the scratchpad, but there is nothing after the 
scratchpad on the 8250's.
>
> On Allwinner, and many other platforms, the registers are still 8 bits
> wide, but are 32bit-aligned, likely for aligned bus access. 0x7c = 0x1f << 2.
> The 8250 core accessor functions are supposed to handle this for you.
Actually, they are 32 bit registers, but only the lowest 8 are used, to 
remain code-compatible with true 8 bitters. The end result, is the same 
of course.

As for the shift, I see now!

readb(p->membase + (offset << p->regshift));

And indeed, the regular defines are all indeed 8 bit offsets. Oversight 
on my part, and I changed the comment to make this slightly more clear, 
which goes into my greater uart series.

Thanks Chen-Yu for pointing this oversight of mine out!

>
> ChenYu
>
>> Tim, assuming it is a typo, could this the cause which made you implement
>> [1]? From what I understand, you keep writing the LCR until it takes.
So with ChenYu's explanation, the USR register holds a valid status, but 
it was and is never checked and thus the below part is still a valid 
question?

>>
>> Initially, the UART_IIR_BUSY check looked like this:
>>         if (serial8250_handle_irq(p, iir)) {
>>                 return 1;
>>         } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
>>                 /* Clear the USR and write the LCR again. */
>>                 (void)p->serial_in(p, d->usr_reg);
>>                 p->serial_out(p, UART_LCR, d->last_lcr);
>>
>>                 return 1;
>>         }
>>
>> what I'm missing here is, that if UART_IIR_BUSY is set, we have to:
>> * check the d->usr_reg (UART_USR) bit 0
>> * wait until it becomes cleared (do not allow new data to be pushed out,
>> optionally force the data to be pushed out after a timeout)
>> * write LCR register (and check if it took (and no longer loop over the LCR
>> to see if the values stuck, in theory)).
>> * if we never get un-busy, something is wrong?
>>
>> All of this btw is currently moot anyway, as the only way to get into the
>> (else) if, is if serial8250_handle_irq returns false. And from what I can
>> see, this is only if iir == UART_IIR_NO_IRQ, in which case we never ever
>> clear the USR and thus never ever clear the UART_IIR_BUSY flag.
especially this point is important I suppose, hence the need for the 
workaround [1].

Olliver
>>
>> Olliver
>>
>>
>> [0]
>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760
>>
>> [1]
>> https://github.com/torvalds/linux/commit/c49436b657d0a56a6ad90d14a7c3041add7cf64d
>>
>> [2]
>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760#diff-d6e619fc4c51febf7880632fde5d0208R63
>>
>> [3] http://linux-sunxi.org/images/d/d2/Dw_apb_uart_db.pdf
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "linux-sunxi" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to linux-sunxi+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-03 11:26   ` Olliver Schinagl
@ 2017-05-03 14:22     ` Tim Kryger
  2017-05-03 15:40       ` Olliver Schinagl
  0 siblings, 1 reply; 7+ messages in thread
From: Tim Kryger @ 2017-05-03 14:22 UTC (permalink / raw)
  To: Olliver Schinagl
  Cc: Chen-Yu Tsai, Jamie Iles, Tim Kryger, linux-kernel, dev, Maxime Ripard

On Wed, May 3, 2017 at 4:26 AM, Olliver Schinagl <oliver@schinagl.nl> wrote:
> Hey Chen-Yu
>
> On 03-05-17 12:40, Chen-Yu Tsai wrote:
>>
>> On Wed, May 3, 2017 at 6:17 PM, Olliver Schinagl <oliver@schinagl.nl>
>> wrote:
>>>
>>> Hey Jamie,
>>>
>>> Several years ago you wrote the glue-code [0] for the DW 8250 IP. Over
>>> the
>>> years various 'fixes' have been applied to resolve certain 'weird'
>>> problems
>>> that Tim tried to fix with [1].
>>>
>>> After going over the datasheets and code with a comb several times now, I
>>> think I may have found one (of a few others) reasons and would like both
>>> your and Tim's thoughts here.
>>>
>>> The current (and original) code [2] uses the register offset 0x1f for the
>>> UART_USR register.
>>>
>>> I searched far and wide, various datasheet of physical uarts (8250 -
>>> 16950)
>>> and some IP cores and none seem to have the USR (and specifically the
>>> USR[0]
>>> bit) register, so it seems to be specific to the DW_apb_uart. However
>>> looking at the only databook available to me [3] of the UART IP, I cannot
>>> seem to find anything at register offset 0x1f.
>>>
>>> The platform I'm using uses the Allwinner A20 SoC, which also features
>>> the
>>> DW uart IP and also here, there is nothing at offset 0x1f.
>>>
>>> The intended register IS however actually at 0x7c.
>>>
>>> My question is thus twofold.
>>>
>>> Why was 0x1f used? Is this specific to a certain (version) UART or is
>>> this a
>>> long uncaught typo.
>>
>>
>> The original 8250 datasheets have registers at byte offsets. Note that the
>> registers are only 8 bits wide.
>
> Yes, I did account for that difference :) Or rather, it should be the first
> register after the scratchpad, but there is nothing after the scratchpad on
> the 8250's.
>>
>>
>> On Allwinner, and many other platforms, the registers are still 8 bits
>> wide, but are 32bit-aligned, likely for aligned bus access. 0x7c = 0x1f <<
>> 2.
>> The 8250 core accessor functions are supposed to handle this for you.
>
> Actually, they are 32 bit registers, but only the lowest 8 are used, to
> remain code-compatible with true 8 bitters. The end result, is the same of
> course.
>
> As for the shift, I see now!
>
> readb(p->membase + (offset << p->regshift));
>
> And indeed, the regular defines are all indeed 8 bit offsets. Oversight on
> my part, and I changed the comment to make this slightly more clear, which
> goes into my greater uart series.
>
> Thanks Chen-Yu for pointing this oversight of mine out!
>
>>
>> ChenYu
>>
>>> Tim, assuming it is a typo, could this the cause which made you implement
>>> [1]? From what I understand, you keep writing the LCR until it takes.
>
> So with ChenYu's explanation, the USR register holds a valid status, but it
> was and is never checked and thus the below part is still a valid question?
>
>>>
>>> Initially, the UART_IIR_BUSY check looked like this:
>>>         if (serial8250_handle_irq(p, iir)) {
>>>                 return 1;
>>>         } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
>>>                 /* Clear the USR and write the LCR again. */
>>>                 (void)p->serial_in(p, d->usr_reg);
>>>                 p->serial_out(p, UART_LCR, d->last_lcr);
>>>
>>>                 return 1;
>>>         }
>>>
>>> what I'm missing here is, that if UART_IIR_BUSY is set, we have to:
>>> * check the d->usr_reg (UART_USR) bit 0
>>> * wait until it becomes cleared (do not allow new data to be pushed out,
>>> optionally force the data to be pushed out after a timeout)
>>> * write LCR register (and check if it took (and no longer loop over the
>>> LCR
>>> to see if the values stuck, in theory)).
>>> * if we never get un-busy, something is wrong?
>>>
>>> All of this btw is currently moot anyway, as the only way to get into the
>>> (else) if, is if serial8250_handle_irq returns false. And from what I can
>>> see, this is only if iir == UART_IIR_NO_IRQ, in which case we never ever
>>> clear the USR and thus never ever clear the UART_IIR_BUSY flag.
>
> especially this point is important I suppose, hence the need for the
> workaround [1].

I'm not clear on what question you are actually asking here.

The original code didn't work since multiple register writes may occur
after the initial failed LCR write while interrupts are still disabled.

Thus the driver must read back LCR to ensure writes are going through
immediately rather than rely upon an interrupt that can be delayed.

>
> Olliver
>
>>>
>>> Olliver
>>>
>>>
>>> [0]
>>>
>>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760
>>>
>>> [1]
>>>
>>> https://github.com/torvalds/linux/commit/c49436b657d0a56a6ad90d14a7c3041add7cf64d
>>>
>>> [2]
>>>
>>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760#diff-d6e619fc4c51febf7880632fde5d0208R63
>>>
>>> [3] http://linux-sunxi.org/images/d/d2/Dw_apb_uart_db.pdf
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "linux-sunxi" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to linux-sunxi+unsubscribe@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-03 14:22     ` Tim Kryger
@ 2017-05-03 15:40       ` Olliver Schinagl
  2017-05-04  3:51         ` Tim Kryger
  0 siblings, 1 reply; 7+ messages in thread
From: Olliver Schinagl @ 2017-05-03 15:40 UTC (permalink / raw)
  To: Tim Kryger
  Cc: Chen-Yu Tsai, Jamie Iles, Tim Kryger, linux-kernel, dev, Maxime Ripard

Hey Tim,

On 03-05-17 16:22, Tim Kryger wrote:
> On Wed, May 3, 2017 at 4:26 AM, Olliver Schinagl <oliver@schinagl.nl> wrote:
>> Hey Chen-Yu
>>
>> On 03-05-17 12:40, Chen-Yu Tsai wrote:
>>>
>>> On Wed, May 3, 2017 at 6:17 PM, Olliver Schinagl <oliver@schinagl.nl>
>>> wrote:
>>>>
>>>> Hey Jamie,
>>>>
>>>> Several years ago you wrote the glue-code [0] for the DW 8250 IP. Over
>>>> the
>>>> years various 'fixes' have been applied to resolve certain 'weird'
>>>> problems
>>>> that Tim tried to fix with [1].
>>>>
>>>> After going over the datasheets and code with a comb several times now, I
>>>> think I may have found one (of a few others) reasons and would like both
>>>> your and Tim's thoughts here.
>>>>
>>>> The current (and original) code [2] uses the register offset 0x1f for the
>>>> UART_USR register.
>>>>
>>>> I searched far and wide, various datasheet of physical uarts (8250 -
>>>> 16950)
>>>> and some IP cores and none seem to have the USR (and specifically the
>>>> USR[0]
>>>> bit) register, so it seems to be specific to the DW_apb_uart. However
>>>> looking at the only databook available to me [3] of the UART IP, I cannot
>>>> seem to find anything at register offset 0x1f.
>>>>
>>>> The platform I'm using uses the Allwinner A20 SoC, which also features
>>>> the
>>>> DW uart IP and also here, there is nothing at offset 0x1f.
>>>>
>>>> The intended register IS however actually at 0x7c.
>>>>
>>>> My question is thus twofold.
>>>>
>>>> Why was 0x1f used? Is this specific to a certain (version) UART or is
>>>> this a
>>>> long uncaught typo.
>>>
>>>
>>> The original 8250 datasheets have registers at byte offsets. Note that the
>>> registers are only 8 bits wide.
>>
>> Yes, I did account for that difference :) Or rather, it should be the first
>> register after the scratchpad, but there is nothing after the scratchpad on
>> the 8250's.
>>>
>>>
>>> On Allwinner, and many other platforms, the registers are still 8 bits
>>> wide, but are 32bit-aligned, likely for aligned bus access. 0x7c = 0x1f <<
>>> 2.
>>> The 8250 core accessor functions are supposed to handle this for you.
>>
>> Actually, they are 32 bit registers, but only the lowest 8 are used, to
>> remain code-compatible with true 8 bitters. The end result, is the same of
>> course.
>>
>> As for the shift, I see now!
>>
>> readb(p->membase + (offset << p->regshift));
>>
>> And indeed, the regular defines are all indeed 8 bit offsets. Oversight on
>> my part, and I changed the comment to make this slightly more clear, which
>> goes into my greater uart series.
>>
>> Thanks Chen-Yu for pointing this oversight of mine out!
>>
>>>
>>> ChenYu
>>>
>>>> Tim, assuming it is a typo, could this the cause which made you implement
>>>> [1]? From what I understand, you keep writing the LCR until it takes.
>>
>> So with ChenYu's explanation, the USR register holds a valid status, but it
>> was and is never checked and thus the below part is still a valid question?
>>
>>>>
>>>> Initially, the UART_IIR_BUSY check looked like this:
>>>>         if (serial8250_handle_irq(p, iir)) {
>>>>                 return 1;
>>>>         } else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
>>>>                 /* Clear the USR and write the LCR again. */
>>>>                 (void)p->serial_in(p, d->usr_reg);
>>>>                 p->serial_out(p, UART_LCR, d->last_lcr);
>>>>
>>>>                 return 1;
>>>>         }
>>>>
>>>> what I'm missing here is, that if UART_IIR_BUSY is set, we have to:
>>>> * check the d->usr_reg (UART_USR) bit 0
>>>> * wait until it becomes cleared (do not allow new data to be pushed out,
>>>> optionally force the data to be pushed out after a timeout)
>>>> * write LCR register (and check if it took (and no longer loop over the
>>>> LCR
>>>> to see if the values stuck, in theory)).
>>>> * if we never get un-busy, something is wrong?
>>>>
>>>> All of this btw is currently moot anyway, as the only way to get into the
>>>> (else) if, is if serial8250_handle_irq returns false. And from what I can
>>>> see, this is only if iir == UART_IIR_NO_IRQ, in which case we never ever
>>>> clear the USR and thus never ever clear the UART_IIR_BUSY flag.
>>
>> especially this point is important I suppose, hence the need for the
>> workaround [1].
>
> I'm not clear on what question you are actually asking here.
>
> The original code didn't work since multiple register writes may occur
> after the initial failed LCR write while interrupts are still disabled.
>
> Thus the driver must read back LCR to ensure writes are going through
> immediately rather than rely upon an interrupt that can be delayed.

Ok, so as far as I understand (from the datasheet) the intended way to 
do this would be to check for the BUSY IRQ & USR[0] IRQ and if it is 
busy, (re-write) the LCR. We no longer do this because it did not work 
due to the delayed interrupt.

So one question is, why are we not checking the USR[0] reg whilst doing 
the loop? Is that register not there for exactly that purpose? But I 
guess brute-forcing it works more reliable I suppose then.

But secondly, when is the UART_IIR_BUSY interrupt handled? And it not 
being handled, could that be the actual reason things where failing? (Or 
as I read somewhere a silicon bug?)

Right now, we have serial8250_handle_irq() and if that returns 0, we 
check if we (still) have an unhandled UART_IIR_BUSY interrupt.
But the only way for serial8250_handle_irq() to return 0, is if it has 
set UART_NO_INT.

If we do not have an IRQ, i'm pretty sure, UART_IIR_BUSY can't be 
triggered right? And if it IS the interrupt that caused us to go into 
the interrupt handler, well, handle_irq does its thing and then returns 
1 for finishing it, which in turn causes the UART_IIR_BUSY check to be 
skipped.

So we never clear the UART_IIR_BUSY interrupt if we manage to trigger 
that. (Please do correct me if I'm wrong.


I'm changing things in the interrupt handler a bit now to first check 
for the busy interrupt first and if that is triggered do the dummy 
return (clear it) and return (since LCR is handled alternativly.

Olliver
>
>>
>> Olliver
>>
>>>>
>>>> Olliver
>>>>
>>>>
>>>> [0]
>>>>
>>>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760
>>>>
>>>> [1]
>>>>
>>>> https://github.com/torvalds/linux/commit/c49436b657d0a56a6ad90d14a7c3041add7cf64d
>>>>
>>>> [2]
>>>>
>>>> https://github.com/torvalds/linux/commit/6b1a98d1c4851235d9b6764b3f7b7db7909fc760#diff-d6e619fc4c51febf7880632fde5d0208R63
>>>>
>>>> [3] http://linux-sunxi.org/images/d/d2/Dw_apb_uart_db.pdf
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "linux-sunxi" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>> email to linux-sunxi+unsubscribe@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-03 15:40       ` Olliver Schinagl
@ 2017-05-04  3:51         ` Tim Kryger
  2017-05-04  8:35           ` Olliver Schinagl
  0 siblings, 1 reply; 7+ messages in thread
From: Tim Kryger @ 2017-05-04  3:51 UTC (permalink / raw)
  To: Olliver Schinagl
  Cc: Chen-Yu Tsai, Jamie Iles, Tim Kryger, linux-kernel, dev, Maxime Ripard

On Wed, May 3, 2017 at 8:40 AM, Olliver Schinagl <oliver@schinagl.nl> wrote:
> Hey Tim,

>
> Ok, so as far as I understand (from the datasheet) the intended way to do
> this would be to check for the BUSY IRQ & USR[0] IRQ and if it is busy,
> (re-write) the LCR. We no longer do this because it did not work due to the
> delayed interrupt.
>
> So one question is, why are we not checking the USR[0] reg whilst doing the
> loop? Is that register not there for exactly that purpose? But I guess
> brute-forcing it works more reliable I suppose then.

That status bit isn't particularly helpful since even if it reports
the UART is idle, there is no guarantee it won't become busy before
the attempted write of the LCR happens.

> But secondly, when is the UART_IIR_BUSY interrupt handled? And it not being
> handled, could that be the actual reason things where failing? (Or as I read
> somewhere a silicon bug?)
>
> Right now, we have serial8250_handle_irq() and if that returns 0, we check
> if we (still) have an unhandled UART_IIR_BUSY interrupt.
> But the only way for serial8250_handle_irq() to return 0, is if it has set
> UART_NO_INT.
>
> If we do not have an IRQ, i'm pretty sure, UART_IIR_BUSY can't be triggered
> right? And if it IS the interrupt that caused us to go into the interrupt
> handler, well, handle_irq does its thing and then returns 1 for finishing
> it, which in turn causes the UART_IIR_BUSY check to be skipped.
>
> So we never clear the UART_IIR_BUSY interrupt if we manage to trigger that.
> (Please do correct me if I'm wrong.

Note that the LSB is set in each of the following defines.

#define UART_IIR_NO_INT         0x01 /* No interrupts pending */
#define UART_IIR_BUSY           0x07 /* DesignWare APB Busy Detect */

Thus, when iir is UART_IIR_BUSY, serial8250_handle_irq bails out and
returns zero such that the interrupt gets cleared by the read of USR
in dw8250_handle_irq.

> I'm changing things in the interrupt handler a bit now to first check for
> the busy interrupt first and if that is triggered do the dummy return (clear
> it) and return (since LCR is handled alternativly.
>
> Olliver

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [linux-sunxi] Designware UART bug
  2017-05-04  3:51         ` Tim Kryger
@ 2017-05-04  8:35           ` Olliver Schinagl
  0 siblings, 0 replies; 7+ messages in thread
From: Olliver Schinagl @ 2017-05-04  8:35 UTC (permalink / raw)
  To: Tim Kryger
  Cc: Chen-Yu Tsai, Jamie Iles, Tim Kryger, linux-kernel, dev, Maxime Ripard

Hey Tim,

On 04-05-17 05:51, Tim Kryger wrote:
> On Wed, May 3, 2017 at 8:40 AM, Olliver Schinagl <oliver@schinagl.nl> wrote:
>> Hey Tim,
>
>>
>> Ok, so as far as I understand (from the datasheet) the intended way to do
>> this would be to check for the BUSY IRQ & USR[0] IRQ and if it is busy,
>> (re-write) the LCR. We no longer do this because it did not work due to the
>> delayed interrupt.
>>
>> So one question is, why are we not checking the USR[0] reg whilst doing the
>> loop? Is that register not there for exactly that purpose? But I guess
>> brute-forcing it works more reliable I suppose then.
>
> That status bit isn't particularly helpful since even if it reports
> the UART is idle, there is no guarantee it won't become busy before
> the attempted write of the LCR happens.

I was thinking however of adding this to check_lcr:

		if (lcr & DW_UART_USR_BUSY)
			continue;

e.g. if it is busy, there is no point in trying to update the LCR as 
this will fail and will cause an interrupt again (the busy interrupt).

The rest of the forcefull logic then still holds.
>
>> But secondly, when is the UART_IIR_BUSY interrupt handled? And it not being
>> handled, could that be the actual reason things where failing? (Or as I read
>> somewhere a silicon bug?)
>>
>> Right now, we have serial8250_handle_irq() and if that returns 0, we check
>> if we (still) have an unhandled UART_IIR_BUSY interrupt.
>> But the only way for serial8250_handle_irq() to return 0, is if it has set
>> UART_NO_INT.
>>
>> If we do not have an IRQ, i'm pretty sure, UART_IIR_BUSY can't be triggered
>> right? And if it IS the interrupt that caused us to go into the interrupt
>> handler, well, handle_irq does its thing and then returns 1 for finishing
>> it, which in turn causes the UART_IIR_BUSY check to be skipped.
>>
>> So we never clear the UART_IIR_BUSY interrupt if we manage to trigger that.
>> (Please do correct me if I'm wrong.
>
> Note that the LSB is set in each of the following defines.
>
> #define UART_IIR_NO_INT         0x01 /* No interrupts pending */
> #define UART_IIR_BUSY           0x07 /* DesignWare APB Busy Detect */

Right, I am aware,
>
> Thus, when iir is UART_IIR_BUSY, serial8250_handle_irq bails out and
> returns zero such that the interrupt gets cleared by the read of USR
> in dw8250_handle_irq.

the trickery of course is here that we are treating the iir register as 
a bitfield, when really, it is not (anymore).

It just gets very confusing of course. We get an interrupt, we check 
which one it, either no interrupt, or busy.

So In the dw8250_handle_irq, we could first check if the uart is busy 
and then handle the busy case and return.

The reason while I'm looking so into this, is as I mentioned, we 
sometimes get the too much work for interrupt storm, and what I see from 
debugging, is that the busy interrupt is set, but never cleared (or 
constantly set).

Thanks for listening so far, and I will keep digging :)

>
>> I'm changing things in the interrupt handler a bit now to first check for
>> the busy interrupt first and if that is triggered do the dummy return (clear
>> it) and return (since LCR is handled alternativly.
>>
>> Olliver

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-05-04  8:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-03 10:17 Designware UART bug Olliver Schinagl
2017-05-03 10:40 ` [linux-sunxi] " Chen-Yu Tsai
2017-05-03 11:26   ` Olliver Schinagl
2017-05-03 14:22     ` Tim Kryger
2017-05-03 15:40       ` Olliver Schinagl
2017-05-04  3:51         ` Tim Kryger
2017-05-04  8:35           ` Olliver Schinagl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).