All of lore.kernel.org
 help / color / mirror / Atom feed
From: Romain Izard <romain.izard.pro@gmail.com>
To: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>,
	Michael Turquette <mturquette@baylibre.com>,
	Stephen Boyd <sboyd@codeaurora.org>,
	Ludovic Desroches <ludovic.desroches@microchip.com>,
	Jonathan Cameron <jic23@kernel.org>,
	Wenyou Yang <wenyou.yang@atmel.com>,
	Josh Wu <rainyfeeling@outlook.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Brian Norris <computersforpeace@gmail.com>,
	Marek Vasut <marek.vasut@gmail.com>,
	Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>,
	Thierry Reding <thierry.reding@gmail.com>,
	Richard Genoud <richard.genoud@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	linux-clk@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-iio@vger.kernel.org,
	linux-mtd <linux-mtd@lists.infradead.org>,
	linux-pwm@vger.kernel.org, linux-serial@vger.kernel.org,
	linux-usb@vger.kernel.org,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v1 01/10] clk: at91: pmc: Wait for clocks when resuming
Date: Thu, 14 Sep 2017 18:15:56 +0200	[thread overview]
Message-ID: <CAGkQfmPB8b38=hJrtuKJ-Hiqdfx9dt5SQ5R7EBL9cJV5_eRmXQ@mail.gmail.com> (raw)
In-Reply-To: <14fb5261-bb03-a6bd-6653-2a2c655b04ee@microchip.com>

2017-09-13 14:15 GMT+02:00 Nicolas Ferre <nicolas.ferre@microchip.com>:
> On 08/09/2017 at 17:35, Romain Izard wrote:
>> Wait for the syncronization of all clocks when resuming, not only the
>> UPLL clock. Do not use regmap_read_poll_timeout, as it will call BUG()
>> when interrupts are masked, which is the case in here.
>>
>> Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
>> ---
>>  drivers/clk/at91/pmc.c | 24 ++++++++++++++++--------
>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/clk/at91/pmc.c b/drivers/clk/at91/pmc.c
>> index 775af473fe11..5c2b26de303e 100644
>> --- a/drivers/clk/at91/pmc.c
>> +++ b/drivers/clk/at91/pmc.c
>> @@ -107,10 +107,20 @@ static int pmc_suspend(void)
>>       return 0;
>>  }
>>
>> +static bool pmc_ready(unsigned int mask)
>> +{
>> +     unsigned int status;
>> +
>> +     regmap_read(pmcreg, AT91_PMC_SR, &status);
>> +
>> +     return ((status & mask) == mask) ? 1 : 0;
>> +}
>> +
>>  static void pmc_resume(void)
>>  {
>> -     int i, ret = 0;
>> +     int i;
>>       u32 tmp;
>> +     u32 mask = AT91_PMC_MCKRDY | AT91_PMC_LOCKA;
>>
>>       regmap_read(pmcreg, AT91_PMC_MCKR, &tmp);
>>       if (pmc_cache.mckr != tmp)
>> @@ -134,13 +144,11 @@ static void pmc_resume(void)
>>                            AT91_PMC_PCR_CMD);
>>       }
>>
>> -     if (pmc_cache.uckr & AT91_PMC_UPLLEN) {
>> -             ret = regmap_read_poll_timeout(pmcreg, AT91_PMC_SR, tmp,
>> -                                            !(tmp & AT91_PMC_LOCKU),
>> -                                            10, 5000);
>> -             if (ret)
>> -                     pr_crit("USB PLL didn't lock when resuming\n");
>> -     }
>> +     if (pmc_cache.uckr & AT91_PMC_UPLLEN)
>> +             mask |= AT91_PMC_LOCKU;
>> +
>> +     while (!pmc_ready(mask))
>> +             cpu_relax();
>
> Okay, but I would prefer to keep the timeout property in it. So we may
> need to re-implement a timeout way-out here.
>

We need to have a reference clock to measure the timeout delay. If we use
the kernel's timekeeping, it relies on the clocks that we are configuring in
this code. Moreover, my experience with the mainline code is that when
something goes wrong, nothing will work. No oops or panic will be reported,
the device will just stop working.

In my case, I had obvious failures (it just stopped working unless I removed
USB wakeup or activated the console during suspend) but also very rare
failures, that occurred in the bootloader. Those issues were detected when
testing repeated suspend cycles for a night: the memory controller would
never enter the self-refresh mode during the resume sequence.

This led me to question the bootloader's code first, and I set up 4 boards
with the backup prototype code on v4.9 to verify that it was stable on
suspend. I've reached 1.5 million sleep cycles over 3 weeks without
failure, so this hinted towards the difference between the prototype and the
backup code provided for v4.12 (which contained the patch that got in
v4.13). Once I integrated this patch, I've run the v4.12 code for 2 weeks
without issue as well.

In the end, I don't want to touch this code if I do not have to, as checking
that it does not regress is really cumbersome.

-- 
Romain Izard

WARNING: multiple messages have this Message-ID (diff)
From: Romain Izard <romain.izard.pro@gmail.com>
To: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>,
	Michael Turquette <mturquette@baylibre.com>,
	Stephen Boyd <sboyd@codeaurora.org>,
	Ludovic Desroches <ludovic.desroches@microchip.com>,
	Jonathan Cameron <jic23@kernel.org>,
	Wenyou Yang <wenyou.yang@atmel.com>,
	Josh Wu <rainyfeeling@outlook.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Brian Norris <computersforpeace@gmail.com>,
	Marek Vasut <marek.vasut@gmail.com>,
	Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>,
	Thierry Reding <thierry.reding@gmail.com>,
	Richard Genoud <richard.genoud@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	linux-clk@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-iio@vger.kernel.org, linux-mtd <linux-mtd@lists.in>
Subject: Re: [PATCH v1 01/10] clk: at91: pmc: Wait for clocks when resuming
Date: Thu, 14 Sep 2017 18:15:56 +0200	[thread overview]
Message-ID: <CAGkQfmPB8b38=hJrtuKJ-Hiqdfx9dt5SQ5R7EBL9cJV5_eRmXQ@mail.gmail.com> (raw)
In-Reply-To: <14fb5261-bb03-a6bd-6653-2a2c655b04ee@microchip.com>

2017-09-13 14:15 GMT+02:00 Nicolas Ferre <nicolas.ferre@microchip.com>:
> On 08/09/2017 at 17:35, Romain Izard wrote:
>> Wait for the syncronization of all clocks when resuming, not only the
>> UPLL clock. Do not use regmap_read_poll_timeout, as it will call BUG()
>> when interrupts are masked, which is the case in here.
>>
>> Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
>> ---
>>  drivers/clk/at91/pmc.c | 24 ++++++++++++++++--------
>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/clk/at91/pmc.c b/drivers/clk/at91/pmc.c
>> index 775af473fe11..5c2b26de303e 100644
>> --- a/drivers/clk/at91/pmc.c
>> +++ b/drivers/clk/at91/pmc.c
>> @@ -107,10 +107,20 @@ static int pmc_suspend(void)
>>       return 0;
>>  }
>>
>> +static bool pmc_ready(unsigned int mask)
>> +{
>> +     unsigned int status;
>> +
>> +     regmap_read(pmcreg, AT91_PMC_SR, &status);
>> +
>> +     return ((status & mask) == mask) ? 1 : 0;
>> +}
>> +
>>  static void pmc_resume(void)
>>  {
>> -     int i, ret = 0;
>> +     int i;
>>       u32 tmp;
>> +     u32 mask = AT91_PMC_MCKRDY | AT91_PMC_LOCKA;
>>
>>       regmap_read(pmcreg, AT91_PMC_MCKR, &tmp);
>>       if (pmc_cache.mckr != tmp)
>> @@ -134,13 +144,11 @@ static void pmc_resume(void)
>>                            AT91_PMC_PCR_CMD);
>>       }
>>
>> -     if (pmc_cache.uckr & AT91_PMC_UPLLEN) {
>> -             ret = regmap_read_poll_timeout(pmcreg, AT91_PMC_SR, tmp,
>> -                                            !(tmp & AT91_PMC_LOCKU),
>> -                                            10, 5000);
>> -             if (ret)
>> -                     pr_crit("USB PLL didn't lock when resuming\n");
>> -     }
>> +     if (pmc_cache.uckr & AT91_PMC_UPLLEN)
>> +             mask |= AT91_PMC_LOCKU;
>> +
>> +     while (!pmc_ready(mask))
>> +             cpu_relax();
>
> Okay, but I would prefer to keep the timeout property in it. So we may
> need to re-implement a timeout way-out here.
>

We need to have a reference clock to measure the timeout delay. If we use
the kernel's timekeeping, it relies on the clocks that we are configuring in
this code. Moreover, my experience with the mainline code is that when
something goes wrong, nothing will work. No oops or panic will be reported,
the device will just stop working.

In my case, I had obvious failures (it just stopped working unless I removed
USB wakeup or activated the console during suspend) but also very rare
failures, that occurred in the bootloader. Those issues were detected when
testing repeated suspend cycles for a night: the memory controller would
never enter the self-refresh mode during the resume sequence.

This led me to question the bootloader's code first, and I set up 4 boards
with the backup prototype code on v4.9 to verify that it was stable on
suspend. I've reached 1.5 million sleep cycles over 3 weeks without
failure, so this hinted towards the difference between the prototype and the
backup code provided for v4.12 (which contained the patch that got in
v4.13). Once I integrated this patch, I've run the v4.12 code for 2 weeks
without issue as well.

In the end, I don't want to touch this code if I do not have to, as checking
that it does not regress is really cumbersome.

-- 
Romain Izard

WARNING: multiple messages have this Message-ID (diff)
From: romain.izard.pro@gmail.com (Romain Izard)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v1 01/10] clk: at91: pmc: Wait for clocks when resuming
Date: Thu, 14 Sep 2017 18:15:56 +0200	[thread overview]
Message-ID: <CAGkQfmPB8b38=hJrtuKJ-Hiqdfx9dt5SQ5R7EBL9cJV5_eRmXQ@mail.gmail.com> (raw)
In-Reply-To: <14fb5261-bb03-a6bd-6653-2a2c655b04ee@microchip.com>

2017-09-13 14:15 GMT+02:00 Nicolas Ferre <nicolas.ferre@microchip.com>:
> On 08/09/2017 at 17:35, Romain Izard wrote:
>> Wait for the syncronization of all clocks when resuming, not only the
>> UPLL clock. Do not use regmap_read_poll_timeout, as it will call BUG()
>> when interrupts are masked, which is the case in here.
>>
>> Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
>> ---
>>  drivers/clk/at91/pmc.c | 24 ++++++++++++++++--------
>>  1 file changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/clk/at91/pmc.c b/drivers/clk/at91/pmc.c
>> index 775af473fe11..5c2b26de303e 100644
>> --- a/drivers/clk/at91/pmc.c
>> +++ b/drivers/clk/at91/pmc.c
>> @@ -107,10 +107,20 @@ static int pmc_suspend(void)
>>       return 0;
>>  }
>>
>> +static bool pmc_ready(unsigned int mask)
>> +{
>> +     unsigned int status;
>> +
>> +     regmap_read(pmcreg, AT91_PMC_SR, &status);
>> +
>> +     return ((status & mask) == mask) ? 1 : 0;
>> +}
>> +
>>  static void pmc_resume(void)
>>  {
>> -     int i, ret = 0;
>> +     int i;
>>       u32 tmp;
>> +     u32 mask = AT91_PMC_MCKRDY | AT91_PMC_LOCKA;
>>
>>       regmap_read(pmcreg, AT91_PMC_MCKR, &tmp);
>>       if (pmc_cache.mckr != tmp)
>> @@ -134,13 +144,11 @@ static void pmc_resume(void)
>>                            AT91_PMC_PCR_CMD);
>>       }
>>
>> -     if (pmc_cache.uckr & AT91_PMC_UPLLEN) {
>> -             ret = regmap_read_poll_timeout(pmcreg, AT91_PMC_SR, tmp,
>> -                                            !(tmp & AT91_PMC_LOCKU),
>> -                                            10, 5000);
>> -             if (ret)
>> -                     pr_crit("USB PLL didn't lock when resuming\n");
>> -     }
>> +     if (pmc_cache.uckr & AT91_PMC_UPLLEN)
>> +             mask |= AT91_PMC_LOCKU;
>> +
>> +     while (!pmc_ready(mask))
>> +             cpu_relax();
>
> Okay, but I would prefer to keep the timeout property in it. So we may
> need to re-implement a timeout way-out here.
>

We need to have a reference clock to measure the timeout delay. If we use
the kernel's timekeeping, it relies on the clocks that we are configuring in
this code. Moreover, my experience with the mainline code is that when
something goes wrong, nothing will work. No oops or panic will be reported,
the device will just stop working.

In my case, I had obvious failures (it just stopped working unless I removed
USB wakeup or activated the console during suspend) but also very rare
failures, that occurred in the bootloader. Those issues were detected when
testing repeated suspend cycles for a night: the memory controller would
never enter the self-refresh mode during the resume sequence.

This led me to question the bootloader's code first, and I set up 4 boards
with the backup prototype code on v4.9 to verify that it was stable on
suspend. I've reached 1.5 million sleep cycles over 3 weeks without
failure, so this hinted towards the difference between the prototype and the
backup code provided for v4.12 (which contained the patch that got in
v4.13). Once I integrated this patch, I've run the v4.12 code for 2 weeks
without issue as well.

In the end, I don't want to touch this code if I do not have to, as checking
that it does not regress is really cumbersome.

-- 
Romain Izard

  reply	other threads:[~2017-09-14 16:16 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-08 15:35 [PATCH v1 00/10] Various patches for SAMA5D2 backup mode Romain Izard
2017-09-08 15:35 ` Romain Izard
2017-09-08 15:35 ` [PATCH v1 01/10] clk: at91: pmc: Wait for clocks when resuming Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-13 12:15   ` Nicolas Ferre
2017-09-13 12:15     ` Nicolas Ferre
2017-09-13 12:15     ` Nicolas Ferre
2017-09-14 16:15     ` Romain Izard [this message]
2017-09-14 16:15       ` Romain Izard
2017-09-14 16:15       ` Romain Izard
2017-09-22 12:12       ` Nicolas Ferre
2017-09-22 12:12         ` Nicolas Ferre
2017-09-22 12:12         ` Nicolas Ferre
2017-09-08 15:35 ` [PATCH v1 02/10] clk: at91: pmc: Save SCSR during suspend Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-13 12:10   ` Nicolas Ferre
2017-09-13 12:10     ` Nicolas Ferre
2017-09-13 12:10     ` Nicolas Ferre
2017-09-08 15:35 ` [PATCH v1 03/10] clk: at91: pmc: Support backup for programmable clocks Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-13 12:29   ` Nicolas Ferre
2017-09-13 12:29     ` Nicolas Ferre
2017-09-13 12:29     ` Nicolas Ferre
2017-09-13 17:03     ` Alexandre Belloni
2017-09-13 17:03       ` Alexandre Belloni
2017-09-13 17:03       ` Alexandre Belloni
2017-09-14  7:41       ` romain izard
2017-09-14  7:41         ` romain izard
2017-09-14  7:41         ` romain izard
2017-09-14  9:38         ` Nicolas Ferre
2017-09-14  9:38           ` Nicolas Ferre
2017-09-14  9:38           ` Nicolas Ferre
2017-09-08 15:35 ` [PATCH v1 04/10] mtd: nand: atmel: Avoid ECC errors when leaving backup mode Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-08 15:35 ` [PATCH v1 05/10] mtd: nand: atmel: Report PMECC failures as errors Romain Izard
2017-09-08 15:35   ` Romain Izard
2017-09-08 15:36 ` [PATCH v1 06/10] ehci-atmel: Power down during suspend is normal Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 15:36 ` [PATCH v1 07/10] iio:adc:at91-sama5d2: Support backup mode Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 16:03   ` Nicolas Ferre
2017-09-08 16:03     ` Nicolas Ferre
2017-09-08 16:03     ` Nicolas Ferre
2017-09-08 16:21     ` Romain Izard
2017-09-08 16:21       ` Romain Izard
2017-09-08 16:21       ` Romain Izard
2017-09-08 15:36 ` [PATCH v1 08/10] pwm: atmel-tcb: " Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 15:36 ` [PATCH v1 09/10] atmel_flexcom: " Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 15:36 ` [PATCH v1 10/10] tty/serial: atmel: Prevent a warning on suspend Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-08 15:36   ` Romain Izard
2017-09-11  9:52   ` Romain Izard
2017-09-11  9:52     ` Romain Izard
2017-09-11  9:52     ` Romain Izard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGkQfmPB8b38=hJrtuKJ-Hiqdfx9dt5SQ5R7EBL9cJV5_eRmXQ@mail.gmail.com' \
    --to=romain.izard.pro@gmail.com \
    --cc=boris.brezillon@free-electrons.com \
    --cc=computersforpeace@gmail.com \
    --cc=cyrille.pitchen@wedev4u.fr \
    --cc=dwmw2@infradead.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jic23@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-clk@vger.kernel.org \
    --cc=linux-iio@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-pwm@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=ludovic.desroches@microchip.com \
    --cc=marek.vasut@gmail.com \
    --cc=mturquette@baylibre.com \
    --cc=nicolas.ferre@microchip.com \
    --cc=rainyfeeling@outlook.com \
    --cc=richard.genoud@gmail.com \
    --cc=sboyd@codeaurora.org \
    --cc=stern@rowland.harvard.edu \
    --cc=thierry.reding@gmail.com \
    --cc=wenyou.yang@atmel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.