All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/1] Fix for U-Boot SPL hang on sunxi H6 due to incorrect ram size detection
@ 2023-10-01 16:13 Gunjan Gupta
  2023-10-01 16:13 ` [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards Gunjan Gupta
  0 siblings, 1 reply; 10+ messages in thread
From: Gunjan Gupta @ 2023-10-01 16:13 UTC (permalink / raw)
  To: u-boot
  Cc: Jernej Skrabec, Ondrej Jirman, Gunjan Gupta, Andre Przywara, Jagan Teki


On Orange Pi 3 LTS, sometimes on reboot, SPL detects ram as 4 GB instead
of 2 GB. This leads to a hang later and board fails to boot. Fix is
tested by leaving the device in a reboot loop overnight where device
successfully rebooted 1350+ times by the time I stopped rebooting it.


Gunjan Gupta (1):
  sunxi: dram: Fix incorrect ram size detection for some H6 boards

 arch/arm/mach-sunxi/dram_helpers.c   | 1 +
 arch/arm/mach-sunxi/dram_sun50i_h6.c | 2 ++
 2 files changed, 3 insertions(+)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-01 16:13 [PATCH 0/1] Fix for U-Boot SPL hang on sunxi H6 due to incorrect ram size detection Gunjan Gupta
@ 2023-10-01 16:13 ` Gunjan Gupta
  2023-10-02 11:26   ` Andre Przywara
  0 siblings, 1 reply; 10+ messages in thread
From: Gunjan Gupta @ 2023-10-01 16:13 UTC (permalink / raw)
  To: u-boot
  Cc: Jernej Skrabec, Ondrej Jirman, Gunjan Gupta, Andre Przywara, Jagan Teki

On some H6 boards like Orange Pi 3 LTS, some times U-Boot fails to detect
ram size correctly. Instead of 2GB thats available, it detects 4GB of ram
and then SPL just hangs there making board not to boot further.

On debugging, I found that the rows value were being determined correctly,
but columns were sometimes off by one value. I found that adding some
delay after the mctl_core_init call along with making use of dsb in the
start of the mctl_mem_matches solves the issue.

Signed-off-by: Gunjan Gupta <viraniac@gmail.com>
---

 arch/arm/mach-sunxi/dram_helpers.c   | 1 +
 arch/arm/mach-sunxi/dram_sun50i_h6.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/arm/mach-sunxi/dram_helpers.c b/arch/arm/mach-sunxi/dram_helpers.c
index cdf2750f1c..5758c58e07 100644
--- a/arch/arm/mach-sunxi/dram_helpers.c
+++ b/arch/arm/mach-sunxi/dram_helpers.c
@@ -32,6 +32,7 @@ void mctl_await_completion(u32 *reg, u32 mask, u32 val)
 #ifndef CONFIG_MACH_SUNIV
 bool mctl_mem_matches(u32 offset)
 {
+	dsb();
 	/* Try to write different values to RAM at two addresses */
 	writel(0, CFG_SYS_SDRAM_BASE);
 	writel(0xaa55aa55, (ulong)CFG_SYS_SDRAM_BASE + offset);
diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
index bff2e42513..a031a845f5 100644
--- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
+++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
@@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
 	para->cols = 11;
 	mctl_core_init(para);
 
+	udelay(50);
+
 	for (para->cols = 8; para->cols < 11; para->cols++) {
 		/* 8 bits per byte and 16/32 bit width */
 		if (mctl_mem_matches(1 << (para->cols + 1 +
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-01 16:13 ` [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards Gunjan Gupta
@ 2023-10-02 11:26   ` Andre Przywara
  2023-10-02 12:42     ` Gunjan Gupta
  2023-10-02 18:50     ` Jernej Škrabec
  0 siblings, 2 replies; 10+ messages in thread
From: Andre Przywara @ 2023-10-02 11:26 UTC (permalink / raw)
  To: Gunjan Gupta; +Cc: u-boot, Jernej Skrabec, Ondrej Jirman, Jagan Teki

On Sun,  1 Oct 2023 21:43:32 +0530
Gunjan Gupta <viraniac@gmail.com> wrote:

(fixing Jernej's email) 

Hi Gunjan,

thanks for sending a patch!
 
> On some H6 boards like Orange Pi 3 LTS, some times U-Boot fails to detect
> ram size correctly. Instead of 2GB thats available, it detects 4GB of ram
> and then SPL just hangs there making board not to boot further.
> 
> On debugging, I found that the rows value were being determined correctly,
> but columns were sometimes off by one value. I found that adding some
> delay after the mctl_core_init call along with making use of dsb in the
> start of the mctl_mem_matches solves the issue.
> 
> Signed-off-by: Gunjan Gupta <viraniac@gmail.com>
> ---
> 
>  arch/arm/mach-sunxi/dram_helpers.c   | 1 +
>  arch/arm/mach-sunxi/dram_sun50i_h6.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/mach-sunxi/dram_helpers.c b/arch/arm/mach-sunxi/dram_helpers.c
> index cdf2750f1c..5758c58e07 100644
> --- a/arch/arm/mach-sunxi/dram_helpers.c
> +++ b/arch/arm/mach-sunxi/dram_helpers.c
> @@ -32,6 +32,7 @@ void mctl_await_completion(u32 *reg, u32 mask, u32 val)
>  #ifndef CONFIG_MACH_SUNIV
>  bool mctl_mem_matches(u32 offset)
>  {
> +	dsb();

This looks a bit odd, do you have an explanation for that? And are you
sure that is really needed?
I understand why we need the DSB after the writel's below, but before that?
The only thing I could think of is that we are missing a barrier in
mctl_core_init() - which is the function called before mctl_mem_matches().
Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
the mctl_core_init() call (where you add the udelay() below)? And I wonder
if a dmb() would already be sufficient? I noticed recently that the
clr/setbit_le32() functions don't have a barrier at all, maybe that should
be fixed instead?

>  	/* Try to write different values to RAM at two addresses */
>  	writel(0, CFG_SYS_SDRAM_BASE);
>  	writel(0xaa55aa55, (ulong)CFG_SYS_SDRAM_BASE + offset);
> diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> index bff2e42513..a031a845f5 100644
> --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
>  	para->cols = 11;
>  	mctl_core_init(para);
>  
> +	udelay(50);

The location of that udelay() looks a bit odd, any chance that belongs at
the end of mctl_channel_init() instead? And how did you come up with that
and the value of 50? Just pure experimentation? I think the original BSP
DRAM code has plenty of delay calls, so we might have missed this one
here, but I would love to see some explanation here.

Cheers,
Andre

>  	for (para->cols = 8; para->cols < 11; para->cols++) {
>  		/* 8 bits per byte and 16/32 bit width */
>  		if (mctl_mem_matches(1 << (para->cols + 1 +


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-02 11:26   ` Andre Przywara
@ 2023-10-02 12:42     ` Gunjan Gupta
  2023-10-02 18:59       ` Jernej Škrabec
  2023-10-02 18:50     ` Jernej Škrabec
  1 sibling, 1 reply; 10+ messages in thread
From: Gunjan Gupta @ 2023-10-02 12:42 UTC (permalink / raw)
  To: Andre Przywara; +Cc: u-boot, Jernej Skrabec, Ondrej Jirman, Jagan Teki

> >  bool mctl_mem_matches(u32 offset)
> >  {
> > +     dsb();
> This looks a bit odd, do you have an explanation for that? And are you
> sure that is really needed?
> I understand why we need the DSB after the writel's below, but before that?

I started with Ondrej Jirman's patch from LibreELEC's tree that had a
dsb call added
after the first writel call. That reduced the frequency of the errors
but didn't removed
it completely. My reason for moving it before the writel was to make
sure any memory
access is completed before starting the actual logic for the test.
That reduced the
frequency even further, but didn't resolve the issue. I did try
removing it leaving only
udelay added to the code, but that brings back the issue.

> The only thing I could think of is that we are missing a barrier in
> mctl_core_init() - which is the function called before mctl_mem_matches().
> Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
> the mctl_core_init() call (where you add the udelay() below)? And I wonder
> if a dmb() would already be sufficient?

Sure, I will try experimenting with it.

> I noticed recently that the clr/setbit_le32() functions don't have a barrier at all, maybe
> that should be fixed instead?

I haven't done much of the low level programming myself. Mostly have
done user space
programming along with fixing minor kernel module compilation issues
due to kernel api
changes. So I wasn't sure what all places to debug. But if you point
me to places with
things to try, I surely can give time playing around testing the proposed fixes.

> > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
> >       para->cols = 11;
> >       mctl_core_init(para);
> >
> > +     udelay(50);
> The location of that udelay() looks a bit odd, any chance that belongs at
> the end of mctl_channel_init() instead? And how did you come up with that
> and the value of 50? Just pure experimentation?

Before adding the udelay, I added 7 print statements to print all the
members of the para
struct. That itself solved the issue along with the dsb added to the
top of the mctl_mem_matches
function. This is what gave me the clue that a delay is needed there.
The value of 50 is
indeed from pure experimentation

Thanks & Regards
Gunjan Gupta

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-02 11:26   ` Andre Przywara
  2023-10-02 12:42     ` Gunjan Gupta
@ 2023-10-02 18:50     ` Jernej Škrabec
  2023-10-03  9:48       ` Andre Przywara
  1 sibling, 1 reply; 10+ messages in thread
From: Jernej Škrabec @ 2023-10-02 18:50 UTC (permalink / raw)
  To: Gunjan Gupta, Andre Przywara; +Cc: u-boot, Ondrej Jirman, Jagan Teki

Dne ponedeljek, 02. oktober 2023 ob 13:26:26 CEST je Andre Przywara napisal(a):
> On Sun,  1 Oct 2023 21:43:32 +0530
> Gunjan Gupta <viraniac@gmail.com> wrote:
> 
> (fixing Jernej's email) 
> 
> Hi Gunjan,
> 
> thanks for sending a patch!
>  
> > On some H6 boards like Orange Pi 3 LTS, some times U-Boot fails to detect
> > ram size correctly. Instead of 2GB thats available, it detects 4GB of ram
> > and then SPL just hangs there making board not to boot further.
> > 
> > On debugging, I found that the rows value were being determined correctly,
> > but columns were sometimes off by one value. I found that adding some
> > delay after the mctl_core_init call along with making use of dsb in the
> > start of the mctl_mem_matches solves the issue.
> > 
> > Signed-off-by: Gunjan Gupta <viraniac@gmail.com>
> > ---
> > 
> >  arch/arm/mach-sunxi/dram_helpers.c   | 1 +
> >  arch/arm/mach-sunxi/dram_sun50i_h6.c | 2 ++
> >  2 files changed, 3 insertions(+)
> > 
> > diff --git a/arch/arm/mach-sunxi/dram_helpers.c b/arch/arm/mach-sunxi/dram_helpers.c
> > index cdf2750f1c..5758c58e07 100644
> > --- a/arch/arm/mach-sunxi/dram_helpers.c
> > +++ b/arch/arm/mach-sunxi/dram_helpers.c
> > @@ -32,6 +32,7 @@ void mctl_await_completion(u32 *reg, u32 mask, u32 val)
> >  #ifndef CONFIG_MACH_SUNIV
> >  bool mctl_mem_matches(u32 offset)
> >  {
> > +	dsb();
> 
> This looks a bit odd, do you have an explanation for that? And are you
> sure that is really needed?
> I understand why we need the DSB after the writel's below, but before that?
> The only thing I could think of is that we are missing a barrier in
> mctl_core_init() - which is the function called before mctl_mem_matches().
> Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
> the mctl_core_init() call (where you add the udelay() below)? And I wonder
> if a dmb() would already be sufficient? I noticed recently that the
> clr/setbit_le32() functions don't have a barrier at all, maybe that should
> be fixed instead?

Looking at original BSP DRAM code, there is no data barriers that I can find.
Cache shouldn't be a thing before DRAM is initialized, right? Conversely,
I suggest adding memory barriers before each udelay(), as it is there for a
reason.

> 
> >  	/* Try to write different values to RAM at two addresses */
> >  	writel(0, CFG_SYS_SDRAM_BASE);
> >  	writel(0xaa55aa55, (ulong)CFG_SYS_SDRAM_BASE + offset);
> > diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > index bff2e42513..a031a845f5 100644
> > --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
> >  	para->cols = 11;
> >  	mctl_core_init(para);
> >  
> > +	udelay(50);
> 
> The location of that udelay() looks a bit odd, any chance that belongs at
> the end of mctl_channel_init() instead? And how did you come up with that
> and the value of 50? Just pure experimentation? I think the original BSP
> DRAM code has plenty of delay calls, so we might have missed this one
> here, but I would love to see some explanation here.

I checked original driver and we have *almost* all delays. There are two
missing. Both in mctl_phy_pir_init(). One before mctl_await_completion() and
another after it. Both are 1 us long.

Maybe that solves it (in combination with data barriers)?

Best regards,
Jernej

> 
> Cheers,
> Andre
> 
> >  	for (para->cols = 8; para->cols < 11; para->cols++) {
> >  		/* 8 bits per byte and 16/32 bit width */
> >  		if (mctl_mem_matches(1 << (para->cols + 1 +
> 
> 





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-02 12:42     ` Gunjan Gupta
@ 2023-10-02 18:59       ` Jernej Škrabec
  2023-10-20 23:38         ` Andre Przywara
  0 siblings, 1 reply; 10+ messages in thread
From: Jernej Škrabec @ 2023-10-02 18:59 UTC (permalink / raw)
  To: Andre Przywara, Gunjan Gupta; +Cc: u-boot, Ondrej Jirman, Jagan Teki

Dne ponedeljek, 02. oktober 2023 ob 14:42:40 CEST je Gunjan Gupta napisal(a):
> > >  bool mctl_mem_matches(u32 offset)
> > >  {
> > > +     dsb();
> > This looks a bit odd, do you have an explanation for that? And are you
> > sure that is really needed?
> > I understand why we need the DSB after the writel's below, but before that?
> 
> I started with Ondrej Jirman's patch from LibreELEC's tree that had a
> dsb call added
> after the first writel call. That reduced the frequency of the errors
> but didn't removed
> it completely. My reason for moving it before the writel was to make
> sure any memory
> access is completed before starting the actual logic for the test.
> That reduced the
> frequency even further, but didn't resolve the issue. I did try
> removing it leaving only
> udelay added to the code, but that brings back the issue.
> 
> > The only thing I could think of is that we are missing a barrier in
> > mctl_core_init() - which is the function called before mctl_mem_matches().
> > Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
> > the mctl_core_init() call (where you add the udelay() below)? And I wonder
> > if a dmb() would already be sufficient?
> 
> Sure, I will try experimenting with it.
> 
> > I noticed recently that the clr/setbit_le32() functions don't have a barrier at all, maybe
> > that should be fixed instead?
> 
> I haven't done much of the low level programming myself. Mostly have
> done user space
> programming along with fixing minor kernel module compilation issues
> due to kernel api
> changes. So I wasn't sure what all places to debug. But if you point
> me to places with
> things to try, I surely can give time playing around testing the proposed fixes.
> 
> > > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
> > >       para->cols = 11;
> > >       mctl_core_init(para);
> > >
> > > +     udelay(50);
> > The location of that udelay() looks a bit odd, any chance that belongs at
> > the end of mctl_channel_init() instead? And how did you come up with that
> > and the value of 50? Just pure experimentation?
> 
> Before adding the udelay, I added 7 print statements to print all the
> members of the para
> struct. That itself solved the issue along with the dsb added to the
> top of the mctl_mem_matches
> function. This is what gave me the clue that a delay is needed there.
> The value of 50 is
> indeed from pure experimentation

Oh, I found one major difference between BSP and mainline driver. Please test
patch attached below. I don't know if this path is always taken when wrong
configuration is tested or not.

Best regards,
Jernej

--- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
+++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
@@ -420,6 +420,7 @@ static bool mctl_channel_init(struct dram_para *para)
                        (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
        struct sunxi_mctl_phy_reg * const mctl_phy =
                        (struct sunxi_mctl_phy_reg *)SUNXI_DRAM_PHY0_BASE;
+       bool ret = true;
        int i;
        u32 val;
 
@@ -537,7 +538,7 @@ static bool mctl_channel_init(struct dram_para *para)
                        debug("DRAM PHY DX%dRSR0 = %x\n", i, readl(&mctl_phy->dx[i].rsr[0]));
                debug("Error while initializing DRAM PHY!\n");
 
-               return false;
+               ret = false;
        }
 
        if (sunxi_dram_is_lpddr(para->type))
@@ -553,7 +554,7 @@ static bool mctl_channel_init(struct dram_para *para)
        writel(0x7ff, &mctl_com->maer1);
        writel(0xffff, &mctl_com->maer2);
 
-       return true;
+       return ret;
 }
 
 static void mctl_auto_detect_rank_width(struct dram_para *para)






^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-02 18:50     ` Jernej Škrabec
@ 2023-10-03  9:48       ` Andre Przywara
  0 siblings, 0 replies; 10+ messages in thread
From: Andre Przywara @ 2023-10-03  9:48 UTC (permalink / raw)
  To: Jernej Škrabec; +Cc: Gunjan Gupta, u-boot, Ondrej Jirman, Jagan Teki

On Mon, 02 Oct 2023 20:50:49 +0200
Jernej Škrabec <jernej.skrabec@gmail.com> wrote:

Hi,

> Dne ponedeljek, 02. oktober 2023 ob 13:26:26 CEST je Andre Przywara napisal(a):
> > On Sun,  1 Oct 2023 21:43:32 +0530
> > Gunjan Gupta <viraniac@gmail.com> wrote:
> > 
> > (fixing Jernej's email) 
> > 
> > Hi Gunjan,
> > 
> > thanks for sending a patch!
> >    
> > > On some H6 boards like Orange Pi 3 LTS, some times U-Boot fails to detect
> > > ram size correctly. Instead of 2GB thats available, it detects 4GB of ram
> > > and then SPL just hangs there making board not to boot further.
> > > 
> > > On debugging, I found that the rows value were being determined correctly,
> > > but columns were sometimes off by one value. I found that adding some
> > > delay after the mctl_core_init call along with making use of dsb in the
> > > start of the mctl_mem_matches solves the issue.
> > > 
> > > Signed-off-by: Gunjan Gupta <viraniac@gmail.com>
> > > ---
> > > 
> > >  arch/arm/mach-sunxi/dram_helpers.c   | 1 +
> > >  arch/arm/mach-sunxi/dram_sun50i_h6.c | 2 ++
> > >  2 files changed, 3 insertions(+)
> > > 
> > > diff --git a/arch/arm/mach-sunxi/dram_helpers.c b/arch/arm/mach-sunxi/dram_helpers.c
> > > index cdf2750f1c..5758c58e07 100644
> > > --- a/arch/arm/mach-sunxi/dram_helpers.c
> > > +++ b/arch/arm/mach-sunxi/dram_helpers.c
> > > @@ -32,6 +32,7 @@ void mctl_await_completion(u32 *reg, u32 mask, u32 val)
> > >  #ifndef CONFIG_MACH_SUNIV
> > >  bool mctl_mem_matches(u32 offset)
> > >  {
> > > +	dsb();  
> > 
> > This looks a bit odd, do you have an explanation for that? And are you
> > sure that is really needed?
> > I understand why we need the DSB after the writel's below, but before that?
> > The only thing I could think of is that we are missing a barrier in
> > mctl_core_init() - which is the function called before mctl_mem_matches().
> > Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
> > the mctl_core_init() call (where you add the udelay() below)? And I wonder
> > if a dmb() would already be sufficient? I noticed recently that the
> > clr/setbit_le32() functions don't have a barrier at all, maybe that should
> > be fixed instead?  
> 
> Looking at original BSP DRAM code, there is no data barriers that I can find.
> Cache shouldn't be a thing before DRAM is initialized, right?

With the *MMU off* every memory access is "device nGnRnE", so
definitely not cached. But a Cortex-A53 still has a store buffer which
is in effect even for device accesses.

> Conversely,
> I suggest adding memory barriers before each udelay(), as it is there for a
> reason.

Talking to a colleague and looking at the ARM ARM and other
documentation, including [1]:
A DMB memory barrier (__iowmb() in U-Boot and Linux), as used in
readl/writel, just ensures ordering, it does not force completion.
For just programming the DRAM controller, this is what we want.
A DSB does everything that a DMB does, plus ensures "completion" of
memory accesses (plus other things like TLBs and CMOs, which we
don't care about in this case). In a Cortex-A53, this seems to include
flushing the store buffer, which is probably the culprit here.

So I don't know if the delays in the BSP DRAM driver are really
pauses that the DRAM controller needs. In this case we would need at
least a DSB before the udelay(), if not a device-read-back, though I
just assume that the DRAM controller registers do not have another
buffer on the hardware side. Check 27:47 onward of [1].

Another possibility is that the delays are just crude measures to paper
over missing barriers, hoping that the buffer is drained after the
time. But since the delays don't really hurt here, we should just assume
they are meaningful and keep them.

Also we pretty surely need a DSB after we setup the DRAM controller,
but before we use the DRAM array: the controller registers and the array
are separate devices, on different AXI buses. So that aspect of this patch
seems actually to be alright, and the fact that Gunjan needed the DSB
supports that. 

I will try to come up with a patch that implements these ideas.

Cheers,
Andre

P.S. Please note that above statements are an application of the
architecture rules to the A53 and the MMU-off/single core situation in the
SPL. The situation is more complex when running Linux, especially on more
modern cores that do speculation and out-of-order execution.

[1] https://www.youtube.com/watch?v=i6DayghhA8Q

> 
> >   
> > >  	/* Try to write different values to RAM at two addresses */
> > >  	writel(0, CFG_SYS_SDRAM_BASE);
> > >  	writel(0xaa55aa55, (ulong)CFG_SYS_SDRAM_BASE + offset);
> > > diff --git a/arch/arm/mach-sunxi/dram_sun50i_h6.c b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > > index bff2e42513..a031a845f5 100644
> > > --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > > +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
> > >  	para->cols = 11;
> > >  	mctl_core_init(para);
> > >  
> > > +	udelay(50);  
> > 
> > The location of that udelay() looks a bit odd, any chance that belongs at
> > the end of mctl_channel_init() instead? And how did you come up with that
> > and the value of 50? Just pure experimentation? I think the original BSP
> > DRAM code has plenty of delay calls, so we might have missed this one
> > here, but I would love to see some explanation here.  
> 
> I checked original driver and we have *almost* all delays. There are two
> missing. Both in mctl_phy_pir_init(). One before mctl_await_completion() and
> another after it. Both are 1 us long.
> 
> Maybe that solves it (in combination with data barriers)?
> 
> Best regards,
> Jernej
> 
> > 
> > Cheers,
> > Andre
> >   
> > >  	for (para->cols = 8; para->cols < 11; para->cols++) {
> > >  		/* 8 bits per byte and 16/32 bit width */
> > >  		if (mctl_mem_matches(1 << (para->cols + 1 +  
> > 
> >   
> 
> 
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-02 18:59       ` Jernej Škrabec
@ 2023-10-20 23:38         ` Andre Przywara
  2023-10-21  5:48           ` Jernej Škrabec
  0 siblings, 1 reply; 10+ messages in thread
From: Andre Przywara @ 2023-10-20 23:38 UTC (permalink / raw)
  To: Jernej Škrabec; +Cc: Gunjan Gupta, u-boot, Ondrej Jirman, Jagan Teki

On Mon, 02 Oct 2023 20:59:34 +0200
Jernej Škrabec <jernej.skrabec@gmail.com> wrote:

Hi Jernej,

> Dne ponedeljek, 02. oktober 2023 ob 14:42:40 CEST je Gunjan Gupta napisal(a):
> > > >  bool mctl_mem_matches(u32 offset)
> > > >  {
> > > > +     dsb();  
> > > This looks a bit odd, do you have an explanation for that? And are you
> > > sure that is really needed?
> > > I understand why we need the DSB after the writel's below, but before that?  
> > 
> > I started with Ondrej Jirman's patch from LibreELEC's tree that had a
> > dsb call added
> > after the first writel call. That reduced the frequency of the errors
> > but didn't removed
> > it completely. My reason for moving it before the writel was to make
> > sure any memory
> > access is completed before starting the actual logic for the test.
> > That reduced the
> > frequency even further, but didn't resolve the issue. I did try
> > removing it leaving only
> > udelay added to the code, but that brings back the issue.
> >   
> > > The only thing I could think of is that we are missing a barrier in
> > > mctl_core_init() - which is the function called before mctl_mem_matches().
> > > Can you move that dsb(); into mctl_auto_detect_dram_size(), right after
> > > the mctl_core_init() call (where you add the udelay() below)? And I wonder
> > > if a dmb() would already be sufficient?  
> > 
> > Sure, I will try experimenting with it.
> >   
> > > I noticed recently that the clr/setbit_le32() functions don't have a barrier at all, maybe
> > > that should be fixed instead?  
> > 
> > I haven't done much of the low level programming myself. Mostly have
> > done user space
> > programming along with fixing minor kernel module compilation issues
> > due to kernel api
> > changes. So I wasn't sure what all places to debug. But if you point
> > me to places with
> > things to try, I surely can give time playing around testing the proposed fixes.
> >   
> > > > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct dram_para *para)
> > > >       para->cols = 11;
> > > >       mctl_core_init(para);
> > > >
> > > > +     udelay(50);  
> > > The location of that udelay() looks a bit odd, any chance that belongs at
> > > the end of mctl_channel_init() instead? And how did you come up with that
> > > and the value of 50? Just pure experimentation?  
> > 
> > Before adding the udelay, I added 7 print statements to print all the
> > members of the para
> > struct. That itself solved the issue along with the dsb added to the
> > top of the mctl_mem_matches
> > function. This is what gave me the clue that a delay is needed there.
> > The value of 50 is
> > indeed from pure experimentation  
> 
> Oh, I found one major difference between BSP and mainline driver. Please test
> patch attached below. I don't know if this path is always taken when wrong
> configuration is tested or not.
>
> Best regards,
> Jernej
> 
> --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> @@ -420,6 +420,7 @@ static bool mctl_channel_init(struct dram_para *para)
>                         (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
>         struct sunxi_mctl_phy_reg * const mctl_phy =
>                         (struct sunxi_mctl_phy_reg *)SUNXI_DRAM_PHY0_BASE;
> +       bool ret = true;
>         int i;
>         u32 val;
>  
> @@ -537,7 +538,7 @@ static bool mctl_channel_init(struct dram_para *para)
>                         debug("DRAM PHY DX%dRSR0 = %x\n", i, readl(&mctl_phy->dx[i].rsr[0]));
>                 debug("Error while initializing DRAM PHY!\n");

So those error messages (and the ones before them, not shown here) look
odd: if I follow the code correctly, this here is the only place which
makes mctl_core_init() return false. And we rely on it doing so up to
three times, to detect the proper rank and bus width, in
mctl_auto_detect_rank_width(). So we should not have dramatic error
messages (even for debug), as they would occur during normal, eventually
successful setup as well.
So shall we tone those messages down? I'd suggest to change the
"Oops!..." comment into something saying that we detected a wrong
rank/bus-width setup. Also removing the pointless PLL debug print
(since the failure is not DRAM clock related), and also the final error
message (the last line shown here). I will make patch for that.

But regardless I doubt that this patch is doing anything: when this
function returns false, we set new rank/width parameters, and call
mctl_core_init() again, which starts with mctl_sys_init() resetting all
DRAM controller registers, which I actually wonder is really necessary.
But surely any register setup after that point above is useless, with
the current code.

Does that make sense?

Cheers,
Andre

> -               return false;
> +               ret = false;
>         }
>  
>         if (sunxi_dram_is_lpddr(para->type))
> @@ -553,7 +554,7 @@ static bool mctl_channel_init(struct dram_para *para)
>         writel(0x7ff, &mctl_com->maer1);
>         writel(0xffff, &mctl_com->maer2);
>  
> -       return true;
> +       return ret;
>  }
>  
>  static void mctl_auto_detect_rank_width(struct dram_para *para)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-20 23:38         ` Andre Przywara
@ 2023-10-21  5:48           ` Jernej Škrabec
  2023-10-21  9:40             ` Gunjan Gupta
  0 siblings, 1 reply; 10+ messages in thread
From: Jernej Škrabec @ 2023-10-21  5:48 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Gunjan Gupta, u-boot, Ondrej Jirman, Jagan Teki

On Saturday, October 21, 2023 1:38:39 AM CEST Andre Przywara wrote:
> On Mon, 02 Oct 2023 20:59:34 +0200
> Jernej Škrabec <jernej.skrabec@gmail.com> wrote:
> 
> Hi Jernej,
> 
> > Dne ponedeljek, 02. oktober 2023 ob 14:42:40 CEST je Gunjan Gupta 
napisal(a):
> > > > >  bool mctl_mem_matches(u32 offset)
> > > > >  {
> > > > > 
> > > > > +     dsb();
> > > > 
> > > > This looks a bit odd, do you have an explanation for that? And are you
> > > > sure that is really needed?
> > > > I understand why we need the DSB after the writel's below, but before
> > > > that?
> > > 
> > > I started with Ondrej Jirman's patch from LibreELEC's tree that had a
> > > dsb call added
> > > after the first writel call. That reduced the frequency of the errors
> > > but didn't removed
> > > it completely. My reason for moving it before the writel was to make
> > > sure any memory
> > > access is completed before starting the actual logic for the test.
> > > That reduced the
> > > frequency even further, but didn't resolve the issue. I did try
> > > removing it leaving only
> > > udelay added to the code, but that brings back the issue.
> > > 
> > > > The only thing I could think of is that we are missing a barrier in
> > > > mctl_core_init() - which is the function called before
> > > > mctl_mem_matches().
> > > > Can you move that dsb(); into mctl_auto_detect_dram_size(), right
> > > > after
> > > > the mctl_core_init() call (where you add the udelay() below)? And I
> > > > wonder
> > > > if a dmb() would already be sufficient?
> > > 
> > > Sure, I will try experimenting with it.
> > > 
> > > > I noticed recently that the clr/setbit_le32() functions don't have a
> > > > barrier at all, maybe that should be fixed instead?
> > > 
> > > I haven't done much of the low level programming myself. Mostly have
> > > done user space
> > > programming along with fixing minor kernel module compilation issues
> > > due to kernel api
> > > changes. So I wasn't sure what all places to debug. But if you point
> > > me to places with
> > > things to try, I surely can give time playing around testing the
> > > proposed fixes.> > 
> > > > > @@ -623,6 +623,8 @@ static void mctl_auto_detect_dram_size(struct
> > > > > dram_para *para)> > > > 
> > > > >       para->cols = 11;
> > > > >       mctl_core_init(para);
> > > > > 
> > > > > +     udelay(50);
> > > > 
> > > > The location of that udelay() looks a bit odd, any chance that belongs
> > > > at
> > > > the end of mctl_channel_init() instead? And how did you come up with
> > > > that
> > > > and the value of 50? Just pure experimentation?
> > > 
> > > Before adding the udelay, I added 7 print statements to print all the
> > > members of the para
> > > struct. That itself solved the issue along with the dsb added to the
> > > top of the mctl_mem_matches
> > > function. This is what gave me the clue that a delay is needed there.
> > > The value of 50 is
> > > indeed from pure experimentation
> > 
> > Oh, I found one major difference between BSP and mainline driver. Please
> > test patch attached below. I don't know if this path is always taken when
> > wrong configuration is tested or not.
> > 
> > Best regards,
> > Jernej
> > 
> > --- a/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > +++ b/arch/arm/mach-sunxi/dram_sun50i_h6.c
> > @@ -420,6 +420,7 @@ static bool mctl_channel_init(struct dram_para *para)
> > 
> >                         (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE;
> >         
> >         struct sunxi_mctl_phy_reg * const mctl_phy =
> >         
> >                         (struct sunxi_mctl_phy_reg *)SUNXI_DRAM_PHY0_BASE;
> > 
> > +       bool ret = true;
> > 
> >         int i;
> >         u32 val;
> > 
> > @@ -537,7 +538,7 @@ static bool mctl_channel_init(struct dram_para *para)
> > 
> >                         debug("DRAM PHY DX%dRSR0 = %x\n", i,
> >                         readl(&mctl_phy->dx[i].rsr[0]));
> >                 
> >                 debug("Error while initializing DRAM PHY!\n");
> 
> So those error messages (and the ones before them, not shown here) look
> odd: if I follow the code correctly, this here is the only place which
> makes mctl_core_init() return false. And we rely on it doing so up to
> three times, to detect the proper rank and bus width, in
> mctl_auto_detect_rank_width(). So we should not have dramatic error
> messages (even for debug), as they would occur during normal, eventually
> successful setup as well.
> So shall we tone those messages down? I'd suggest to change the
> "Oops!..." comment into something saying that we detected a wrong
> rank/bus-width setup. Also removing the pointless PLL debug print
> (since the failure is not DRAM clock related), and also the final error
> message (the last line shown here). I will make patch for that.

Those messages are remnants of old behaviour. In the past, code first
enabled dual rank and 32-bit lanes. If that failed, it analyzed error
messages and in second round only enabled supported features.
However, that proved unreliable, so I changed code so it simply tries
every combination and uses whatever works first.

In any case, I concur that these messages might not make sense anymore.

> 
> But regardless I doubt that this patch is doing anything: when this
> function returns false, we set new rank/width parameters, and call
> mctl_core_init() again, which starts with mctl_sys_init() resetting all
> DRAM controller registers, which I actually wonder is really necessary.
> But surely any register setup after that point above is useless, with
> the current code.
> 
> Does that make sense?

My concern is that code after failure might put controller or DRAM in
initial state which may gave more time for next round of attempts to
init DRAM properly. It may also be the reason why initial approach didn't
work reliably.

I'm fine with current approach if Gunjan confirms that initialization
works reliably.

Best regards,
Jernej

> 
> Cheers,
> Andre
> 
> > -               return false;
> > +               ret = false;
> > 
> >         }
> >         
> >         if (sunxi_dram_is_lpddr(para->type))
> > 
> > @@ -553,7 +554,7 @@ static bool mctl_channel_init(struct dram_para *para)
> > 
> >         writel(0x7ff, &mctl_com->maer1);
> >         writel(0xffff, &mctl_com->maer2);
> > 
> > -       return true;
> > +       return ret;
> > 
> >  }
> >  
> >  static void mctl_auto_detect_rank_width(struct dram_para *para)





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards
  2023-10-21  5:48           ` Jernej Škrabec
@ 2023-10-21  9:40             ` Gunjan Gupta
  0 siblings, 0 replies; 10+ messages in thread
From: Gunjan Gupta @ 2023-10-21  9:40 UTC (permalink / raw)
  To: Jernej Škrabec; +Cc: Andre Przywara, u-boot, Ondrej Jirman, Jagan Teki

> I'm fine with the current approach if Gunjan confirms that initialization
> works reliably.

I will test it out over the coming week. Too sick to even pick up the
laptop right now.

Regards
Gunjan

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-10-21 12:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-01 16:13 [PATCH 0/1] Fix for U-Boot SPL hang on sunxi H6 due to incorrect ram size detection Gunjan Gupta
2023-10-01 16:13 ` [PATCH 1/1] sunxi: dram: Fix incorrect ram size detection for some H6 boards Gunjan Gupta
2023-10-02 11:26   ` Andre Przywara
2023-10-02 12:42     ` Gunjan Gupta
2023-10-02 18:59       ` Jernej Škrabec
2023-10-20 23:38         ` Andre Przywara
2023-10-21  5:48           ` Jernej Škrabec
2023-10-21  9:40             ` Gunjan Gupta
2023-10-02 18:50     ` Jernej Škrabec
2023-10-03  9:48       ` Andre Przywara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.