From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wolfgang Denk
Date: Sat, 14 Apr 2012 00:26:27 +0200
Subject: [U-Boot] FW: P4080 target has 16G memory stability issues ...
In-Reply-To: <2DD52030B5146141BEB762A11AE97C4C014C6791@SPQCEXC05.exfo.com>
References: <2DD52030B5146141BEB762A11AE97C4C014C6791@SPQCEXC05.exfo.com>
Message-ID: <20120413222627.D76E720019A@gemini.denx.de>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Dear Robert,

please note: it is NOT a good idea to post the same message to
several mailing lists separately. Normally such cross-posts should
be avoided completely; if they appear to make sense, you should
really cross-post, so threading works, and people are aware that this
is a message they have already seen on another list. Thanks.

In message <2DD52030B5146141BEB762A11AE97C4C014C6791@SPQCEXC05.exfo.com> you wrote:
>
> Our P4080 target board is using 2 SODIMM's on each of 2 Controllers
> (4x4G DDR3), and we are seeing some memory problems (linux panics)
> when beating up large amounts of memory (just under the 16G), on
> multiple threads (7 or 8 CPUs).

This is actually a somewhat frequent problem. It takes some
experience to get DDR3 designs right. We have done some hardware
design reviews which showed quite a number of issues in this area,
typically resulting in issues similar to yours.

> Our DDR3 configuration is derived from the SPD dump of U-Boot, and
> we are using a version based upon the 2011.09 release of U-Boot. Our
> firmware memory test, limited as it is to 2G chunks, and a single CPU
> shows no problem, it is only using a small test program under Linux
> and using multiple cpu's that we see the problems, and we can
> reproduce the problem at will, although reducing our memory speed via
> the RCW does seem to ameliorate the problem somewhat.

Most memory test routines don't help you here - they exercise the
memory with plain read / write cycles, which results in pretty much
relaxed timings. Even if these tests work perfectly, your memory may
fail seriously when you manage to load it with back-to-back burst
mode accesses. The easiest way to do this is running Linux with root
file system over NFS, and then running some bigger application (like
compiling a Linux kernel on the target). This results in many
context switches (cache flush / cache fetch) and lots of DMA (network
drivers). If everything else works, and this test crashes your
system, you can be pretty sure that burst mode accesses have some
problem.

> - is anyone using a similar configuration?

I don't think the configuration is a problem here. My bet is either
incomplete / incorrect initialization of the memory controller,
and/or problems with the hardware design.

> - is anyone aware of limitations in the U-Boot 2011.09R version of
> the mpc8xxx/ddr/* code we need to be aware of?

I have no idea what "2011.09R" might be, sorry.

> - any ideas?
>
> We've been pounding our heads on this for a while now, and I'm just
> wondering if we are covering old territory here.

This _is_ a well known problem. Memory errors like this have always
been a major issue when running an OS like Linux which really loads
the hardware to the limits.

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
8 Catfish = 1 Octo-puss
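
For illustration only, here is a minimal sketch of the kind of pass a
small Linux test program might run on each CPU to get closer to the
load described in the message above: sequential fills and a large
block copy instead of isolated read / write cycles. The names
pattern(), count_mismatches() and stress_pass() are hypothetical and
are not taken from U-Boot or from the original poster's program. User
space cannot force the controller into burst mode; the assumption is
only that long sequential transfers over a buffer much larger than
the caches produce mostly cache-line sized DRAM traffic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: index-dependent test pattern mixed with a seed. */
static uint64_t pattern(size_t i, uint64_t seed)
{
    uint64_t x = (uint64_t)i * 0x9E3779B97F4A7C15ULL + seed;
    return x ^ (x >> 29);
}

/* Count the words in buf[0..n-1] that differ from the expected pattern. */
size_t count_mismatches(const uint64_t *buf, size_t n, uint64_t seed)
{
    size_t errors = 0;

    for (size_t i = 0; i < n; i++)
        if (buf[i] != pattern(i, seed))
            errors++;
    return errors;
}

/*
 * One pass over a region of 'words' 64-bit words:
 *   1. fill the lower half with sequential writes,
 *   2. memcpy() it to the upper half (one long sequential transfer),
 *   3. read the upper half back and count mismatching words.
 * Assumption: the region is far larger than the caches, otherwise
 * most of the traffic never reaches the DRAM at all.
 */
size_t stress_pass(uint64_t *buf, size_t words, uint64_t seed)
{
    size_t half = words / 2;

    assert(buf != NULL);

    for (size_t i = 0; i < half; i++)
        buf[i] = pattern(i, seed);

    memcpy(buf + half, buf, half * sizeof(*buf));

    return count_mismatches(buf + half, half, seed);
}
```

To approximate the failing scenario one would run stress_pass() in a
loop from one thread per CPU, each on its own large region and with a
changing seed, and treat any non-zero return value as a memory error.
This does not replace the NFS root plus kernel compile test, which
also adds DMA traffic.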