From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simon Glass
Date: Fri, 1 Dec 2017 20:29:24 -0700
Subject: [U-Boot] U-Boot proper(not SPL) relocate option
In-Reply-To: <1511952485.8313.76.camel@infinera.com>
References: <8da48f85-01b8-dcce-91cf-2ebdc289912b@rock-chips.com> <20171121112953.12d5176b@jawa> <20171122102735.B9D18120302@gemini.denx.de> <1511952485.8313.76.camel@infinera.com>
Message-ID:
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Hi Joakim,

On 29 November 2017 at 03:48, Joakim Tjernlund wrote:
> On Wed, 2017-11-29 at 19:11 +0900, Masahiro Yamada wrote:
>> Hi Simon,
>>
>>
>> 2017-11-28 2:13 GMT+09:00 Simon Glass:
>> > (Tom - any thoughts about a more expansive cc list on this?)
>> >
>> > Hi Masahiro,
>> >
>> > On 26 November 2017 at 07:16, Masahiro Yamada wrote:
>> > > 2017-11-26 20:38 GMT+09:00 Simon Glass:
>> > > > Hi Philipp,
>> > > >
>> > > > On 25 November 2017 at 16:31, Dr. Philipp Tomsich wrote:
>> > > > > Hi,
>> > > > >
>> > > > > > On 25 Nov 2017, at 23:34, Simon Glass wrote:
>> > > > > >
>> > > > > > +Tom, Masahiro, Philipp
>> > > > > >
>> > > > > > Hi,
>> > > > > >
>> > > > > > On 22 November 2017 at 03:27, Wolfgang Denk wrote:
>> > > > > > > Dear Kever Yang,
>> > > > > > >
>> > > > > > > In message you wrote:
>> > > > > > > >
>> > > > > > > > I can understand this feature, we always do dram_init_banks() first,
>> > > > > > > > then we relocate to 'known' area, then will be no risk to access memory.
>> > > > > > > > I believe there must be some historical reason for some kind of device,
>> > > > > > > > the relocate feature is a wonderful idea for it.
>> > > > > > >
>> > > > > > > This is actually not so much a feature needed to support some
>> > > > > > > specific device (in this case much simpler approaches would be
>> > > > > > > possible), but to support a whole set of features. Unfortunately
>> > > > > > > these appear to get forgotten / ignored over time.
>> > > > > > >
>> > > > > > > > many other SoCs should be similar.
>> > > > > > > > - Without relocate we can save many step, some of our customer really
>> > > > > > > > care much about the boot time duration.
>> > > > > > > > * no need to relocate everything
>> > > > > > > > * no need to copy all the code
>> > > > > > > > * no need init the driver more than once
>> > > > > > >
>> > > > > > > Please have a look at the README, section "Memory Management".
>> > > > > > > The relocation is not done to any _fixed_ address, but the address
>> > > > > > > is actually computed at runtime, depending on a number of features
>> > > > > > > enabled (at least this is how it used to be - apparently little of
>> > > > > > > this is tested on a regular basis, so I would not be surprised if
>> > > > > > > things are broken today).
>> > > > > > >
>> > > > > > > The basic idea was to reserve areas of memory at the top of RAM,
>> > > > > > > that would not be initialized / modified by U-Boot and Linux, not
>> > > > > > > even across a reset / warm boot.
>> > > > > > >
>> > > > > > > This was used for example for:
>> > > > > > >
>> > > > > > > - pRAM (Protected RAM) which could be used to store all kind of data
>> > > > > > >   (for example, using a pramfs [Protected and Persistent RAM
>> > > > > > >   Filesystem]) that could be kept across reboots of the OS.
>> > > > > > >
>> > > > > > > - shared frame buffer / video memory. U-Boot and Linux would be able
>> > > > > > >   to initialize the video memory just once (in U-Boot) and then
>> > > > > > >   share it, maybe even across reboots.
>> > > > > > >   Especially, this would allow
>> > > > > > >   for a very early splash screen that gets passed (flicker free) to
>> > > > > > >   Linux until some Linux GUI takes over (much more difficult today).
>> > > > > > >
>> > > > > > > - shared log buffer: U-Boot and Linux used to use the same syslog
>> > > > > > >   buffer mechanism, so you could share it between U-Boot and Linux.
>> > > > > > >   This allows for example to
>> > > > > > >   * read the Linux kernel panic messages after reset in U-Boot; this
>> > > > > > >     is very useful when you bring up a new system and Linux crashes
>> > > > > > >     before it can display the log buffer on the console
>> > > > > > >   * pass U-Boot POST results on to Linux, so the application code
>> > > > > > >     can read and process these
>> > > > > > >   * process the system log of the previous run (especially after a
>> > > > > > >     panic) in Linux after it rebooted.
>> > > > > > >
>> > > > > > > etc.
>> > > > > > >
>> > > > > > > There are a number of such features which require to reserve room at
>> > > > > > > the top of RAM, the size of which is calculated at runtime, often
>> > > > > > > depending on user settable environment data.
>> > > > > > >
>> > > > > > > All this cannot be done without relocation to a (dynamically
>> > > > > > > computed) target address.
>> > > > > > >
>> > > > > > >
>> > > > > > > Yes, the code could be simpler and faster without that - but then,
>> > > > > > > you cut off a number of features.
>> > > > > >
>> > > > > > I would be interested in seeing benchmarks showing the cost of
>> > > > > > relocation in terms of boot time. Last time I did this was on Exynos 5
>> > > > > > and it was some years ago. The time was pretty small provided the
>> > > > > > cache was on for the memory copies associated with relocation itself.
>> > > > > > Something like 10-20ms but I don't have the numbers handy.
>> > > > > >
>> > > > > > I think it is useful to be able to allocate memory in board_init_f()
>> > > > > > for use by U-Boot for things like the display and the malloc() region.
>> > > > > >
>> > > > > > Options we might consider:
>> > > > > >
>> > > > > > 1. Don't relocate the code and data. Thus we could avoid the copy and
>> > > > > > relocation cost. This is already supported with the GD_FLG_SKIP_RELOC
>> > > > > > flag, used when U-Boot runs as an EFI app.
>> > > > > >
>> > > > > > 2. Rather than throwing away the old malloc() region, keep it around
>> > > > > > so existing allocated blocks work. Then the new malloc() region would be
>> > > > > > used for future allocations. We could perhaps ignore free() calls in
>> > > > > > that region.
>> > > > > >
>> > > > > > 2a. This would allow us to avoid re-init of driver model in most cases
>> > > > > > I think. E.g. we could init serial and timer before relocation and
>> > > > > > leave them inited after relocation. We could just init the
>> > > > > > 'additional' devices not done before relocation.
>> > > > > >
>> > > > > > 2b. I suppose we could even extend this to SPL if we wanted to. I
>> > > > > > suspect it would just be a pain though, since SPL might use memory
>> > > > > > that U-Boot wants.
>> > > > > >
>> > > > > > 3. We could turn on the cache earlier. This removes most of the
>> > > > > > boot-time penalty. Ideally this should be turned on in SPL and perhaps
>> > > > > > redone in U-Boot which has more memory available. If SPL is not used,
>> > > > > > we could turn on the cache before relocation.
>> > > > >
>> > > > > Both turning on the cache and initialising the clocking could be of benefit
>> > > > > to boot-time.
>> > > > >
>> > > > > However, the biggest possible gain will come from utilising Falcon mode
>> > > > > to skip the full U-Boot stage and directly boot into the OS from SPL.
>> > > > > This assumes that the drivers involved are fully optimised, so loading up the
>> > > > > OS image does not take longer than necessary.
>> > > >
>> > > > I'd like to see numbers on that. From my experience, loading and
>> > > > running U-Boot does not take very long...
>> > > >
>> > > > >
>> > > > > > 4. Rather than reserving memory in board_init_f() we could have it
>> > > > > > call malloc() from the expanded region. We could then perhaps
>> > > > > > move this reserve/allocate code into particular drivers or
>> > > > > > subsystems, and drop a good chunk of the init sequence. We would need
>> > > > > > to have a larger malloc() region than is currently the case.
>> > > > > >
>> > > > > > There are still some arch-specific bits in board_init_f() which make
>> > > > > > these sorts of changes a bit tricky to support generically. IMO it
>> > > > > > would be best to move to 'generic relocation' written in C, where all
>> > > > > > archs work basically the same way, before attempting any of the above.
>> > > > > >
>> > > > > > Still, I can see some benefits and even some simplifications.
>> > > > > >
>> > > > > > Regards,
>> > > > > > Simon
>> > >
>> > >
>> > >
>> > > This discussion should have happened.
>> > > U-Boot boot sequence is crazily inefficient.
>> > >
>> > >
>> > >
>> > > When we talk about "relocation", two things are happening.
>> > >
>> > > [1] U-Boot proper copies itself to the very end of DRAM
>> > > [2] Fix-up the global symbols
>> > >
>> > > In my opinion, only [2] is useful.
>> > >
>> > >
>> > > SPL initializes the DRAM, so it knows the base and size of DRAM.
>> > > SPL should be able to load the U-Boot proper to the final destination.
>> > > So, [1] is unnecessary.
>> > >
>> > >
>> > > [2] is necessary because SPL may load the U-Boot proper
>> > > to a different place than CONFIG_SYS_TEXT_BASE.
>> > > This feature is useful for platforms
>> > > whose DRAM base/size is only known at run-time.
>> > > (Of course, it should be user-configurable by CONFIG_RELOCATE
>> > > or something.)
>> > >
>> > > Moreover, board_init_f() is unneeded -
>> > > everything in board_init_f() is already done by SPL.
>> > > Multiple-time DM initialization is really inefficient and ugly.
>> > >
>> > >
>> > > The following is how the ideal boot loader would work.
>> > >
>> > >
>> > > Requirement for U-Boot proper:
>> > > U-Boot never changes the location by itself.
>> > > So, SPL or a vendor loader must load U-Boot proper
>> > > to the final destination directly.
>> > > (You can load it to the very end of DRAM if you like,
>> > > but the actual place does not matter here.)
>> > >
>> > >
>> > > Boot sequence of U-Boot proper:
>> > > If CONFIG_RELOCATE (or something) is enabled,
>> > > it fixes the global symbols at the very beginning
>> > > of the boot.
>> > > (In this case, CONFIG_SYS_TEXT_BASE can be arbitrary)
>> > >
>> > > That's it. Proceed to the rest of init code.
>> > > (= board_init_r)
>> > > board_init_f() is unnecessary.
>> > >
>> > > This should work for recent platforms.
>> >
>> > Yes that sounds reasonable to me.
>> >
>> > We could do the symbol fixup/relocation in SPL after loading U-Boot,
>> > although that would probably push us to using ELF format for U-Boot
>> > which is a bit limited.
>> >
>> > Still I think the biggest performance improvement comes from turning
>> > on the cache in SPL. So the above is a simplification, not really a
>> > speed-up.
>>
>>
>> Right.
>> I am more interested in simplification than in speed-up.
>> The boot speed is not a significant problem at least for my boards.
>>
>>
>> > >
>> > >
>> > >
>> > > We should think about old platforms that boot from a NOR flash or something.
>> > > There are two solutions:
>> > > - execute-in-place: run the code in the flash directly
>> > > - use SPL (common/spl/spl-nor.c) if you want to run
>> > > it from RAM
>> >
>> > This seems like a big regression in functionality.
>> > For example for x86
>> > 32-bit we currently don't have an SPL (we do for 64-bit). So I think
>> > this means that everything would be forced to have an SPL?
>>
>> After grace period for migration, Yes.
>> XIP or SPL.
>> No relocation in U-Boot proper.
>>
>> This assumption will allow us to dump a lot of burden.
>>
>> Remove relocation
>> Remove board_init_f()
>> Remove pre-reloc DM init
>> Perhaps, remove struct global_data
>> etc.
>
> I have not managed to keep up with this discussion but it seems you are suggesting
> some radical change for NOR based boot boards?
>
> We use such boards (ppc) and also use pram etc. Would these still
> work?

I think they would have to switch to SPL.

I suppose another way is to adjust boards which DO use SPL to NOT use
board_init_f().

Regards,
Simon
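
For anyone following the thread, the top-down reservation that Wolfgang
describes (pRAM, log buffer, frame buffer, then U-Boot itself) can be
sketched roughly as below. This is a standalone illustration, not U-Boot
code: struct reloc_inputs, align_down() and compute_reloc_addr() are
hypothetical names, and the 4 KiB alignment and the order of the
reservations are assumptions made for the example. The point is only that
the final address depends on values known at run time.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical inputs. On a real board these would come from the DRAM
 * probe, from the environment (e.g. a "pram" variable) and from the build.
 */
struct reloc_inputs {
	uint64_t ram_base;	/* start of DRAM */
	uint64_t ram_size;	/* size of DRAM in bytes */
	uint64_t pram_kib;	/* protected RAM to leave untouched, in KiB */
	uint64_t logbuf_len;	/* shared log buffer, in bytes */
	uint64_t fb_len;	/* shared frame buffer, in bytes */
	uint64_t image_len;	/* U-Boot code + data, in bytes */
};

/* Round addr down to a power-of-two boundary. */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/*
 * Walk down from the top of RAM, carving out each reserved area in turn.
 * Whatever address is left after the last reservation is where the image
 * would be copied, so it cannot be a build-time constant.
 */
uint64_t compute_reloc_addr(const struct reloc_inputs *in)
{
	uint64_t top = in->ram_base + in->ram_size;

	top -= in->pram_kib * 1024;			/* pRAM, never touched */
	top = align_down(top - in->logbuf_len, 4096);	/* log buffer */
	top = align_down(top - in->fb_len, 4096);	/* frame buffer */
	top = align_down(top - in->image_len, 4096);	/* U-Boot itself */

	return top;
}
```

With a 256 MiB bank at 0x80000000, a 16 KiB log buffer, a 3 MiB frame
buffer and a 512 KiB image, this gives 0x8fb7c000 when 1024 KiB of pRAM is
requested and 0x8fc7c000 when none is, i.e. the target moves with a
user-settable value.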