From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simon Goldschmidt
Date: Tue, 8 Jan 2019 14:51:55 +0100
Subject: [U-Boot] [PATCH v1 0/4] arm: socfgpa: support of-platdata
In-Reply-To: <46a1b6f4-d237-e984-4e9e-f00eef74abfb@denx.de>
References: <20190107211423.10151-1-simon.k.r.goldschmidt@gmail.com>
 <1f8fdec8-68d0-ad64-350d-5d290294a0cf@denx.de>
 <46a1b6f4-d237-e984-4e9e-f00eef74abfb@denx.de>
Message-ID:
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On Tue, Jan 8, 2019 at 2:38 PM Marek Vasut wrote:
>
> On 1/8/19 2:07 PM, Simon Goldschmidt wrote:
> > On Tue, Jan 8, 2019 at 1:58 PM Marek Vasut wrote:
> >>
> >> On 1/8/19 1:38 PM, Simon Goldschmidt wrote:
> >>> On Tue, Jan 8, 2019 at 1:06 PM Marek Vasut wrote:
> >>>>
> >>>> On 1/8/19 7:56 AM, Simon Goldschmidt wrote:
> >>>>> On Mon, Jan 7, 2019 at 11:59 PM Marek Vasut wrote:
> >>>>>>
> >>>>>> On 1/7/19 10:14 PM, Simon Goldschmidt wrote:
> >>>>>>> This is an initial attempt to support OF_PLATDATA for socfpga gen5.
> >>>>>>>
> >>>>>>> There are two motivations for this:
> >>>>>>> a) reduce code size to eventually support secure boot (where SPL has to
> >>>>>>>    authenticate the next stage by loading/checking U-Boot from a FIT
> >>>>>>>    image)
> >>>>>>> b) to support the cyclone 5 boot ROM's CRC check on the SPL in SRAM
> >>>>>>>    (on warm-restart), all bytes to check need to be in one piece. With
> >>>>>>>    OF_SEPARATE, this is not the case (.bss is between .rodata and the
> >>>>>>>    DTB). Since OF_EMBEDDED has been discouraged, OF_PLATDATA seems to
> >>>>>>>    be a good solution.
> >>>>>>
> >>>>>> I'd much prefer parsing the DT (and thus, decoupling the SW from HW)
> >>>>>> than having some ad-hoc plat data again if we can avoid that.
> >>>>>
> >>>>> So you're against the whole OF_PLATDATA thing or how should I understand
> >>>>> that?
> >>>>
> >>>> If we can avoid it, I'd prefer to do so.
> >>>>
> >>>>> It's not really ad-hoc, it's the DT converted to C structs. It's just in
> >>>>> another format, but it's still (sort of) decoupled SW from HW.
> >>>>>
> >>>>> As written above, I have two goals I want to achieve with this. Right now, I
> >>>>> cannot enable verified boot in SPL because the available OCRAM cannot
> >>>>> hold all the code. And it seemed to me OF_PLATDATA could help me there.
> >>>>
> >>>> Well this might be a long shot, but I discussed this lack of OCRAM
> >>>> during 35C3 and there was a suggestion to lock L2 cache lines above ROM
> >>>> (so there's some backing store) and use that as extra SRAM. Would that
> >>>> help you ?
> >>>
> >>> I would have joined that discussion if my family had let me go during the
> >>> holidays :-))
> >>>
> >>> This is an interesting idea, but actually it's a lack of code/rodata size.
> >>> The Intel docs clearly state that the binary SPL loaded from SPI/MMC must be
> >>> 60 KiB at max. I have not checked the code size increase I would get when
> >>> enabling trusted boot (SPL loading U-Boot from FIT and verifying it with a
> >>> public key), but I'm currently at ~45 KiB for .text, .rodata and DTB and
> >>> only 40 bytes for BSS. I'm booting from SPI.
> >>> When booting from MMC, the code is about ~4 KiB smaller but BSS grows to
> >>> ~600 bytes.
> >>
> >> I wonder if there are some huge chunks of code which could be optimized?
> >>
> >>> Of course the stack and initial malloc area do need some bytes too, but I
> >>> think summed up, bss, stack and malloc should probably fit into 4 KiB, so I
> >>> currently have about 15 KiB to add FIT loading and public key
> >>> verification/hashing. I don't think that's enough just from the code size.
> >>>
> >>> And on socfpga, I think all added code would use the heap, which is changed
> >>> to SDRAM very early, so it's not the RAM that is tight.
> >>
> >> Can you check readelf and see how the function size looks ?
> >> Maybe there's something which is just too big ?
> >
> > I'm looking at the map file all the time ;-) The only thing that looks too big
> > is SDRAM initialization, which is about 16 KiB overall, I think. The rest just
> > seems to be smaller parts. But the binary blob u32 arrays created by Quartus
> > don't help, either: rodata is about 9 KiB.
>
> Can that be somehow optimized ? The ideal approach would be to move it
> somehow to DT.

I don't know if those binary blobs (pin config, clock config etc.) can be
converted without internal information from Intel. The SDRAM initialization
might just be bad code, I don't know.

So like you wrote in the other thread: obviously, we're doing something wrong
as those 60 KiB will not be enough for what I want SPL to do. But, I haven't
yet found something that is just obviously code bloat.

Regards,
Simon
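
For reference, the size budget discussed in this thread can be written out as a
quick calculation. This is only a rough sketch using the approximate figures
quoted above (60 KiB limit, ~45 KiB current image, ~16 KiB SDRAM init, ~9 KiB
rodata). The constant and function names are made up for illustration, and the
thread does not say whether the SDRAM init figure overlaps with the rodata
figure, so the two shares are reported separately rather than summed.

```python
KIB = 1024

# Approximate figures quoted in the thread (SPI boot case).
SPL_LIMIT = 60 * KIB       # maximum SPL binary size per the Intel docs
CURRENT_IMAGE = 45 * KIB   # .text + .rodata + DTB
SDRAM_INIT = 16 * KIB      # SDRAM initialization, "overall"
QUARTUS_RODATA = 9 * KIB   # rodata, largely Quartus-generated u32 arrays


def headroom(limit: int, used: int) -> int:
    """Return the bytes still available below the SPL size limit."""
    return limit - used


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    free = headroom(SPL_LIMIT, CURRENT_IMAGE)
    print(f"headroom for FIT loading and verification: {free // KIB} KiB")
    print(f"SDRAM init share of image: {share(SDRAM_INIT, CURRENT_IMAGE):.0f}%")
    print(f"rodata share of image: {share(QUARTUS_RODATA, CURRENT_IMAGE):.0f}%")
```

The point of separating the two helpers is that the headroom only says how much
can still be added, while the shares show where trimming would pay off most if
the existing code can be reduced at all.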