From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marek Vasut
Date: Tue, 8 Jan 2019 15:43:25 +0100
Subject: [U-Boot] [PATCH v1 0/4] arm: socfgpa: support of-platdata
In-Reply-To:
References: <20190107211423.10151-1-simon.k.r.goldschmidt@gmail.com>
 <1f8fdec8-68d0-ad64-350d-5d290294a0cf@denx.de>
 <46a1b6f4-d237-e984-4e9e-f00eef74abfb@denx.de>
Message-ID: <0fe320d7-f3d8-dfca-511b-e091edb0985a@denx.de>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On 1/8/19 2:51 PM, Simon Goldschmidt wrote:
> On Tue, Jan 8, 2019 at 2:38 PM Marek Vasut wrote:
>>
>> On 1/8/19 2:07 PM, Simon Goldschmidt wrote:
>>> On Tue, Jan 8, 2019 at 1:58 PM Marek Vasut wrote:
>>>>
>>>> On 1/8/19 1:38 PM, Simon Goldschmidt wrote:
>>>>> On Tue, Jan 8, 2019 at 1:06 PM Marek Vasut wrote:
>>>>>>
>>>>>> On 1/8/19 7:56 AM, Simon Goldschmidt wrote:
>>>>>>> On Mon, Jan 7, 2019 at 11:59 PM Marek Vasut wrote:
>>>>>>>>
>>>>>>>> On 1/7/19 10:14 PM, Simon Goldschmidt wrote:
>>>>>>>>> This is an initial attempt to support OF_PLATDATA for socfpga gen5.
>>>>>>>>>
>>>>>>>>> There are two motivations for this:
>>>>>>>>> a) reduce code size to eventually support secure boot (where SPL has to
>>>>>>>>> authenticate the next stage by loading/checking U-Boot from a FIT
>>>>>>>>> image)
>>>>>>>>> b) to support the cyclone 5 boot ROM's CRC check on the SPL in SRAM
>>>>>>>>> (on warm-restart), all bytes to check need to be in one piece. With
>>>>>>>>> OF_SEPARATE, this is not the case (.bss is between .rodata and the
>>>>>>>>> DTB). Since OF_EMBEDDED has been discouraged, OF_PLATDATA seems to
>>>>>>>>> be a good solution.
>>>>>>>>
>>>>>>>> I'd much prefer parsing the DT (and thus, decoupling the SW from HW)
>>>>>>>> than having some ad-hoc plat data again if we can avoid that.
>>>>>>>
>>>>>>> So you're against the whole OF_PLATDATA thing or how should I understand
>>>>>>> that?
>>>>>>
>>>>>> If we can avoid it, I'd prefer to do so.
>>>>>>
>>>>>>> It's not really ad-hoc, it's the DT converted to C structs. It's just in another
>>>>>>> format, but it's still (sort of) decoupled SW from HW.
>>>>>>>
>>>>>>> As written above, I have two goals I want to achieve with this. Right now, I
>>>>>>> cannot enable verified boot in SPL because the available OCRAM cannot
>>>>>>> hold all the code. And it seemed to me OF_PLATDATA could help me there.
>>>>>>
>>>>>> Well this might be a long shot, but I discussed this lack of OCRAM
>>>>>> during 35C3 and there was a suggestion to lock L2 cache lines above ROM
>>>>>> (so there's some backing store) and use that as extra SRAM. Would that
>>>>>> help you ?
>>>>>
>>>>> I would have joined that discussion if my Family would have let me go during the
>>>>> holidays :-))
>>>>>
>>>>> This is an interesting idea, but actually it's a lack of code/rodata size.
>>>>> The Intel docs clearly state that the binary SPL loaded from SPI/MMC must be
>>>>> 60 KiB at max. I have not checked the code size increase I would get when
>>>>> enabling trusted boot (SPL loading U-Boot from FIT and verifying it with a
>>>>> public key), but I'm currently at ~45 KiB for .text, .rodata and DTB and
>>>>> only 40 bytes for BSS. I'm booting from SPI.
>>>>> When booting from MMC, the code is about ~4 KiB smaller but BSS grows to ~600
>>>>> Bytes.
>>>>
>>>> I wonder if there are some huge chunks of code which could be optimized?
>>>>
>>>>> Of course the stack and initial malloc area do need some bytes too, but I think
>>>>> summed up, bss, stack and malloc should probably fit into 4 KiB, so I
>>>>> currently have about 15 KiB to add FIT loading and public key
>>>>> verification/hashing. I don't think that's enough just from the code size.
>>>>>
>>>>> And on socfpga, I think all added code would use the heap, which is
>>>>> changed to SDRAM very early, so it's not the RAM that is tight.
>>>>
>>>> Can you check readelf and see how the function size looks ? Maybe
>>>> there's something which is just too big ?
>>>
>>> I'm looking at the map file all the time ;-) The only thing that looks
>>> too big is SDRAM initialization, which is about 16 KiB overall, I think.
>>> The rest just seems to be smaller parts. But the binary blob u32 arrays
>>> created by Quartus don't help, either: rodata is about 9 KiB.
>>
>> Can that be somehow optimized ? The ideal approach would be to move it
>> somehow to DT.
>
> I don't know if those binary blobs (pin config, clock config etc.) can
> be converted without internal information from Intel.
>
> The SDRAM initialization might just be bad code, I don't know.
>
> So like you wrote in the other thread: obviously, we're doing something wrong
> as those 60 KiB will not be enough for what I want SPL to do. But, I haven't yet
> found something that is just obviously code bloat.

I wonder if the SDRAM init tables aren't a bit sparse for example. Maybe
something can be done there ?

-- 
Best regards,
Marek Vasut
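
For reference, the size budget discussed in this thread can be tallied with a
short script. This is only a rough sketch: the figures are the approximate
values quoted above (60 KiB SPL limit, ~45 KiB for .text/.rodata/DTB, ~16 KiB
for SDRAM init, ~9 KiB of rodata), the variable names are made up for
illustration, and counting the SDRAM init code as part of the ~45 KiB is an
assumption.

```python
# Rough SPL size budget, using the approximate figures quoted in this thread.
# All names here are illustrative; none of them come from U-Boot itself.
KIB = 1024

spl_limit = 60 * KIB   # max SPL image size per the Intel docs, as quoted
spl_used = 45 * KIB    # ~.text + .rodata + DTB when booting from SPI
sdram_init = 16 * KIB  # ~SDRAM initialization (assumed to be part of spl_used)
rodata = 9 * KIB       # ~rodata, including the Quartus-generated u32 arrays

# Space left for FIT loading and public key verification/hashing.
headroom = spl_limit - spl_used

print(f"headroom: {headroom // KIB} KiB")
print(f"SDRAM init share of current image: {sdram_init / spl_used:.0%}")
print(f"rodata share of current image: {rodata / spl_used:.0%}")
```

With these numbers the headroom comes out at 15 KiB, matching the estimate
given earlier in the thread, and SDRAM init plus rodata together account for
more than half of the current image.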