From mboxrd@z Thu Jan 1 00:00:00 1970 From: Kever Yang Date: Mon, 23 Jan 2017 10:37:36 +0800 Subject: [U-Boot] [RFC PATCH 03/11] SPL: FIT: factor out spl_load_fit_image() In-Reply-To: References: <1484877211-19332-1-git-send-email-andre.przywara@arm.com> <1484877211-19332-4-git-send-email-andre.przywara@arm.com> <58845C57.7010108@rock-chips.com> Message-ID: <58856C70.9030004@rock-chips.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Hi Andre, On 01/22/2017 06:42 PM, Andr? Przywara wrote: > On 22/01/17 07:16, Kever Yang wrote: >> Hi Andre, >> >> On 01/20/2017 09:53 AM, Andre Przywara wrote: >>> At the moment we load two images from a FIT image: the actual U-Boot >>> image and the DTB. Both times we have very similar code to deal with >>> alignment requirement the media we load from imposes upon us. >>> Factor out this code into a new function, which we just call twice. >>> >>> Signed-off-by: Andre Przywara >>> --- >>> common/spl/spl_fit.c | 122 >>> +++++++++++++++++++++------------------------------ >>> 1 file changed, 51 insertions(+), 71 deletions(-) >>> >>> diff --git a/common/spl/spl_fit.c b/common/spl/spl_fit.c >>> index 381ed1f..d4149c5 100644 >>> --- a/common/spl/spl_fit.c >>> +++ b/common/spl/spl_fit.c >>> @@ -138,19 +138,58 @@ static int get_aligned_image_size(struct >>> spl_load_info *info, int data_size, >>> return (data_size + info->bl_len - 1) / info->bl_len; >>> } >>> +static int spl_load_fit_image(struct spl_load_info *info, ulong >>> sector, >>> + void *fit, ulong base_offset, int node, >>> + struct spl_image_info *image_info) >>> +{ >>> + ulong offset; >>> + size_t length; >>> + ulong load, entry; >>> + void *src; >>> + ulong overhead; >>> + int nr_sectors; >>> + >>> + offset = fdt_getprop_u32(fit, node, "data-offset") + base_offset; >>> + length = fdt_getprop_u32(fit, node, "data-size"); >>> + load = fdt_getprop_u32(fit, node, "load"); >>> + if (load == -1U && image_info) >>> + load = image_info->load_addr; > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> + entry = fdt_getprop_u32(fit, node, "entry"); >>> + >>> + overhead = get_aligned_image_overhead(info, offset); >>> + nr_sectors = get_aligned_image_size(info, overhead + length, >>> offset); >>> + >>> + if (info->read(info, sector + get_aligned_image_offset(info, >>> offset), >>> + nr_sectors, (void*)load) != nr_sectors) >>> + return -EIO; >>> + debug("image: dst=%lx, offset=%lx, size=%lx\n", load, offset, >>> + (unsigned long)length); >>> + >>> + src = (void *)load + overhead; >>> +#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS >>> + board_fit_image_post_process(&src, &length); >>> +#endif >>> + >>> + memcpy((void*)load, src, length); >> For FIT image, we read the first block for header, and then we know the >> offset >> and size for images, is it possible to have an option that we can read >> the image >> data to load address directly without memcpy(), this helps speed up the >> boot time. > We do read directly into the load address (see the marked place above). > But we have to fix up the alignment: > a) The load address may not be cache line aligned. I am not totally > convinced we really need this, but the code takes care of this. > Shouldn't be an issue in practice since most load addresses are already > aligned. > b) More importantly we need to observe the block size of the device we > read from. mkimage packs the images as tightly as possible into the > image, so ATF may start at say offset 0x456 within the FIT image. Given > that the FIT image itself starts at a sector boundary, that may be "in > the middle" of a sector, which is the smallest addressable unit for > block devices. For MMC based storage for instance this is typically 512 > bytes. So we read the whole sector and later discard the overhead by > copying the image back in memory. > > Questions to you: > 1) Is the memcpy really an issue? I take it you use bl31.bin only, which > is probably between 32KB and 64KB. So does the memcpy really contribute > to boot time here? > 2) What device do you usually read from? What is the alignment there? > For instance SPI flash has typically an alignment of 1 Byte, so no > adjustment is actually needed. The memcpy routine bails out if dst==src, > so in this case there are no cycles spend on this. > 3) Can you tweak mkimage to observe alignment? As I said above, in the > moment it packs it tightly, but I don't see why we couldn't align > everything in the image at the cost of loosing 511 Bytes per embedded > image in the worst case, for instance. This would eventually fold into > the same point as question 2): If everything is aligned already, memcpy > is a NOP. > > So one relatively easy and clean solution would be to introduce an > alignment command line option to mkimage. One could optionally specify a > value, to which mkimage would align each embedded images into, leaving > gaps in between. We would still go over all the calculations in the SPL > code, but all the offsets and corrections would end up being 0, so no > copying would be necessary anymore. This looks good to me, I didn't notice that there is no copy if dst==src. Reviewed-by: Kever Yang Thanks, - Kever > > Does this sound like a plan? > > Cheers, > Andre. > >>> + >>> + if (image_info) { >>> + image_info->load_addr = load; >>> + image_info->size = length; >>> + image_info->entry_point = entry; >>> + } >>> + >>> + return 0; >>> +} >>> + >>> int spl_load_simple_fit(struct spl_image_info *spl_image, >>> struct spl_load_info *info, ulong sector, void *fit) >>> { >>> int sectors; >>> - ulong size, load; >>> + ulong size; >>> unsigned long count; >>> + struct spl_image_info image_info; >>> int node, images; >>> - void *load_ptr; >>> - int fdt_offset, fdt_len; >>> - int data_offset, data_size; >>> int base_offset, align_len = ARCH_DMA_MINALIGN - 1; >>> - int src_sector; >>> - void *dst, *src; >>> /* >>> * Figure out where the external images start. This is the base >>> for the >>> @@ -202,82 +241,23 @@ int spl_load_simple_fit(struct spl_image_info >>> *spl_image, >>> return -1; >>> } >>> - /* Get its information and set up the spl_image structure */ >>> - data_offset = fdt_getprop_u32(fit, node, "data-offset"); >>> - data_size = fdt_getprop_u32(fit, node, "data-size"); >>> - load = fdt_getprop_u32(fit, node, "load"); >>> - debug("data_offset=%x, data_size=%x\n", data_offset, data_size); >>> - spl_image->load_addr = load; >>> - spl_image->entry_point = load; >>> + /* Load the image and set up the spl_image structure */ >>> + spl_load_fit_image(info, sector, fit, base_offset, node, spl_image); >>> spl_image->os = IH_OS_U_BOOT; >>> - /* >>> - * Work out where to place the image. We read it so that the first >>> - * byte will be at 'load'. This may mean we need to load it starting >>> - * before then, since we can only read whole blocks. >>> - */ >>> - data_offset += base_offset; >>> - sectors = get_aligned_image_size(info, data_size, data_offset); >>> - load_ptr = (void *)load; >>> - debug("U-Boot size %x, data %p\n", data_size, load_ptr); >>> - dst = load_ptr; >>> - >>> - /* Read the image */ >>> - src_sector = sector + get_aligned_image_offset(info, data_offset); >>> - debug("Aligned image read: dst=%p, src_sector=%x, sectors=%x\n", >>> - dst, src_sector, sectors); >>> - count = info->read(info, src_sector, sectors, dst); >>> - if (count != sectors) >>> - return -EIO; >>> - debug("image: dst=%p, data_offset=%x, size=%x\n", dst, data_offset, >>> - data_size); >>> - src = dst + get_aligned_image_overhead(info, data_offset); >>> - >>> -#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS >>> - board_fit_image_post_process((void **)&src, (size_t *)&data_size); >>> -#endif >>> - >>> - memcpy(dst, src, data_size); >>> - >>> /* Figure out which device tree the board wants to use */ >>> node = spl_fit_get_image_node(fit, images, FIT_FDT_PROP, 0); >>> if (node < 0) { >>> debug("%s: cannot find FDT node\n", __func__); >>> return node; >>> } >>> - fdt_offset = fdt_getprop_u32(fit, node, "data-offset"); >>> - fdt_len = fdt_getprop_u32(fit, node, "data-size"); >>> /* >>> - * Read the device tree and place it after the image. There may be >>> - * some extra data before it since we can only read entire blocks. >>> - * And also align the destination address to ARCH_DMA_MINALIGN. >>> + * Read the device tree and place it after the image. >>> + * Align the destination address to ARCH_DMA_MINALIGN. >>> */ >>> - dst = (void *)((load + data_size + align_len) & ~align_len); >>> - fdt_offset += base_offset; >>> - sectors = get_aligned_image_size(info, fdt_len, fdt_offset); >>> - src_sector = sector + get_aligned_image_offset(info, fdt_offset); >>> - count = info->read(info, src_sector, sectors, dst); >>> - debug("Aligned fdt read: dst %p, src_sector = %x, sectors %x\n", >>> - dst, src_sector, sectors); >>> - if (count != sectors) >>> - return -EIO; >>> - >>> - /* >>> - * Copy the device tree so that it starts immediately after the >>> image. >>> - * After this we will have the U-Boot image and its device tree >>> ready >>> - * for us to start. >>> - */ >>> - debug("fdt: dst=%p, data_offset=%x, size=%x\n", dst, fdt_offset, >>> - fdt_len); >>> - src = dst + get_aligned_image_overhead(info, fdt_offset); >>> - dst = load_ptr + data_size; >>> - >>> -#ifdef CONFIG_SPL_FIT_IMAGE_POST_PROCESS >>> - board_fit_image_post_process((void **)&src, (size_t *)&fdt_len); >>> -#endif >>> - >>> - memcpy(dst, src, fdt_len); >>> + image_info.load_addr = spl_image->load_addr + spl_image->size; >>> + spl_load_fit_image(info, sector, fit, base_offset, node, >>> &image_info); >>> return 0; >>> } >> > > >