From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tomas Hlavacek <tmshlvck@gmail.com>
Date: Thu, 25 Oct 2012 21:16:15 +0200
Subject: [U-Boot] [PATCH v9] [RFC] Add dmmalloc module for DM.
In-Reply-To: <CALButC+kbn-EVE+1odTw-5ebuOPeM_8fxpteLMRKa=g==OGeTg@mail.gmail.com>
References: <1346069529-27397-1-git-send-email-tmshlvck@gmail.com>
	<1351122590-15068-1-git-send-email-tmshlvck@gmail.com>
	<CALButC+kbn-EVE+1odTw-5ebuOPeM_8fxpteLMRKa=g==OGeTg@mail.gmail.com>
Message-ID: <CAEB7QLC+A8PAJdtiPoGVn8beqUggunD0DSPF1h4TYaQ0nC9JZg@mail.gmail.com>
List-Id: <u-boot.lists.denx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Hello Graeme,

On Thu, Oct 25, 2012 at 3:40 AM, Graeme Russ <graeme.russ@gmail.com> wrote:

>> diff --git a/arch/arm/include/asm/global_data.h b/arch/arm/include/asm/global_data.h
>> index 2b9af93..9045829 100644
>> --- a/arch/arm/include/asm/global_data.h
>> +++ b/arch/arm/include/asm/global_data.h
>> @@ -82,6 +82,9 @@ typedef       struct  global_data {
>>         unsigned long   post_log_res; /* success of POST test */
>>         unsigned long   post_init_f_time; /* When post_init_f started */
>>  #endif
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>> +       void            *early_heap;    /* heap for early_malloc */
>> +#endif
>
> Why not early_heap_header *early_heap; ?
>

It might be.

Actually it is a good point because I am using 3 different ways of
dealing with addresses: 1) struct early_heap header * or struct
early_block_header * - I am using this when I want to access members
of the stucture or compute address past the structure (which is where
the heap or block starts); 2) phys_addr_t - which is plain integer and
I use this for simple computations when I do not want to worry about
pointer arithmetic; 3) void * when I have just plain address,
especially when I want to pass an addres among logical parts of the
mallocator or outside. This may a bit controversial and perhaps I
should replace it by specific strucutre pointers internally.

I am unable to decide: Should I remove struct early_heap_header from
dmmalloc.h making it publicly unavailable or should I rather change
the void * to struct early_heap_header * in the GD structure? What do
you think is better?

>  diff --git a/common/Makefile b/common/Makefile
>> index fdfead7..bfb4d7a 100644
>> --- a/common/Makefile
>> +++ b/common/Makefile
>> @@ -209,6 +209,7 @@ COBJS-y += dlmalloc.o
>>  COBJS-y += image.o
>>  COBJS-y += memsize.o
>>  COBJS-y += stdio.o
>> +COBJS-$(CONFIG_DM) += dmmalloc.o
>
> COBJS-$(CONFIG_SYS_EARLY_MALLOC) += dmmalloc.o ?

Oh yes, now it is redundant to #ifdef CONFIG_SYS_EARLY_MALLOC inside
the dmmalloc.c file. I had a plan to extend the dmmalloc.c file by
relocation routines and then it would make sense. But I will shufle
the code a bit in the v10 anyway and we will see if the #ifdefs can
still be reduced.

>> +
>> +#include <common.h> /* for ROUND_UP */
>> +#include <asm/u-boot.h>
>> +#include <asm/global_data.h> /* for gd_t and gd */
>> +#include <asm/types.h> /* for phys_addr_t and size_addt_t */
>> +
>> +#include <dmmalloc.h>
>> +#include <malloc.h>
>> +
>> +#include <linux/compiler.h>
>> +
>> +DECLARE_GLOBAL_DATA_PTR;
>> +
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>
> If you use COBJS-$(CONFIG_SYS_EARLY_MALLOC) += dmmalloc.o in the Makefile,
> you can drop this #ifdef

Yes, that is redundant now.

>
>> +__weak struct early_heap_header *early_brk(size_t size)
>> +{
>> +       struct early_heap_header *h;
>> +       struct early_block_header *b;
>> +
>> +       if (gd->early_heap != NULL)
>> +               return NULL;
>> +
>> +       h = (struct early_heap_header *)CONFIG_SYS_EARLY_HEAP_ADDR;
>> +       b = (struct early_block_header *)(h + 1);
>
> Hmmm, does this really work? I would have thought:
>
> b = (struct early_block_header *)(h + sizeof(struct early_heap_header));
>
> but I could be mistaken

It seems that it works as it is (at least I wrote bunch of tests and I
inspected resulting heaps and it was all right). I believe that since
h is a pointer to the struct early_heap_header then pointer arithmetic
is in effect and h+1 actually means "next element in the array of
struct early_heap_header". Which is the address past the header that
equals beginning of the heap data block. (?)


>> +int early_malloc_active(void)
>> +{
>> +       return ((gd->flags & GD_FLG_RELOC) != GD_FLG_RELOC);
>> +}
>
> I think we need another flag - GD_FLG_RELOC gets set before the permanent
> heap is initialised, so there is a window of opportunity where things may
> break

Well, as I wrote in the commit message - this is only a temporary
implementation. I suppose I am going to change this when we have more
coarse initialization flags wired into DM (which I believe we are
going to have it anyway). So now I am just working around "forward
dependency" here.

>
>> +
>> +void early_heap_dump(struct early_heap_header *h)
>> +{
>> +       struct early_block_header *b;
>> +
>> +       debug("heap: h=%p, h->size=%d\n", h, h->size);
>> +
>> +       b = (struct early_block_header *)(h+1);
>> +       while ((phys_addr_t)b + sizeof(struct early_block_header)
>> +                       < (phys_addr_t)h + h->size) {
>> +               debug("block: h=%p h->size=%d b=%p b->size=%d b->(used)=%d\n",
>> +                               h, h->size, b, BLOCK_SIZE(b->size),
>> +                               BLOCK_USED(b->size));
>> +               assert(BLOCK_SIZE(b->size) > 0);
>> +               b = (struct early_block_header *)((phys_addr_t)b +
>> +                               sizeof(struct early_block_header) +
>> +                               BLOCK_SIZE(b->size));
>> +       }
>> +       debug("--- heap dump end ---\n");
>> +}
>
> Nice touch, but could we just iterate through all ealry heap chunks starting
> from gd->early_heap?

Or I can have two functions. One heap specific and one for all heaps.
I think both might be useful when somebody needs to debug early_malloc
or memory usage etc. in the early stage. Thanks.

>> +
>> +#ifndef __INCLUDE_DMMALLOC_H
>> +#define __INCLUDE_DMMALLOC_H
>> +
>> +#include <config.h>
>> +#include <linux/stddef.h> /* for size_t */
>> +#include <malloc.h>
>> +
>> +#if (!defined(CONFIG_SYS_EARLY_HEAP_ADDR)) || \
>> +       (!defined(CONFIG_SYS_EARLY_HEAP_SIZE))
>> +#undef CONFIG_SYS_EARLY_MALLOC
>> +#endif /* CONFIG_SYS_EARLY_HEAP_ADDR */
>
> If a board implements early_brk() using non-fixed a address and/or size
> then this is going to cause trouble

Right, I will drop this.

>
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>
> I see no need for the #ifdef
>
>> +struct early_heap_header {
>> +       size_t size;
>> +       void *early_heap_next;
>> +};
>> +
>> +struct early_block_header {
>> +       size_t size;
>> +};
>> +
>
>> +#define BLOCK_USED_FLAG 0x80000000
>> +#define BLOCK_SIZE(size) (size & (~BLOCK_USED_FLAG))
>> +#define BLOCK_USED(size) ((size & BLOCK_USED_FLAG) == BLOCK_USED_FLAG)
>> +#define BLOCK_FREE(size) (!BLOCK_USED(size))
>> +#define BLOCK_SET_FREE(size) BLOCK_SIZE(size)
>> +#define BLOCK_SET_USED(size) (size | BLOCK_USED_FLAG)
>
> Hmmm, I'm not sure what the general policy is about where to put 'stuff'
> that is only used by the implementation and does not 'need' to be made
> publically available. i.e. I don't know if these #defines and the
> early_block_header struct belong here or in the .c file...

You are right, I will put it to the .c.

>
>> +
>> +/**
>> + * early_brk() - obtain address of the heap
>> + * @size:      Minimal size of the new early heap to be allocated.
>> + *
>> + * Function returns a new heap pointer.
>> + *
>> + * Allocate and initialize early_heap at least size bytes long.
>> + * This function can be platform dependent or board dependent but sensible
>> + * default is provided.
>> + */
>> +struct early_heap_header *early_brk(size_t size);
>> +
>> +/**
>> + * early_malloc() - malloc operating on the early_heap(s)
>> + * @size:      Size in bytes.
>> + *
>> + * Function returns a pointer to the allocated block.
>> + */
>> +void *early_malloc(size_t size);
>> +
>> +/**
>> + * early_free() - free operating on the early_heap(s)
>> + * @addr:      Pointer to the allocated block to be released.
>> + */
>> +void early_free(void *addr);
>> +
>> +/**
>> + * early_malloc_active() - indicate if the early mallocator is active
>> + *
>> + * Function returns true when the early_malloc and early_free are used and
>> + * false otherwise.
>> + */
>> +int early_malloc_active(void);
>> +
>> +/**
>> + * early_heap_dump() - print blocks contained in an early_heap
>> + * @h:         Address of the early heap.
>> + */
>> +void early_heap_dump(struct early_heap_header *h);
>> +
>> +#endif /* CONFIG_SYS_EARLY_MALLOC */
>> +
>> +#ifdef CONFIG_DM
>> +
>> +/*
>> + * DM versions of malloc* functions. In early init it calls early_malloc.
>> + * It wraps around normal malloc* functions afterwards.
>> + */
>
> I don't think these _need_ to be inline functions - The compiler should be
> smart enough no (non-)inline them as appropriate.

Yes. Now I am thinking about pulling all this into the .c file and
making the functions non-static. Actually, this "static inline"
evolved from really simple "if(...) early_malloc() else malloc()", but
now it is not always straight-forward and even more future extensions
are expected to these functions due to relocation-related code.

>
>> +
>> +/**
>> + * dmmalloc() - malloc working seamlessly in early as well as in RAM stages
>> + * @size:      Size of the block to be allocated.
>> + *
>> + * Function returns an address of the newly allocated block when successful
>> + * or NULL otherwise.
>> + */
>> +static inline void *dmmalloc(size_t size)
>> +{
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>> +       if (early_malloc_active())
>> +               return early_malloc(size);
>> +#endif /* CONFIG_SYS_EARLY_MALLOC */
>> +       return malloc(size);
>> +}
>> +
>> +/**
>> + * dmfree() - free working seamlessly in early as well as in RAM stages
>> + * @ptr:       Pointer to the allocated block to be released.
>> + */
>> +static inline void dmfree(void *ptr)
>> +{
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>> +       if (early_malloc_active()) {
>> +               early_free(ptr);
>> +               return;
>> +       }
>> +#endif /* CONFIG_SYS_EARLY_MALLOC */
>> +       free(ptr);
>> +}
>> +
>> +/**
>> + * dmcalloc() - calloc working seamlessly in early as well as in RAM stages
>> + * @n:         Number of elements to be allocated.
>> + * @elem_size: Size of elements to be allocated.
>> + *
>> + * Function returns a pointer to newly the allocated area (n*elem_size) long.
>> + */
>> +static inline void *dmcalloc(size_t n, size_t elem_size)
>> +{
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>> +       char *addr;
>> +       int size = elem_size * n;
>> +
>> +       if (early_malloc_active()) {
>> +               addr = early_malloc(size);
>> +               memset(addr, 0, size);
>> +               return addr;
>> +       }
>> +#endif /* CONFIG_SYS_EARLY_MALLOC */
>> +       return calloc(n, elem_size);
>> +}
>> +
>> +/**
>> + * dmrealloc() - realloc working seamlessly in early as well as in RAM stages
>> + * @oldaddr:   Pointer to the old memory block.
>> + * @bytes:     New size to of the block to be reallocated.
>> + *
>> + * Function returns an address of the newly allocated block when successful
>> + * or NULL otherwise.
>> + *
>> + * Data are copied from the block specified by oldaddr to the new block.
>> + */
>> +static inline void *dmrealloc(void *oldaddr, size_t bytes)
>> +{
>> +#ifdef CONFIG_SYS_EARLY_MALLOC
>> +       char *addr;
>> +       struct early_block_header *h;
>> +       if (early_malloc_active()) {
>> +               addr = dmmalloc(bytes);
>> +               if (addr == NULL)
>> +                       return NULL;
>> +
>> +               h = BLOCK_HEADER(oldaddr);
>> +               if (BLOCK_FREE(h->size))
>> +                       return NULL;
>> +
>> +               if (bytes > BLOCK_SIZE(h->size))
>> +                       bytes = BLOCK_SIZE(h->size);
>> +
>> +               memcpy(addr, oldaddr, bytes);
>> +               dmfree(oldaddr);
>> +               return addr;
>> +       }
>> +#endif /* CONFIG_SYS_EARLY_MALLOC */
>> +       return realloc(oldaddr, bytes);
>> +}
>
> Hmmm, we have a very intersting corner case to deal with. What if we use
> dmrealloc() to move allocated data from the early malloc pool to the final
> malloc pool?
>
> At some point, we need to assume the developer is only ever going to pass
> early malloc'd memory to dmrealloc()

Good point! The current code would do the job assuming that the
early_malloc can still access the proper early_heap (or a copy, but in
that case some additional magic is needed) and the real malloc is
already initialized.

As you can see I am still sticking with the double-copy method. It is
maybe due to lack of insight. But the discussion here was not
absolutely conclusive last time. I have even some experimental code
(not ready for submitting at all) for double copy method but I would
prefer to discuss it separately when the early_malloc() is done.

>
> The other option is to use the gd->flags...
>
> Define two flags - something like:
>
> GD_FLG_HEAP_INIT -> Final heap has been initialised
> GD_FLG_EH_DONE    -> free(), realloc() refer to final heap
>
> (I don't like the names, I'm just not up to thinking of anything better)
>
> This way we can use dmalloc() prior to the heap being initialised and then
> set GD_FLG_HEAP_INIT. Once GD_FLG_HEAP_INIT has been set, do a bunch of
> dmrealloc() calls to essentially move the data into the permanent heap (of
> course pointers int the structures need to be fixed up by the drivers) and
> finally set GD_FLG_EH_DONE (and call early_heap_dump)
>
> One problem I see is what happens of you call malloc() and free() on the
> same block between the setting of GD_FLG_HEAP_INIT and GD_FLG_EH_DONE? The
> code will try to reference the block as if it is in the early heap, but it
> won't be.
>
> One solution is, on detection of GD_FLG_HEAP_INIT being set, dmfree() and
> dmrealloc() could search the early heap chunks instead of just assuming
> that the referenced block is in the early heap (if GD_FLG_HEAP_INIT has not
> been set, it will be safe to assume the memory is on the early heap)

Sure we will have that flags. But I think we can use them as well for
switching from DM driver instance list to tree for example. Or other
way around: I can use DM flags for early_malloc. Therefore I would
like to synchronize with DM cores for PCI and another low-level things
which are certainly going to start in early stage. It would be best to
use the same flags and switch on/off early_malloc based on DM internal
state.

Best regards,
Tomas