* [U-Boot] early_malloc outline
@ 2012-07-31 15:30 Tomas Hlavacek
  2012-07-31 19:52 ` Wolfgang Denk
  2012-08-01  2:57 ` Graeme Russ
  0 siblings, 2 replies; 14+ messages in thread
From: Tomas Hlavacek @ 2012-07-31 15:30 UTC (permalink / raw)
  To: u-boot

Hello all!

On the u-boot-dm mailing list we had a discussion about implementing
early_malloc (not only) for the U-Boot Driver Model. The intention is to
have a simple malloc() function in the early stage of init, before
relocation and before RAM is up and running. There was an experimental
patch that added the early heap to the GD structure.

In the following discussion Graeme Russ pointed out that there is a
pre-console buffer which does a similar thing, and that we should not
bloat GD by adding the early heap (which is going to be a few hundred
bytes long) to it. He suggested creating an independent area locked in
cache lines for the early heap, so that GD and the early heap can be
split into several non-contiguous blocks.

Pavel Hermann said that we would have to copy the data twice (first
before the RAM is up and running, while caches are still off, and a
second time after RAM and dlmalloc are initialized).

Marek Vasut said (earlier in the discussion) that we do not need to
care about a few hundred bytes, especially after copying them into
RAM. Wolfgang Denk objected. He also pointed out that there are
other places where early memory may be allocated (on-chip memory,
external SRAM and others) and that these should be kept in mind,
including their existing size restrictions.

(I apologize for any misinterpretation, and I am sorry that we do not
have a link to a u-boot-dm mailing list archive or GMANE. I can
forward the relevant pieces of the discussion if needed.)

We would like to hear opinions on the early_malloc idea to find a
broadly acceptable solution.

Can/should we use some existing mechanism? Or would it be considered a
viable option to choose a different start address for the early heap,
use it (in an architecture-specific way) and keep the pointer to its
start in GD, then copy the early heap to memory before caches are
flushed and, in the case of DM, copy the data again from the early heap
to new destinations obtained through malloc() once it is
initialized?

Tomas

-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-07-31 15:30 [U-Boot] early_malloc outline Tomas Hlavacek
@ 2012-07-31 19:52 ` Wolfgang Denk
  2012-08-01 15:19   ` Tomas Hlavacek
  2012-08-01  2:57 ` Graeme Russ
  1 sibling, 1 reply; 14+ messages in thread
From: Wolfgang Denk @ 2012-07-31 19:52 UTC (permalink / raw)
  To: u-boot

Dear Tomas Hlavacek,

In message <CAEB7QLChVGARx3cpdE4tGQ01BeWoZr6ANHqnRJp02P6dZnNT5g@mail.gmail.com> you wrote:
> 
> Can/should we use some existing mechanism? Or would it be considered a
> viable option to choose different beginning address for early heap,
> use it (in architecture-specific way) and keep the pointer to the
> beginning in GD. Then copy the early heap to memory before caches are
> flushed and in case of DM copy again data from early heap to new
> destinations that has been obtained through malloc() when it is
> initialized?

It is difficult (or actually impossible) to answer this, if you do not
explain which concept you are talking about here, or why two copy
operations would be needed, what "in case of DM" means (and which
other cases exist), or how you intend to handle the problem of
changing addresses (and thus pointers becoming incorrect) for each of
such copy operations.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
"Plan to throw one away.  You will anyway."
                              - Fred Brooks, "The Mythical Man Month"


* [U-Boot] early_malloc outline
  2012-07-31 15:30 [U-Boot] early_malloc outline Tomas Hlavacek
  2012-07-31 19:52 ` Wolfgang Denk
@ 2012-08-01  2:57 ` Graeme Russ
  2012-08-01 14:47   ` Tomas Hlavacek
  1 sibling, 1 reply; 14+ messages in thread
From: Graeme Russ @ 2012-08-01  2:57 UTC (permalink / raw)
  To: u-boot

Hi Tomas,

On 08/01/2012 01:30 AM, Tomas Hlavacek wrote:
> Hello all!
> 
> In u-boot-dm mailinglist we had a discussion about implementation of
> early_malloc (not only) for U-Boot Driver Model. The intention is to
> have a simple malloc() function in the early stage of init before
> relocation and before RAM is up and running. There was an experimental
> patch that added the early heap to GD structure.
> 
> In the following discussion Graeme Russ pointed out that there is a
> pre-console buffer which does the similar thing. And we should not
> explode GD by adding the early heap (which is going to be few hundreds
> of bytes long) into it. He suggested to create an independent area
> locked in cache lines for early heap in order to allow split GD and
> early heap into more non-contiguous blocks.

More specifically, we must not assume that we have a single, contiguous
region of memory capable of holding the pre-relocation early stack,
pre-relocation global data, pre-console buffer, and early (pre-relocation)
heap.

Forget about 'locked cache lines' - that is only important when considering
when to call enable_caches(). To cover the generic case (which covers all
architectures) we simply need to keep in mind that enable_caches() can only
be called _after_ the early heap has been moved to the final (SDRAM) heap.
Therefore, we must keep in mind that any code which processes the early
heap into the final heap is going to be performance-hindered.

> Pavel Hermann said that we would have to copy data twice (first before
> the RAM is up and running and caches are still off and second after
> RAM and dlmalloc is initialized).

I think I understand why now - the idea is to blind-copy the early heap
into SDRAM, enable caches and then process the early heap into the final
heap. This _may_ provide a performance bonus in _some_ (most) cases.

> Marek Vasut said (earlier in the discussion) that we do not need to
> care about few hundred of bytes, especially after copying them into
> RAM. And Wolfgang Denk resisted. He also pointed out that there are

And so do I - it sets a very bad precedent and it is simply not how
an embedded developer should think. This is not 1980's Wall Street - Greed is
NOT good.

> other possibilities where early memory may be allocated -
> on-chip-memory, external SRAM and others and these should be kept in
> mind including existing size restrictions.
> 
> (I apologize for eventual misinterpretation and I am sorry that we do
> not have a link to the u-boot-dm mailinglist archive nor GMANE. But I
> can eventually Fwd. needed pieces of the discussion.)

OK, let's forget about Driver Model here - it is no longer relevant to the
discussion at hand.

> We would like to hear opinions on the early_malloc idea to find a
> broadly acceptable solution.
> 
> Can/should we use some existing mechanism? Or would it be considered a
> viable option to choose different beginning address for early heap,
> use it (in architecture-specific way) and keep the pointer to the
> beginning in GD. Then copy the early heap to memory before caches are
> flushed and in case of DM copy again data from early heap to new
> destinations that has been obtained through malloc() when it is
> initialized?

OK, I'm going to go out on a long and thin limb here (i.e. look out for
daft ideas) and say that all we need before relocation and final heap
initialisation is an early stack and an early heap (no global data or no
pre-relocation buffer as they are currently implemented). What! I hear you
say :)

Well, why can't we put global data and pre-relocation buffer _on_ the early
heap? Yes, it will be a bit tricky as there is some very early code (in
assembler) that reads/writes to/from GD, but if GD is placed at the top of
the heap, its members can still be directly referenced.

And now we can have some fun with an early version of brk() / sbrk()
whereby if early malloc fails, a call to early_sbrk() will give us more
early heap which _may_ be in a memory region which is non-contiguous with
the existing early heap.

E.g.:

+-----------------------+ \
|                       | |
|      Early Stack      | |
|                       | |
+-----------------------+ |
|     Early Heap A      | |
| +-------------------+ | |
| | Early Global Data | | |
| +-------------------+ | > Locked Cache Lines
| |    Early Data A   | | |
| +-------------------+ | |
| |    Early Data B   | | |
| +-------------------+ | |
| |    Early Data C   | | |
+ +-------------------+ + |
|                       | |
|      Unused Bytes     | |
|                       | |
+-----------------------+ /

+-----------------------+ \
|     Early Heap B      | |
| +-------------------+ | |
| |    Early Data D   | | |
| +-------------------+ | |
| |    Early Data E   | | |
| +-------------------+ | |
| |    Early Data F   | | > SRAM
| +-------------------+ | |
| |    Early Data G   | | |
+ +-------------------+ + |
|                       | |
| Free Early Heap Space | |
|                       | |
+-----------------------+ /

Now what we can have is an 'early heap info' structure at the start of each
early heap space:

struct early_heap_info {
  void *next_free_block;
  uint *free_bytes;
  void *next_early_heap;
};

early_malloc() would traverse the early_heap_info list until it found an
early heap with enough space to fulfil the request or, if the last one has
not enough space and next_early_heap == NULL, call early_sbrk() to
attempt to create more early heap. early_sbrk() is platform (and even
board) specific and is intended to release pre-SDRAM memory in the most
appropriate manner possible (fastest first, for example, or maybe biggest
first if the cost of setting up the fastest is too much).
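
A minimal sketch of that traversal, under stated assumptions: free_bytes
is held as a plain byte count rather than the pointer shown in the struct
above, next_early_heap is typed, blocks are rounded up to 8 bytes, and
early_heap_init() plus the fake_sram-backed early_sbrk() are hypothetical
stand-ins for whatever a real platform hook would do. It is not existing
U-Boot code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: round every size up to 8 bytes to keep blocks aligned. */
#define EARLY_ALIGN(n)	(((n) + 7u) & ~(size_t)7u)

struct early_heap_info {
	void *next_free_block;			/* first unused byte */
	size_t free_bytes;			/* bytes left in this heap */
	struct early_heap_info *next_early_heap;/* next heap or NULL */
};

#define EARLY_HDR	EARLY_ALIGN(sizeof(struct early_heap_info))

/* Hypothetical helper: turn a raw region into an empty early heap. */
struct early_heap_info *early_heap_init(void *region, size_t size)
{
	struct early_heap_info *h = region;

	if (region == NULL || size < EARLY_HDR)
		return NULL;
	h->next_free_block = (char *)region + EARLY_HDR;
	h->free_bytes = size - EARLY_HDR;
	h->next_early_heap = NULL;
	return h;
}

/* Stand-in for the board hook: hands out one extra 256-byte region once. */
static union {
	struct early_heap_info hdr;
	unsigned char bytes[256];
} fake_sram;
static int fake_sram_taken;

struct early_heap_info *early_sbrk(size_t min_bytes)
{
	if (fake_sram_taken || min_bytes > sizeof(fake_sram) - EARLY_HDR)
		return NULL;
	fake_sram_taken = 1;
	return early_heap_init(&fake_sram, sizeof(fake_sram));
}

void *early_malloc(struct early_heap_info *heap, size_t size)
{
	struct early_heap_info *h = heap;
	void *p;

	if (h == NULL || size == 0 || size > SIZE_MAX - 7)
		return NULL;
	size = EARLY_ALIGN(size);

	/* Walk the list until a heap with enough room turns up. */
	while (h->free_bytes < size) {
		if (h->next_early_heap == NULL) {
			/* Last heap is too small: ask the platform for more. */
			h->next_early_heap = early_sbrk(size);
			if (h->next_early_heap == NULL)
				return NULL;
		}
		h = h->next_early_heap;
	}

	/* Simple bump allocation; there is no early free(). */
	p = h->next_free_block;
	h->next_free_block = (char *)p + size;
	h->free_bytes -= size;
	return p;
}
```

Note that the sketch never frees anything; a bump allocator like this
keeps the header small, at the price of not reusing early allocations.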

And what about global data - I'm wondering how much of it is actually used
across all boards of a particular architecture. I'm thinking that, after
relocation, some contents of global data could be cherry-picked and some
not copied at all. Maybe it could be split into 'Global Data that lives
across relocation' and 'Global Data only used pre-relocation'. Examples of
the latter may include:

  reloc_off - After relocation, is this ever referenced anymore?
  env_buf - Isn't this for pre-relocation anyway? Could it be malloc'd?
  x86 has a few I know are only referenced during the transit through
  relocation and are then forgotten about.

If we move global data onto the heap, does this simplify things or does it
become even more complex?

And as for the question of fixing up pointers in the structures allocated on
the early heap, that is entirely up to the user of the early heap as only
they know what the contents of the structures mean.

Regards,

Graeme


* [U-Boot] early_malloc outline
  2012-08-01  2:57 ` Graeme Russ
@ 2012-08-01 14:47   ` Tomas Hlavacek
  0 siblings, 0 replies; 14+ messages in thread
From: Tomas Hlavacek @ 2012-08-01 14:47 UTC (permalink / raw)
  To: u-boot

Hi Graeme,

On Wed, Aug 1, 2012 at 4:57 AM, Graeme Russ <graeme.russ@gmail.com> wrote:

> More specifically, we must not assume that we have a single, contiguous
> region of memory capable of holding pre-relocations early stack,
> pre-relocation global data, pre-console buffer, and early (pre-relocation)
> heap.

Agreed.

>
> Forget about 'locked cache lines' - That is only important when considering
> when to call enable_caches(). The cover the generic case (which covers all
> architectures) we simply need to keep in mind that enable_caches() can only
> be called _after_ the early heap has been moved to the final (SDRAM) heap.
> Therefore, we must keep in mind that any code which manipulated the early
> heap into the final heap is going to be performance-hindered.

Yes. And actually there might be platforms or boards that do not need
early_malloc for DM and since there are no other users yet we might
want to switch it off completely not to waste memory and CPU cycles on
initialization etc.

>
>> Pavel Hermann said that we would have to copy data twice (first before
>> the RAM is up and running and caches are still off and second after
>> RAM and dlmalloc is initialized).
>
> I think I understand why now - The idea is to blind-copy the early-heap
> into SDRAM, enable caches and then process the early heap into final heap.
> This _may_ provide a performance bonus on _some_ (most) cases

Exactly. But we are a bit afraid of this copy-twice process. In the
time between the start of relocation and the end of the second copy, DM
would be inactive (the DM tree would be unavailable, so it would be
impossible to use drivers through DM, etc.).

> OK, I'm going to go out on a long and thin limb here (i.e. look out for
> daft ideas) and say that all we need before relocation and final heap
> initialisation is an early stack and an early heap (no global data or no
> pre-relocation buffer as they are currently implemented). What! I hear you
> say :)
>
> Well, why can't we put global data and pre-relocation buffer _on_ the early
> heap? Yes, it will be a bit tricky as there is some very early code (in
> assembler) that reads/writes to/from GD, but if GD is placed at the top of
> the heap, it's members can still be directly referenced.

Well, since I am new to U-Boot development and I have not yet managed
to read and understand line by line (or better, instruction by
instruction) the early init code for any architecture except ARM, I am
not able to appreciate or understand all the implications, nor to
implement this idea. But I can prepare the early_mallocator for this by
using your frame header struct early_heap_info and by implementing heap
list traversal in early_malloc().

(But it may be considered dead code for now, because without a
working early_sbrk() it would add extra complexity without any
benefit. And I am certainly not the right person to attempt
such deep changes to board_init_f, moving GD to early_heap
etc.)

> And as for the question of fixing up pointer in the structures allocated on
> the early heap, that is entirely up to the user of the early heap as only
> they know what the contents of the structures mean.

Exactly. And I will blind-copy only used early_heap, not the whole
early_heap in order not to waste space and CPU cycles.


Thanks,
Tomas



-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-07-31 19:52 ` Wolfgang Denk
@ 2012-08-01 15:19   ` Tomas Hlavacek
  2012-08-01 19:09     ` Wolfgang Denk
  0 siblings, 1 reply; 14+ messages in thread
From: Tomas Hlavacek @ 2012-08-01 15:19 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Tue, Jul 31, 2012 at 9:52 PM, Wolfgang Denk <wd@denx.de> wrote:

>> Can/should we use some existing mechanism? Or would it be considered a
>> viable option to choose different beginning address for early heap,
>> use it (in architecture-specific way) and keep the pointer to the
>> beginning in GD. Then copy the early heap to memory before caches are
>> flushed and in case of DM copy again data from early heap to new
>> destinations that has been obtained through malloc() when it is
>> initialized?
>
> It is difficult (or actually impossible) to answer this, if you do not
> explain which concept you are talking about here, or why two copy
> operations would be needed, what "in case of DM" means (and which
> other cases exist), or how you intend to handle the problem of
> changing addresses (and thus pointers becoming incorrect) for each of
> such copy operations.

Graeme advised me not to make early_malloc() a single-purpose thing
for DM (i.e. not to implement DM tree relocation or special support
for it in the early_malloc routines).

AFAIK the other people working on DM want to create the DM tree using
early_malloc inside board_init_f(). The tree is going to have a root
and, on some boards, a few extra elements (2 or 3 in this phase), and
each object has 16 bytes. Then they want to have this tree accessible
(or at least a copy of the tree) in board_init_r(). They want to
traverse the tree (by recomputing pointers) at some point in
board_init_r(), allocate new tree objects using dlmalloc and copy the
data into the new tree.

The concept I am thinking about is reserving space for the early heap
right after GD by the same platform-specific means (i.e. subtracting
CONFIG_SYS_INIT_SP_ADDR). Then I would like to reserve space in RAM
equal to the used size of early_heap before relocation and memcpy the
existing early_heap there (the same way GD is copied). We would then
have a copy of the used early_heap in RAM, and we can recompute
pointers to traverse the tree in board_init_r().
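
The copy-and-recompute step could look roughly like the following
sketch. struct node, translate(), fixup_tree() and relocate_tree() are
hypothetical names used only for illustration; the real DM tree layout
is not specified here, and the sketch assumes every pointer stored in
the tree points into the same early heap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical early tree node; the real DM object layout may differ. */
struct node {
	struct node *left;
	struct node *right;
	int id;
};

/* Map a pointer into the old heap onto the same offset in the copy. */
void *translate(const void *old_base, void *new_base, const void *ptr)
{
	if (ptr == NULL)
		return NULL;
	return (char *)new_base + ((const char *)ptr - (const char *)old_base);
}

/* Rewrite the child pointers of an already copied subtree in place. */
void fixup_tree(struct node *n, const void *old_base, void *new_base)
{
	if (n == NULL)
		return;
	n->left = translate(old_base, new_base, n->left);
	n->right = translate(old_base, new_base, n->right);
	fixup_tree(n->left, old_base, new_base);
	fixup_tree(n->right, old_base, new_base);
}

/* Blind-copy the used part of the heap, then return the relocated root. */
struct node *relocate_tree(const void *old_base, void *new_base,
			   size_t used, const struct node *old_root)
{
	struct node *root;

	memcpy(new_base, old_base, used);
	root = translate(old_base, new_base, old_root);
	fixup_tree(root, old_base, new_base);
	return root;
}
```

Only pointers reachable from the root are rewritten; anything else that
points into the old heap is left untouched by this approach.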

Tomas

-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-08-01 15:19   ` Tomas Hlavacek
@ 2012-08-01 19:09     ` Wolfgang Denk
  2012-08-01 22:12       ` Tomas Hlavacek
  0 siblings, 1 reply; 14+ messages in thread
From: Wolfgang Denk @ 2012-08-01 19:09 UTC (permalink / raw)
  To: u-boot

Dear Tomas,

In message <CAEB7QLCieZFKpam2TumS3Gqb=5McLVuFHm_PXbZfyVPDNWHNZw@mail.gmail.com> you wrote:
> 
> > It is difficult (or actually impossible) to answer this, if you do not
> > explain which concept you are talking about here, or why two copy
> > operations would be needed, what "in case of DM" means (and which
> > other cases exist), or how you intend to handle the problem of
> > changing addresses (and thus pointers becoming incorrect) for each of
> > such copy operations.
...
> Other guys working on DM wants AFAIK to create DM tree using
> early_malloc inside board_init_f(). The tree is going to have root and
> on some boards few extra elements, like 2 or 3 in this phase and each
> object has 16 bytes. Then they want to have this tree accessible (or
> at least a copy of the tree) in board_init_r(). They want to traverse
> the tree (by recomputing pointers) at some point in board_init_r(),
> allocate new tree objects using dlmalloc and copy the data into the
> new tree.

Hm... I have to admit that I am not really happy about such an
"explanation".  The statement that "other guys" want something is not
exactly an explanation of a concept that I can understand, and without
being able to understand it, I don't buy it.

Please do not assume that everybody has followed all (or any) of your
previous discussions on the DM list.  Please assume I have zero prior
knowledge about that stuff (which is about correct), and explain what
the concept is.  And please keep in mind that any time you write
"I/they want", I will wonder "why?", and probably we lose time in
another iteration because I will actually ask this question.

> The concept I am thinking about is reserving space for early heap
> right after GD by same platform specific means (i.e. subtracting
> CONFIG_SYS_INIT_SP_ADDR). Then I would like to reserve space in RAM

You are mixing design and implementation here.  Where exactly this
storage space is located, and how you create it etc., are implementation
details.  We should ignore this here.

> equal to used size of early_heap before relocation and memcpy the
> existing early_heap there (the same way GD are copied). Therefore we
> would have a copy of used early_heap in RAM and we can recompute
> pointers to traverse the tree in board_init_r().

But why would 2 copies be needed?  I understand that once regular
malloc() becomes available, you want to make sure all allocations are
maintained using this mechanism.  But I already wonder how you are
going to implement this - you will have to update all pointers.  How
will you find out where these might be?

Assume something like:

item *foo(...)
{
	static item *foo_local = malloc(size1);
	...
	return foo_local;
}

item *bar(..., item **ptr)
{
	static item *bar_local = malloc(size2);
	...
	*ptr = bar_local;
	return bar_local;
}

void some_function(...)
{
	item *x, *y;
	...
	x = bar(..., &y);
	baz(y);
	...
}

How will you later know which variables store the values from the early
malloc calls, and how will you access these for proper relocation?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
What is mind?  No matter.  What is matter?  Never mind.
                                      -- Thomas Hewitt Key, 1799-1875


* [U-Boot] early_malloc outline
  2012-08-01 19:09     ` Wolfgang Denk
@ 2012-08-01 22:12       ` Tomas Hlavacek
  2012-08-07 21:07         ` Wolfgang Denk
  0 siblings, 1 reply; 14+ messages in thread
From: Tomas Hlavacek @ 2012-08-01 22:12 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Wed, Aug 1, 2012 at 9:09 PM, Wolfgang Denk <wd@denx.de> wrote:

> Hm... I have to admit that I am not really happy about such an
> "explanation".  The statement that "other guys" want something is not
> exactly an explanation of a concept that I can understand, and without
> being able to understand it, I don't buy it.

I wanted to say that this is the outcome of an informal discussion with
Marek Vasut, Pavel Hermann and Viktor Krivak, who are working on
different parts of DM (and perhaps I should say that implementing a
driver model for U-Boot is our school project, which is not relevant
to the current discussion, but it may explain why we cooperate
closely on DM among ourselves). Statements like "I/we want"
were not announcements or final decisions but rather
wishes or ideas (mine or from others). Although we have an elaborate
outline of DM and its supporting subsystems, it is still subject to
change.

> But why would 2 copies be needed?  I understand then once regular
> malloc() becomes available, you want to make sure all allocations are
> maintained using this mechanism.  But I already wonder how you are
> going to implement this - you will have to update all pointers.  How
> will you find out where these might be?
>
> Assume something like:
>
> item *foo(...)
> {
>         static item *foo_local = malloc(size1);
>         ...
>         return foo_local;
> }
>
> item *bar(..., item **ptr)
> {
>         static item *bar_local = malloc(size2);
>         ...
>         *ptr = bar_local;
>         return bar_local;
> }
>
> void some_function(...)
> {
>         item *x, *y;
>         ...
>         x = bar(..., &y);
>         baz(y);
>         ...
> }
>
> How will you later know which variables store the values from the early
> malloc calls, and how will you access these for proper relocation?
>

Each early heap (assuming we can have more than one contiguous early
heap, as Graeme suggested) has a pointer to its beginning (which is
valid only before relocation and before caches are enabled). I can
memcpy() the used part of the early heap somewhere else and preserve
the original heap beginning pointer (I have in mind preserving
the pointers in GD; it is an implementation detail indeed, but I would
rather note this in order to make sure that I am not working under a
false assumption about the possibility of preserving pointers like
that). Then I can compute copied_heap_address = original_address +
(copied_heap_begin - original_heap_begin).

I have to admit that I am not familiar enough with the plans for the
early DM tree to know where we are going to hold the tree root pointer.
I think that it could possibly be placed in GD. To your example:

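/* Offset of the pointer within the old heap, applied to the new heap. */
#define TRANSLATE_ADDR(old_heap, new_heap, pointer) \
	((void *)((char *)(new_heap) + ((char *)(pointer) - (char *)(old_heap))))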

void some_function(...)
{
        item *x, *y;
        ...
        x = bar(..., &y);
        baz(y);

	gd->x = x;
	gd->y = y;

	...

	memcpy(NEW_HEAP_ADDR, gd->old_heap_addr, heap_used_bytes);
}

void later(...)
{
	item *x = TRANSLATE_ADDR(gd->old_heap_addr, NEW_HEAP_ADDR, gd->x);
	item *y = TRANSLATE_ADDR(gd->old_heap_addr, NEW_HEAP_ADDR, gd->y);
}


Perhaps I can try to rewrite it into more real-life example:

void board_init_f(...)
{
	...
	struct dm_tree_node *root = malloc(sizeof(*root));
	/* early_malloc() has been used in fact. */
	root->left_descendant = malloc(sizeof(*root->left_descendant));
	gd->dm_tree_root = root;
	...

	memcpy(NEW_HEAP_ADDR,gd->early_heap,gd->early_heap_used_bytes);

	relocate_code(...);
}

void board_init_r(gd_t *id, ulong dest_addr)
{
	...
	struct dm_tree_node *root_new = TRANSLATE_ADDR(gd->early_heap,
		NEW_HEAP_ADDR, gd->dm_tree_root);
	root_new->left_descendant = TRANSLATE_ADDR(gd->early_heap,
		NEW_HEAP_ADDR, root_new->left_descendant);

	/* And I can rectify all the pointers in the tree just like that.
	   It should not be expensive because the tree is really
	   small at this point - 5 items max.
	   The problem is that root_new does not point to memory
	   obtained from dlmalloc. How could we possibly solve this? Our current
	   idea is to perform one extra round of malloc() and copying of the tree...
	*/
	...
}

Thank you for your help,
Tomas

-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-08-01 22:12       ` Tomas Hlavacek
@ 2012-08-07 21:07         ` Wolfgang Denk
  2012-08-08 11:47           ` Tomas Hlavacek
  0 siblings, 1 reply; 14+ messages in thread
From: Wolfgang Denk @ 2012-08-07 21:07 UTC (permalink / raw)
  To: u-boot

Dear Tomas,

In message <CAEB7QLAnaNoTLQFzPaJ2NMgnDPv6QiNCzNTVBz6fhg2Yv99ctg@mail.gmail.com> you wrote:
> Dear Wolfgang,
> 
> On Wed, Aug 1, 2012 at 9:09 PM, Wolfgang Denk <wd@denx.de> wrote:
> 
> > Hm... I have to admit that I am not really happy about such an
> > "explanation".  The statement that "other guys" want something is not
> > exactly an explanation of a concept that I can understand, and without
> > being able to understand it, I don't buy it.
> 
> I wanted to say that this is outcome of an informal discussion with
> Marek Vasut, Pavel Hermann and Viktor Krivak, who are working on
> different parts of DM (and perhaps I should say that it is our school
> project to implement driver model for U-Boot, which is not relevant
> information to the current discussion, but it may explain why we
> closely cooperate on DM among ourselves). Statements like "I/we want"
> were not any notices nor decisions nor whatever final but rather
> wishes or ideas (mine or from others). Although we have elaborate
> outline of DM and supporting subsystems it is still subject to
> changes.

As long as you don't actually EXPLAIN the rationale behind any
suggestions or decisions, it is impossible to comment (otherwise we
run the risk of repeating all your previous discussions, and this would
be just a waste of time).

So please explain your concepts, and the rationale.

> Each early heap (assuming we can have more than one contiguous early
> heaps as Graeme suggested) has a pointer to the beginning (and it is
> valid only before relocation and before caches are enabled). I can
> memcpy() the used part of the early heap somewhere else and preserve
> the original heap beginning pointer (I have in mind preservation of
> the pointers in GD; it is implementation detail indeed, but I would
> rather note this in order make sure that I am not working with false
> assumption about possibility of preserving pointers like that). Then I
> can compute copied_heap_address = original_address +
> (copied_heap_begin - original_heap_begin).

Sure you can copy the content.

But "relocation" means that you have to add the address difference
(aka relocation offset) to _all_ pointers pointing into this area.
And there is no way to keep track of _all_ such pointers.


I am convinced that you _cannot_ reliably relocate the malloc arena if
you use the standard malloc/calloc/free interface for early
allocation.
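
As a hedged illustration of the problem (not real U-Boot code): in the
sketch below, old_arena, new_arena, arena_alloc_int(), get_counter(),
relocate_arena() and points_into() are all hypothetical names. A caller
caches its allocation in a function-local static pointer; copying the
arena moves the data, but nothing can find and update that pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_INTS 16

static int old_arena[ARENA_INTS];	/* stands in for the early heap */
static int new_arena[ARENA_INTS];	/* stands in for its copy in RAM */
static size_t used_ints;

/* Trivial allocator handing out one int at a time from old_arena. */
static int *arena_alloc_int(void)
{
	return used_ints < ARENA_INTS ? &old_arena[used_ints++] : NULL;
}

/* A caller that hides its allocation in a static the allocator cannot see. */
int *get_counter(void)
{
	static int *counter;

	if (counter == NULL) {
		counter = arena_alloc_int();
		if (counter != NULL)
			*counter = 0;
	}
	return counter;
}

/* "Relocation": the bytes are copied, but no pointer is adjusted. */
void relocate_arena(void)
{
	memcpy(new_arena, old_arena, used_ints * sizeof(int));
}

/* Nonzero if p lies inside the given arena. */
int points_into(const int *arena, const void *p)
{
	uintptr_t lo = (uintptr_t)arena;
	uintptr_t hi = (uintptr_t)(arena + ARENA_INTS);
	uintptr_t v = (uintptr_t)p;

	return v >= lo && v < hi;
}
```

After relocate_arena() the value is present in new_arena, yet
get_counter() keeps returning an address inside old_arena.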

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
No problem is insoluble.
	-- Dr. Janet Wallace, "The Deadly Years", stardate 3479.4


* [U-Boot] early_malloc outline
  2012-08-07 21:07         ` Wolfgang Denk
@ 2012-08-08 11:47           ` Tomas Hlavacek
  2012-08-08 19:32             ` Wolfgang Denk
  0 siblings, 1 reply; 14+ messages in thread
From: Tomas Hlavacek @ 2012-08-08 11:47 UTC (permalink / raw)
  To: u-boot

Dear Wolfgang,

On Tue, Aug 7, 2012 at 11:07 PM, Wolfgang Denk <wd@denx.de> wrote:

> But "relocation" means that you have to add the address difference
> (aka relocation offset) to _all_ pointers pointing into this area.
> And there is no way to keep track of _all_ such pointers.

Sure, there is no way to relocate all data in the early_malloc
arena generically (= for all data present, regardless of origin
and use). Nor do I want to do so.

It comes down to the reason for implementing early_malloc: the reason
is to provide minimalistic heap allocation for the driver model tree in
the early init phase. It is essential to start building the tree in
the early phase (= in board_init_f) in order to have drivers bound to
their cores before first use and to be able to call particular drivers
through DM driver cores from the very beginning. (Driver cores are
structs that are DM abstractions of particular drivers; these structs
are held in the DM tree.) The DM tree also needs to be built at run
time, otherwise a DM tree generated at compile time would be board
specific, which would prevent a future unification of the resulting
ROMs for more than one board.

The reason for having a DM tree is that driver cores use identifiers
that belong to an ordered set, and a tree offers the fastest way to
search by driver core ID (assuming that hash tables do not suit the
purpose well because of space consumption and the constant possibility
of adding new values and repeated rehashing; but essentially hash tables
could be used as well). Anyway, I think I am neither the right advocate
of the DM design nor a substitute for the design documents which are
going to be presented here in the near future. I would rather get input
for the DM design from this discussion than have a discussion about
output/decisions/suggestions from the DM project.

Regarding relocation, I want to implement it only for the DM tree,
assuming that I can retain the root pointer in GD and traverse the tree
using computed offsets. But there are obvious problems when
early_malloc is used for a purpose other than DM. Another user (which
is what Graeme suggested taking into account) might want to use the
early_heap during early init but not retain the data, in which case
copying that data to RAM would be a waste of CPU time and RAM.
In the opposite case, when another user would like to retain his data,
he must relocate it on his own (which means retaining the needed
pointers and rectifying them afterwards).

Thinking about the possibilities I have: I can try to satisfy both cases
at the cost of introducing extra complexity or even dead code into the
early_mallocator by having two heap types (for allocations that are
going to be retained and for allocations that are not to be copied to
RAM). Only the first type is going to be copied to RAM for later
data-specific relocation (done by each user in some relocation chain).
Or I can satisfy only the needs of DM, which is the way I would prefer.
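
To make the two-heap variant concrete, here is a rough sketch under
assumed names: enum early_heap_kind, early_heap_setup(),
early_malloc_kind() and early_heap_copy_retained() are hypothetical and
only illustrate the split between retained and scratch allocations.

```c
#include <stddef.h>
#include <string.h>

enum early_heap_kind {
	EARLY_HEAP_RETAINED,	/* copied to RAM at relocation */
	EARLY_HEAP_SCRATCH,	/* dropped at relocation */
	EARLY_HEAP_KINDS
};

struct early_heap {
	unsigned char *base;
	size_t size;
	size_t used;
};

static struct early_heap early_heaps[EARLY_HEAP_KINDS];

/* Bind one heap type to a memory region chosen by the platform. */
void early_heap_setup(enum early_heap_kind kind, void *base, size_t size)
{
	early_heaps[kind].base = base;
	early_heaps[kind].size = size;
	early_heaps[kind].used = 0;
}

/* Bump-allocate from the heap of the requested kind. */
void *early_malloc_kind(enum early_heap_kind kind, size_t size)
{
	struct early_heap *h = &early_heaps[kind];
	void *p;

	/* Round up to 8 bytes; a wrapped-around size ends up as 0. */
	size = (size + 7) & ~(size_t)7;
	if (h->base == NULL || size == 0 || size > h->size - h->used)
		return NULL;
	p = h->base + h->used;
	h->used += size;
	return p;
}

/* Copy only the used part of the retained heap; returns bytes copied. */
size_t early_heap_copy_retained(void *dest)
{
	struct early_heap *h = &early_heaps[EARLY_HEAP_RETAINED];

	if (h->used != 0)
		memcpy(dest, h->base, h->used);
	return h->used;
}
```

Fixing up pointers inside the copied data would still be left to each
user, as described above.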

>
> I am convinced that you _cannot_ reliably relocate the malloc arena if
> you use the standard malloc//calloc/free interface for early
> allocation.

Forgive me my ignorance but why?

Assuming that there is a DM driver core with a certain dm_core_init()
function that calls malloc() and then registers the DM driver core
in the DM tree, it is still the same function for early and late
init. The only difference is that pointers inside the DM driver core
have to be recomputed when this driver core is in the DM tree at the
time of relocation. It seems to me that there is no need for code
duplication or for adding extra complexity to distinguish the early and
late cases and sometimes call early_malloc() instead of malloc().

(Sure I can create a "private" wrapper, for example dm_malloc(),
dm_calloc(), dm_free(),... strictly for DM needs.)

Tomas

-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-08-08 11:47           ` Tomas Hlavacek
@ 2012-08-08 19:32             ` Wolfgang Denk
  2012-08-08 23:33               ` Graeme Russ
  0 siblings, 1 reply; 14+ messages in thread
From: Wolfgang Denk @ 2012-08-08 19:32 UTC (permalink / raw)
  To: u-boot

Dear Tomas Hlavacek,

In message <CAEB7QLD3kSzX8r9q-gox8aww6wOiHKqd-AVCU_Ux7vY7V7TbSA@mail.gmail.com> you wrote:
> 
> > But "relocation" means that you have to add the address difference
> > (aka relocation offset) to _all_ pointers pointing into this area.
> > And there is no way to keep track of _all_ such pointers.
>
> Sure there is no way to do relocation of all data in the early_malloc
> arena generically (= for all data present regardless of their origin
> and use). And neither I want to do so.

OK, we agree on this.

> It comes down to the reason for implementing early_malloc: The reason
> is to facilitate minimalistic heap allocation for driver model tree in
> the early init phase. It is essential to start building the tree in
> the early phase (= in board_init_f) in order to have drivers bound to
> their cores before first use and to be able to call particular drivers
> through DM driver cores from the very beginning. (Driver cores are
> structs that are DM abstractions of particular drivers; these structs
> are held in DM tree.) It is also needed to build DM tree in runtime,
> otherwise the DM tree generated in compile-time would be board
> specific and it would disallow prospective unification of resulting
> ROMs for more than one board.

Agreed, too.

If so, my argument goes, you must not use the standard malloc() /
calloc() / free() API for the early_malloc implementation.  If you do,
there may be code that is not related to DM, but which happens to
be used early, and which suddenly allocates memory from the DM arena
without you being able to track any of the pointers potentially
pointing into this area, which in turn means that as soon as you
relocate it, those pointers will break.

> presented here in near future. I would rather have an input for DM
> design from this discussion, than discussion about
> output/decisions/suggestions from DM project.

I intend to provide input for the DM design:  you provide a
specialized implementation, so make sure to also use a specialized
interface, so that other code does not unintentionally use your
functions.

> Regarding relocation I want to implement it only for DM tree assuming
> that I can retain the root pointer in GD and I can traverse the tree
> using computed offsets. But there are obvious problems when the
> early_malloc is used for another purpose than DM. Another users (which

Correct. So such a thing must not happen.

> is what Graeme suggested to take into account) might want to use the
> early_heap during early init but he might want not to retain data and
> therefore copying that data to RAM would be waste of CPU time and RAM.

I agree with Graeme that it would be nice to have an early malloc that
automagically preserves the allocations until the full U-Boot is
running, but I cannot see how such a thing could be implemented.

Especially with the new SPL technology we allow users to use all kinds
of code very early - for example there are systems which use malloc()
and file system code before relocation.  Such usage would blow your
implementation.

I want to have the DM code to be as simple and robust as possible, so
I recommend against any fancy design which attempts to solve all
problems of this world at once (and fails). Instead, use a simple and
robust design that is tailored to just the DM purposes, and not used
by anything else.

> > I am convinced that you _cannot_ reliably relocate the malloc arena if
> > you use the standard malloc/calloc/free interface for early
> > allocation.
>
> Forgive me my ignorance but why?

Because you cannot track which pointers point into it. They can be
distributed all over the code.  Any function anybody calls might use
malloc() internally, and keep static pointers to allocated data.
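
A small illustration of this failure mode, assuming a blind byte-wise
copy of the arena. The names points_into(), blind_relocate() and
struct holder are hypothetical; the holder stands in for any code that
keeps a pointer the allocator knows nothing about:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True if p points inside [arena, arena + size). */
int points_into(const void *arena, size_t size, const void *p)
{
	uintptr_t a = (uintptr_t)arena;
	uintptr_t x = (uintptr_t)p;

	return x >= a && x - a < size;
}

/* Some unrelated user that remembers an allocation from the arena. */
struct holder {
	char *buf;
};

/* Blind relocation: copies the bytes, but cannot find 'struct holder'. */
void blind_relocate(void *new_arena, const void *old_arena, size_t size)
{
	assert(new_arena && old_arena);
	memcpy(new_arena, old_arena, size);
}
```

After blind_relocate() the data is present in the new arena, yet
holder.buf still satisfies points_into() for the old arena only, so
any later access through it uses memory that is no longer valid.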

> Assuming that there is a DM driver core with certain dm_core_init()
> function that calls malloc() and then registers the DM driver core
> into the DM tree, it is still the same function for early and late

Face it: there will be, and actually is already (on some systems)
other code that uses malloc(), and that doesn't (and should not have
to) know anything about the specific requirements or implementation of
the DM early allocator.

> (Sure I can create a "private" wrapper, for example dm_malloc(),
> dm_calloc(), dm_free(),... strictly for DM needs.)

You will need such a separate interface, but it will definitely not be
any kind of wrapper.  dm_malloc() and malloc() will have to be kept
strictly separated in the general case.

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
When Granny says a task is impossible, she means it is impossible for
anyone but herself.


* [U-Boot] early_malloc outline
  2012-08-08 19:32             ` Wolfgang Denk
@ 2012-08-08 23:33               ` Graeme Russ
  2012-08-09  8:58                 ` Tomas Hlavacek
  2012-08-09 12:35                 ` Wolfgang Denk
  0 siblings, 2 replies; 14+ messages in thread
From: Graeme Russ @ 2012-08-08 23:33 UTC (permalink / raw)
  To: u-boot

Hi Tomas & Wolfgang,

On Thu, Aug 9, 2012 at 5:32 AM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Tomas Hlavacek,
>
> In message <CAEB7QLD3kSzX8r9q-gox8aww6wOiHKqd-AVCU_Ux7vY7V7TbSA@mail.gmail.com> you wrote:

[snip]

> If so, my argument goes, you must not use the standard malloc() /
> calloc() / free() API for the early_malloc implementation.  If you do,
> there may be any code that is not related to DM, but which happens to
> be used early, which suddenly is allocating memory from the DM arena,
> without you being able to track any of the pointers potentially
> pointing into this area, which in turn means as soon as you relocate
> it the pointers will break.

I pointed out that those pointers can only be in either GD or the early
malloc heap. But either way, there is no generic way to adjust them. Any
code that uses the early malloc heap and wants to retain that data for
use after relocation will need to implement its own relocation code.

Having a specific early_malloc() has the advantage of being explicit. The
problem is, the initialisation for some drivers may not necessarily be
globally 'early' or 'late' depending on the board. So for some driver:

int foo_init(struct some_dm_struct *blah)
{
  struct foo_struct *my_data = malloc(sizeof(*my_data));
  .
  .
  .
}

runs into trouble if the foo driver is needed early by one board, but not
by another.

> I agree with Graeme that it would be nice to have an early malloc that
> automagically preserves the allocations until the full U-Boot is
> running, but I cannot see how such a thing could be implemented.

That was never how I thought about it. Anything that uses the early heap
needs to 'help' the relocation process. My thought was to make the glue
logic for the helper as seamless as possible.

>> > I am convinced that you _cannot_ reliably relocate the malloc arena if
>> > you use the standard malloc/calloc/free interface for early
>> > allocation.
>>
>> Forgive me my ignorance but why?
>
> Because you cannot track which pointers point into it. They can be

Yes

> distributed all over the code.  Any function anybody calls might use
> malloc() internally, and keep static pointers to allocated data.

No - There can be no static pointers before relocation - All pointers must
be in either GD or in structures allocated on the early malloc heap

>> Assuming that there is a DM driver core with certain dm_core_init()
>> function that calls malloc() and then registers the DM driver core
>> into the DM tree, it is still the same function for early and late
>
> Face it: there will be, and actually is already (on some systems)
> other code that uses malloc(), and that doesn't (and should not have
> to) know anything about the specific requirements or implementation of
> the DM early allocator.
>
>> (Sure I can create a "private" wrapper, for example dm_malloc(),
>> dm_calloc(), dm_free(),... strictly for DM needs.)
>
> You will need such a separate interface, but it will definitely not be
> any kind of wrapper.  dm_malloc() and malloc() will have to be kept
> strictly separated in the general case.

OK, this got me thinking about the 'relocation' function in the driver
structure and how we can make the early heap more generic. My thoughts tie
into the DM tree structure being discussed in another thread...

Instead of putting a 'relocation function' pointer in the driver core
structure, why don't we (as I have suggested before) explicitly register the
relocation function? E.g.

int foo_init(struct some_dm_struct *blah)
{
  struct foo_struct *my_data;

  register_early_malloc_relocation(foo_relocation_function);

  my_data = malloc(sizeof(*my_data));
  .
  .
  .
}

(Of course, we can still have the pointer in the driver core struct and do
this in the core code, but it may be a bit wasteful - I don't know)

int foo_relocation_function(void)
{
  ... do relocation of foo data structures ...
}

Now register_early_malloc_relocation() is in the early malloc code

int register_early_malloc_relocation(int (*blah)(void))
{
  /* Test the flag with '&'; '|' would be true whenever the flag is non-zero */
  if (gd->flags & GD_FLG_RELOC)
    return 0;

  ... Add blah to the relocation function list ...
}

Then in relocate_code() we call relocate_early_malloc(), which simply walks
the relocation function list and calls each one. Each function is
responsible for allocating new memory from the final heap and copying the
data from the early heap to the newly allocated memory.

This way, anyone (not just drivers) can take advantage of the early heap.
And if a user of early heap does not care about the data being available
post relocation, they just don't bother implementing and registering a
relocation function.
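
A minimal sketch of the registration scheme described above, under
some assumptions: struct early_gd is a hypothetical stand-in for the
part of GD that would hold the list (static data is not usable before
relocation, so the list cannot be a plain global), and
MAX_RELOC_HANDLERS and EARLY_FLAG_RELOCATED are invented for the
example. The body of foo_relocation_function() is only a counter here:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELOC_HANDLERS	8	/* assumed limit, illustration only */
#define EARLY_FLAG_RELOCATED	0x1UL	/* stand-in for a "relocated" flag */

typedef int (*reloc_fn)(void);

/* Hypothetical stand-in for the GD fields this scheme would need. */
struct early_gd {
	unsigned long flags;
	reloc_fn handlers[MAX_RELOC_HANDLERS];
	size_t count;
};

/* Remember a relocation handler; a no-op once relocation has happened. */
int register_early_malloc_relocation(struct early_gd *gd, reloc_fn fn)
{
	assert(gd && fn);
	if (gd->flags & EARLY_FLAG_RELOCATED)
		return 0;	/* final heap in use, nothing to relocate */
	if (gd->count >= MAX_RELOC_HANDLERS)
		return -1;	/* list is full */
	gd->handlers[gd->count++] = fn;
	return 0;
}

/* Walk the list in registration order; stop at the first failure. */
int relocate_early_malloc(struct early_gd *gd)
{
	size_t i;
	int ret;

	for (i = 0; i < gd->count; i++) {
		ret = gd->handlers[i]();
		if (ret)
			return ret;
	}
	gd->flags |= EARLY_FLAG_RELOCATED;
	return 0;
}

/* Example user: a real handler would copy its data to the final heap. */
int foo_calls;

int foo_relocation_function(void)
{
	foo_calls++;
	return 0;
}
```

A user that does not need its data after relocation simply never
registers a handler, which matches the opt-in behaviour described
above.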

So to summarise:
  - Early malloc() needs to be a malloc()
  - Any code using early malloc needs to be aware of this and provide a
    relocation function
  - Why has this taken 6+ months to sort out?

Regards,

Graeme


* [U-Boot] early_malloc outline
  2012-08-08 23:33               ` Graeme Russ
@ 2012-08-09  8:58                 ` Tomas Hlavacek
  2012-08-09 10:09                   ` Graeme Russ
  2012-08-09 12:35                 ` Wolfgang Denk
  1 sibling, 1 reply; 14+ messages in thread
From: Tomas Hlavacek @ 2012-08-09  8:58 UTC (permalink / raw)
  To: u-boot

Hi Graeme,

On Thu, Aug 9, 2012 at 1:33 AM, Graeme Russ <graeme.russ@gmail.com> wrote:

>
> OK, this got me to thinking about the 'relocation' function in the driver
> structure and how we can make the early heap more generic. My thoughts tie
> into the DM tree structure being discussed in another thread...
>
[snip]
>
> Then in relocate_code() we call relocate_early_malloc() which simply walks
> the relocation function list and calls each one. Each function is
> responsible for allocating new memory from the final heap and copying the
> data from the early heap to the newly allocated memory
>
> This way, anyone (not just drivers) can take advantage of the early heap.
> And if a user of early heap does not care about the data being available
> post relocation, they just don't bother implementing and registering a
> relocation function.

I am wondering how it could be implemented on certain platforms (ARM
for instance) without wasting memory on a blind copy of all data (even
data that the user does not want to relocate afterwards) or without
initializing real_malloc() and doing relocation while caches are still
disabled. Maybe it is a technical detail now, but it seems to me that
some trade-off might be needed in this situation.

Tomas



-- 
Tomáš Hlaváček <tmshlvck@gmail.com>


* [U-Boot] early_malloc outline
  2012-08-09  8:58                 ` Tomas Hlavacek
@ 2012-08-09 10:09                   ` Graeme Russ
  0 siblings, 0 replies; 14+ messages in thread
From: Graeme Russ @ 2012-08-09 10:09 UTC (permalink / raw)
  To: u-boot

Hi Tomas,

On 08/09/2012 06:58 PM, Tomas Hlavacek wrote:
> Hi Graeme,
> 
> On Thu, Aug 9, 2012 at 1:33 AM, Graeme Russ <graeme.russ@gmail.com> wrote:
> 
>>
>> OK, this got me to thinking about the 'relocation' function in the driver
>> structure and how we can make the early heap more generic. My thoughts tie
>> into the DM tree structure being discussed in another thread...
>>
> [snip]
>>
>> Then in relocate_code() we call relocate_early_malloc() which simply walks
>> the relocation function list and calls each one. Each function is
>> responsible for allocating new memory from the final heap and copying the
>> data from the early heap to the newly allocated memory
>>
>> This way, anyone (not just drivers) can take advantage of the early heap.
>> And if a user of early heap does not care about the data being available
>> post relocation, they just don't bother implementing and registering a
>> relocation function.
> 
> I am wondering how could it be implemented on certain platforms (ARM

Forget about platform specifics - Not.....Your.....Problem ;)

> for instance) without wasting memory for blind copy of all data (even
> data that the user does not want to relocate afterwards) or without

Easy - don't blind copy. Each (and every) user of early malloc needs their
own relocation routine. The user of the early heap gets to decide what gets
copied to final heap. Let's look at a couple of scenarios:

a) Pre-console buffer gets allocated pre-relocation. The contents get
   dumped as soon as console is available (usually before relocation) so
   the pre-console buffer does not need to be relocated to final heap

b) Something (who knows what) builds a tree during platform init prior to
   relocation. After the tree has been built, no elements will be added or
   deleted. So during relocation, instead of a tree, a simple array is
   allocated and the elements copied into it in order (so a simple binary
   search can be conducted later)
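
Scenario (b) could be sketched roughly as follows, assuming a plain
binary search tree keyed by an integer id. struct tnode, count_nodes(),
flatten() and find_id() are hypothetical names; only bsearch() is a
standard C library function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical early-heap tree node, keyed by id. */
struct tnode {
	int id;
	struct tnode *left;
	struct tnode *right;
};

/* Number of nodes, used to size the array on the final heap. */
size_t count_nodes(const struct tnode *n)
{
	return n ? 1 + count_nodes(n->left) + count_nodes(n->right) : 0;
}

/* In-order walk: writes ids in ascending order, returns the next index. */
size_t flatten(const struct tnode *n, int *out, size_t pos)
{
	assert(out);
	if (!n)
		return pos;
	pos = flatten(n->left, out, pos);
	out[pos++] = n->id;
	return flatten(n->right, out, pos);
}

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Post-relocation lookup: binary search in the sorted array. */
int *find_id(int *sorted, size_t n, int id)
{
	return bsearch(&id, sorted, n, sizeof(*sorted), cmp_int);
}
```

A relocation routine for such a user would allocate count_nodes()
entries from the final heap, call flatten() with pos set to 0, and
then drop the tree.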

> initializing real_malloc() and doing relocation when caches are still
> disabled. Maybe it is a technical detail now but it seems to me that
> some trade off in this situation might be needed.

Bingo! - Don't sweat the details. Provide a well-structured API and if
there are performance gains to be made later, they can be made.

Remember, the whole point of implementing the new driver model is to bring
in a consistent API. To support that, you may need sub-features with their
own APIs. In a lot of respects, it matters less if the API is imperfect if
it is wholly consistent.

Regards,

Graeme


* [U-Boot] early_malloc outline
  2012-08-08 23:33               ` Graeme Russ
  2012-08-09  8:58                 ` Tomas Hlavacek
@ 2012-08-09 12:35                 ` Wolfgang Denk
  1 sibling, 0 replies; 14+ messages in thread
From: Wolfgang Denk @ 2012-08-09 12:35 UTC (permalink / raw)
  To: u-boot

Dear Graeme Russ,

In message <CALButCK7CxJy6fg8Ym2Ao8tVrD5HTihGWaV+TF8Y9qNe62MnFw@mail.gmail.com> you wrote:
> 
> I pointed out that those pointers can only be in either GD or the early
> malloc heap. But either way, there is no generic way to adjust them. Any
> code that uses the early malloc heap and wants to retain that data for
> use after relocation will need to implement its own relocation code.
> 
> Having a specific early_malloc() has the advantage of being explicit. The

Yes, and we need this explicitness to make sure the caller knows which
behaviour he can expect.  Actually, a standard malloc() should either
work reliably or cause build time errors.

> That was never how I thought about it. Anything that uses the early heap
> needs to 'help' the relocation process. My thought was to make the glue
> logic for the helper as seamless as possible.

I agree with that.

> > Because you cannot track which pointers point into it. They can be
> 
> Yes
> 
> > distributed all over the code.  Any function anybody calls might use
> > malloc() internally, and keep static pointers to allocated data.
> 
> No - There can be no static pointers before relocation - All pointers must
> be in either GD or in structures allocated on the early malloc heap

Thanks for correcting me.  That was what I actually meant - the effect
is the same.

> So to summarise:
>   - Early malloc() needs to be a malloc()

Um... no! malloc() is supposed to behave in a standard way, without any
expectation that pointers to storage returned by malloc() will ever
have to be relocated or such.

For early malloc, a function with a different name shall be used.
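
A minimal sketch of such a separately named allocator, assuming a
simple bump allocator over a fixed buffer (for example on-chip memory
or locked cache lines). struct early_heap and early_heap_init() are
hypothetical, and the base address is assumed to be pointer-aligned:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical early heap descriptor; would be reachable from GD. */
struct early_heap {
	unsigned char *base;
	size_t size;
	size_t used;
};

void early_heap_init(struct early_heap *h, void *base, size_t size)
{
	assert(h && base);
	h->base = base;
	h->size = size;
	h->used = 0;
}

/*
 * Deliberately not named malloc(): callers opt in and know that the
 * returned storage may have to be relocated later. There is no free().
 */
void *early_malloc(struct early_heap *h, size_t size)
{
	const size_t align = sizeof(void *);
	size_t rounded = (size + align - 1) & ~(align - 1);
	void *p;

	/* Reject overflow of the rounding and requests that do not fit */
	if (rounded < size || rounded > h->size - h->used)
		return NULL;
	p = h->base + h->used;
	h->used += rounded;
	return p;
}
```

Because the name differs from malloc(), code that is unaware of the
early heap cannot end up allocating from it by accident.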


Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd at denx.de
"Nobody will ever need more than 640k RAM!"       -- Bill Gates, 1981
"Windows 95 needs at least 8 MB RAM."             -- Bill Gates, 1996
"Nobody will ever need Windows 95."             -- logical conclusion


Thread overview: 14+ messages
2012-07-31 15:30 [U-Boot] early_malloc outline Tomas Hlavacek
2012-07-31 19:52 ` Wolfgang Denk
2012-08-01 15:19   ` Tomas Hlavacek
2012-08-01 19:09     ` Wolfgang Denk
2012-08-01 22:12       ` Tomas Hlavacek
2012-08-07 21:07         ` Wolfgang Denk
2012-08-08 11:47           ` Tomas Hlavacek
2012-08-08 19:32             ` Wolfgang Denk
2012-08-08 23:33               ` Graeme Russ
2012-08-09  8:58                 ` Tomas Hlavacek
2012-08-09 10:09                   ` Graeme Russ
2012-08-09 12:35                 ` Wolfgang Denk
2012-08-01  2:57 ` Graeme Russ
2012-08-01 14:47   ` Tomas Hlavacek