Nathan Froyd wrote:
> On Tue, May 19, 2009 at 06:53:44PM +0200, Jan Kiszka wrote:
>> Nathan Froyd wrote:
>>> The best way I can think of to do this is to maintain a
>>> separately-chained list of CPUStates (through a new field similar to
>>> next_cpu) ordered by gdbstub_index. Grabbing a new gdbstub_index then
>>> walks through the list, looking for "holes" between adjacent entries
>>> in the list. A new gdbstub_index is then picked if we find a hole; we
>>> die if we can't find a hole.
>>
>> Why create a new list? Just generate a new index and then walk through
>> all so-far-registered CPUStates until no collision is found.
>
> Do you mean something like:
>
>     new_index = 0
>     restart:
>     for cpu in all_cpus:
>         if new_index == cpu.gdbstub_index:
>             new_index += 1
>             goto restart
>     new_cpu.gdbstub_index = new_index
>
> I was trying to keep it a linear-time algorithm, and the above is
> potentially quadratic. (Not merely an academic exercise, either;
> spinning up your first N threads will be enough to trigger quadratic
> behavior.) I don't think you can say:
>
>     new_index = 0
>     for cpu in all_cpus:
>         if new_index == cpu.gdbstub_index:
>             new_index = cpu.gdbstub_index + 1
>     new_cpu.gdbstub_index = new_index
>
> (i.e. maximum+1 of all existing cpus) either, since that fails for a
> list that looks like:
>
>     1 -> 0
>
> You could iterate, of course, but then you're just back to quadratic
> behavior.
>
> Nothing promises that the cpu list will be kept in sorted order
> according to some criterion. (A quick examination of things indicates
> the list might be accidentally sorted by time of creation--and thereby
> by gdbstub_index--but that falls apart once you wrap gdbstub_index, of
> course.) AFAICS, you need the cpus sorted by their gdbstub_index to
> keep linear time bounds. If using a quadratic algorithm is OK, then
> I'll just do that, but that doesn't seem like a good idea, especially
> since you're imposing a large slowdown on everything for debugging
> support which might never get used.

I see. So we basically have two options:

1. Stick our heads in the sand - we will never see that many threads in
   reality, so neither wrap-around nor performance matters.

2. Go for a bitmap-based search of the next free ID, reducing the ID
   space so that the bitmap has a reasonable size (a rough sketch
   follows below).

But I have no clue whether, and for how long, option 1 will suffice, as
I have no idea how many long-running, highly dynamic, multi-threaded
use cases there are or will be for qemu user mode. But if you find
option 2 easy enough to implement, I would go for it nevertheless.

Jan
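
For illustration only, the bitmap-based allocation of option 2 could look
roughly like the following C sketch. The names (MAX_GDBSTUB_INDEX,
gdbstub_index_map, alloc_gdbstub_index, free_gdbstub_index) and the chosen
ID-space size are made up here, not QEMU's actual implementation or API; a
real patch would tie the bitmap to the gdbstub/CPU lifecycle.

    /* Rough sketch of option 2: bitmap-based allocation of gdbstub IDs.
     * All names and sizes below are hypothetical, for illustration only. */
    #include <limits.h>

    #define MAX_GDBSTUB_INDEX 4096                 /* reduced ID space (assumption) */
    #define IDS_PER_WORD      (CHAR_BIT * sizeof(unsigned long))

    static unsigned long gdbstub_index_map[MAX_GDBSTUB_INDEX / IDS_PER_WORD];

    /* Find and claim the lowest free index.  Cost is proportional to the
     * (fixed) bitmap size, independent of how many CPUs/threads exist.
     * Returns -1 if all MAX_GDBSTUB_INDEX IDs are in use. */
    static int alloc_gdbstub_index(void)
    {
        size_t word;
        unsigned bit;

        for (word = 0; word < MAX_GDBSTUB_INDEX / IDS_PER_WORD; word++) {
            if (gdbstub_index_map[word] == ~0UL) {
                continue;                          /* word is full, skip it */
            }
            for (bit = 0; bit < IDS_PER_WORD; bit++) {
                if (!(gdbstub_index_map[word] & (1UL << bit))) {
                    gdbstub_index_map[word] |= 1UL << bit;
                    return (int)(word * IDS_PER_WORD + bit);
                }
            }
        }
        return -1;                                 /* ID space exhausted */
    }

    /* Release an index when its thread/CPU goes away so it can be reused. */
    static void free_gdbstub_index(int index)
    {
        gdbstub_index_map[index / IDS_PER_WORD] &= ~(1UL << (index % IDS_PER_WORD));
    }

The point of this scheme is that each allocation is bounded by the bitmap
size rather than by the number of live CPUs, and freed IDs are reused, so
the wrap-around problem discussed above never arises. Since the GDB remote
protocol treats thread ID 0 as "any thread", a real implementation would
presumably reserve index 0 and hand out IDs starting at 1.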