* [Qemu-devel] [RFC 0/2] Fix some bugs in usermode cpu tracking @ 2016-07-14 7:57 David Gibson 2016-07-14 7:57 ` [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit David Gibson 2016-07-14 7:57 ` [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation David Gibson 0 siblings, 2 replies; 16+ messages in thread From: David Gibson @ 2016-07-14 7:57 UTC (permalink / raw) To: riku.voipio, imammedo; +Cc: qemu-devel, David Gibson While investigating the mess we have with cpu_index and (possibly) other cpu id values, I came across a couple of bugs in CONFIG_USER_ONLY mode. David Gibson (2): linux-user: Don't leak cpus on thread exit linux-user: Fix cpu_index generation exec.c | 19 ------------------- linux-user/syscall.c | 7 ++----- 2 files changed, 2 insertions(+), 24 deletions(-) -- 2.7.4 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit 2016-07-14 7:57 [Qemu-devel] [RFC 0/2] Fix some bugs in usermode cpu tracking David Gibson @ 2016-07-14 7:57 ` David Gibson 2016-07-14 9:52 ` Peter Maydell 2016-07-14 7:57 ` [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation David Gibson 1 sibling, 1 reply; 16+ messages in thread From: David Gibson @ 2016-07-14 7:57 UTC (permalink / raw) To: riku.voipio, imammedo; +Cc: qemu-devel, David Gibson Currently linux-user does not correctly clean up CPU instances when running a threaded binary. On thread exit, do_syscall() removes the thread's CPU from the cpus list and calls object_unref(). However, the CPU is still referenced from the QOM tree. To correctly clean up we need to object_unparent() to remove the CPU from the QOM tree, then object_unref() to release the final reference we're holding. Once this is done, the explicit removal from the cpus list is no longer necessary, since that's done automatically in the CPU unrealize path. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- linux-user/syscall.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) I believe most full system targets also "leak" cpus in the same way, except that since they don't support cpu hot unplug the cpus never would have been disposed anyway. I'll look into fixing that another time. diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 8bf6205..dd91791 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -6823,10 +6823,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, if (CPU_NEXT(first_cpu)) { TaskState *ts; - cpu_list_lock(); - /* Remove the CPU from the list. */ - QTAILQ_REMOVE(&cpus, cpu, node); - cpu_list_unlock(); + object_unparent(OBJECT(cpu)); /* Remove from QOM */ ts = cpu->opaque; if (ts->child_tidptr) { put_user_u32(0, ts->child_tidptr); @@ -6834,7 +6831,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, NULL, NULL, 0); } thread_cpu = NULL; - object_unref(OBJECT(cpu)); + object_unref(OBJECT(cpu)); /* Remove the last ref we're holding */ g_free(ts); rcu_unregister_thread(); pthread_exit(NULL); -- 2.7.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
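[Editorial note: the QOM lifetime rule this patch depends on can be sketched with a toy reference-counted object. Creation leaves one reference with the caller and one with the parent tree; unparenting drops the tree's reference; finalization happens only once both are gone. All names below are invented for illustration, this is not QEMU's actual QOM code.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a reference-counted object also held by a composition
 * tree, loosely mimicking the QOM situation described above. */
typedef struct ToyObj {
    int refcount;
    int in_tree;        /* 1 while the "QOM tree" still holds its ref */
    int *freed_flag;    /* set to 1 when the object is finalized */
} ToyObj;

ToyObj *toy_new(int *freed_flag)
{
    ToyObj *obj = calloc(1, sizeof(*obj));
    obj->refcount = 2;  /* one ref for the caller, one for the tree */
    obj->in_tree = 1;
    obj->freed_flag = freed_flag;
    return obj;
}

void toy_unref(ToyObj *obj)
{
    if (--obj->refcount == 0) {
        *obj->freed_flag = 1;   /* finalized only at refcount zero */
        free(obj);
    }
}

void toy_unparent(ToyObj *obj)
{
    if (obj->in_tree) {
        obj->in_tree = 0;
        toy_unref(obj);         /* drop the tree's reference */
    }
}
```

In this model, calling toy_unref() alone, as the pre-patch code effectively did, leaves the tree's reference standing and the object is never finalized; toy_unparent() followed by toy_unref() releases both references and frees it.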
* Re: [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit 2016-07-14 7:57 ` [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit David Gibson @ 2016-07-14 9:52 ` Peter Maydell 2016-07-14 12:02 ` David Gibson 0 siblings, 1 reply; 16+ messages in thread From: Peter Maydell @ 2016-07-14 9:52 UTC (permalink / raw) To: David Gibson; +Cc: Riku Voipio, Igor Mammedov, QEMU Developers On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > Currently linux-user does not correctly clean up CPU instances properly > when running a threaded binary. > > On thread exit, do_syscall() removes the thread's CPU from the cpus list > and calls object_unref(). However, the CPU still is still referenced from > the QOM tree. To correctly clean up we need to object_unparent() to remove > the CPU from the QOM tree, then object_unref() to release the final > reference we're holding. > > Once this is done, the explicit remove from the cpus list is no longer > necessary, since that's done automatically in the CPU unrealize path. > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > linux-user/syscall.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > I believe most full system targets also "leak" cpus in the same way, > except that since they don't support cpu hot unplug the cpus never > would have been disposed anyway. I'll look into fixing that another > time. > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > index 8bf6205..dd91791 100644 > --- a/linux-user/syscall.c > +++ b/linux-user/syscall.c > @@ -6823,10 +6823,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > if (CPU_NEXT(first_cpu)) { > TaskState *ts; > > - cpu_list_lock(); > - /* Remove the CPU from the list. 
*/ > - QTAILQ_REMOVE(&cpus, cpu, node); > - cpu_list_unlock(); > + object_unparent(OBJECT(cpu)); /* Remove from QOM */ > ts = cpu->opaque; > if (ts->child_tidptr) { > put_user_u32(0, ts->child_tidptr); > @@ -6834,7 +6831,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > NULL, NULL, 0); > } > thread_cpu = NULL; > - object_unref(OBJECT(cpu)); > + object_unref(OBJECT(cpu)); /* Remove the last ref we're holding */ Is it OK to now be removing the CPU from the list after we've done the futex to signal the child task rather than before? > g_free(ts); > rcu_unregister_thread(); > pthread_exit(NULL); > -- > 2.7.4 thanks -- PMM ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit 2016-07-14 9:52 ` Peter Maydell @ 2016-07-14 12:02 ` David Gibson 2016-07-14 13:05 ` Igor Mammedov 0 siblings, 1 reply; 16+ messages in thread From: David Gibson @ 2016-07-14 12:02 UTC (permalink / raw) To: Peter Maydell; +Cc: Riku Voipio, Igor Mammedov, QEMU Developers [-- Attachment #1: Type: text/plain, Size: 2974 bytes --] On Thu, Jul 14, 2016 at 10:52:48AM +0100, Peter Maydell wrote: > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > Currently linux-user does not correctly clean up CPU instances properly > > when running a threaded binary. > > > > On thread exit, do_syscall() removes the thread's CPU from the cpus list > > and calls object_unref(). However, the CPU still is still referenced from > > the QOM tree. To correctly clean up we need to object_unparent() to remove > > the CPU from the QOM tree, then object_unref() to release the final > > reference we're holding. > > > > Once this is done, the explicit remove from the cpus list is no longer > > necessary, since that's done automatically in the CPU unrealize path. > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > --- > > linux-user/syscall.c | 7 ++----- > > 1 file changed, 2 insertions(+), 5 deletions(-) > > > > I believe most full system targets also "leak" cpus in the same way, > > except that since they don't support cpu hot unplug the cpus never > > would have been disposed anyway. I'll look into fixing that another > > time. > > > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > > index 8bf6205..dd91791 100644 > > --- a/linux-user/syscall.c > > +++ b/linux-user/syscall.c > > @@ -6823,10 +6823,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > if (CPU_NEXT(first_cpu)) { > > TaskState *ts; > > > > - cpu_list_lock(); > > - /* Remove the CPU from the list. 
*/ > > - QTAILQ_REMOVE(&cpus, cpu, node); > > - cpu_list_unlock(); > > + object_unparent(OBJECT(cpu)); /* Remove from QOM */ > > ts = cpu->opaque; > > if (ts->child_tidptr) { > > put_user_u32(0, ts->child_tidptr); > > @@ -6834,7 +6831,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > NULL, NULL, 0); > > } > > thread_cpu = NULL; > > - object_unref(OBJECT(cpu)); > > + object_unref(OBJECT(cpu)); /* Remove the last ref we're holding */ > > Is it OK to now be removing the CPU from the list after we've done > the futex to signal the child task rather than before? Ah.. not sure. I was thinking the object_unparent() would trigger an unrealize (which would do the list remove) even if there was a reference keeping the object in existence. I haven't confirmed that thought. It could obviously be fixed with an explicit unrealize before the futex op. > > > g_free(ts); > > rcu_unregister_thread(); > > pthread_exit(NULL); > > thanks > -- PMM > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit 2016-07-14 12:02 ` David Gibson @ 2016-07-14 13:05 ` Igor Mammedov 2016-07-15 2:53 ` David Gibson 0 siblings, 1 reply; 16+ messages in thread From: Igor Mammedov @ 2016-07-14 13:05 UTC (permalink / raw) To: David Gibson; +Cc: Peter Maydell, Riku Voipio, QEMU Developers On Thu, 14 Jul 2016 22:02:36 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Thu, Jul 14, 2016 at 10:52:48AM +0100, Peter Maydell wrote: > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > Currently linux-user does not correctly clean up CPU instances properly > > > when running a threaded binary. > > > > > > On thread exit, do_syscall() removes the thread's CPU from the cpus list > > > and calls object_unref(). However, the CPU still is still referenced from > > > the QOM tree. To correctly clean up we need to object_unparent() to remove > > > the CPU from the QOM tree, then object_unref() to release the final > > > reference we're holding. > > > > > > Once this is done, the explicit remove from the cpus list is no longer > > > necessary, since that's done automatically in the CPU unrealize path. > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > --- > > > linux-user/syscall.c | 7 ++----- > > > 1 file changed, 2 insertions(+), 5 deletions(-) > > > > > > I believe most full system targets also "leak" cpus in the same way, > > > except that since they don't support cpu hot unplug the cpus never > > > would have been disposed anyway. I'll look into fixing that another > > > time. > > > > > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > > > index 8bf6205..dd91791 100644 > > > --- a/linux-user/syscall.c > > > +++ b/linux-user/syscall.c > > > @@ -6823,10 +6823,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > > if (CPU_NEXT(first_cpu)) { > > > TaskState *ts; > > > > > > - cpu_list_lock(); > > > - /* Remove the CPU from the list. 
*/ > > > - QTAILQ_REMOVE(&cpus, cpu, node); > > > - cpu_list_unlock(); > > > + object_unparent(OBJECT(cpu)); /* Remove from QOM */ > > > ts = cpu->opaque; > > > if (ts->child_tidptr) { > > > put_user_u32(0, ts->child_tidptr); > > > @@ -6834,7 +6831,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > > NULL, NULL, 0); > > > } > > > thread_cpu = NULL; > > > - object_unref(OBJECT(cpu)); > > > + object_unref(OBJECT(cpu)); /* Remove the last ref we're holding */ > > > > Is it OK to now be removing the CPU from the list after we've done > > the futex to signal the child task rather than before? > > Ah.. not sure. I was thinking the object_unparent() would trigger an > unrealize (which would do the list remove) even if there was a > reference keeping the object in existence. I haven't confirmed that > thought. not every cpu->unrealize does list removal, doesn't it? > It could obviously be fixed with an explicit unrealize before the > futex op. > > > > > > > g_free(ts); > > > rcu_unregister_thread(); > > > pthread_exit(NULL); > > > > thanks > > -- PMM > > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit 2016-07-14 13:05 ` Igor Mammedov @ 2016-07-15 2:53 ` David Gibson 0 siblings, 0 replies; 16+ messages in thread From: David Gibson @ 2016-07-15 2:53 UTC (permalink / raw) To: Igor Mammedov; +Cc: Peter Maydell, Riku Voipio, QEMU Developers [-- Attachment #1: Type: text/plain, Size: 3614 bytes --] On Thu, Jul 14, 2016 at 03:05:31PM +0200, Igor Mammedov wrote: > On Thu, 14 Jul 2016 22:02:36 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Thu, Jul 14, 2016 at 10:52:48AM +0100, Peter Maydell wrote: > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > Currently linux-user does not correctly clean up CPU instances properly > > > > when running a threaded binary. > > > > > > > > On thread exit, do_syscall() removes the thread's CPU from the cpus list > > > > and calls object_unref(). However, the CPU still is still referenced from > > > > the QOM tree. To correctly clean up we need to object_unparent() to remove > > > > the CPU from the QOM tree, then object_unref() to release the final > > > > reference we're holding. > > > > > > > > Once this is done, the explicit remove from the cpus list is no longer > > > > necessary, since that's done automatically in the CPU unrealize path. > > > > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > --- > > > > linux-user/syscall.c | 7 ++----- > > > > 1 file changed, 2 insertions(+), 5 deletions(-) > > > > > > > > I believe most full system targets also "leak" cpus in the same way, > > > > except that since they don't support cpu hot unplug the cpus never > > > > would have been disposed anyway. I'll look into fixing that another > > > > time. 
> > > > > > > > diff --git a/linux-user/syscall.c b/linux-user/syscall.c > > > > index 8bf6205..dd91791 100644 > > > > --- a/linux-user/syscall.c > > > > +++ b/linux-user/syscall.c > > > > @@ -6823,10 +6823,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > > > if (CPU_NEXT(first_cpu)) { > > > > TaskState *ts; > > > > > > > > - cpu_list_lock(); > > > > - /* Remove the CPU from the list. */ > > > > - QTAILQ_REMOVE(&cpus, cpu, node); > > > > - cpu_list_unlock(); > > > > + object_unparent(OBJECT(cpu)); /* Remove from QOM */ > > > > ts = cpu->opaque; > > > > if (ts->child_tidptr) { > > > > put_user_u32(0, ts->child_tidptr); > > > > @@ -6834,7 +6831,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1, > > > > NULL, NULL, 0); > > > > } > > > > thread_cpu = NULL; > > > > - object_unref(OBJECT(cpu)); > > > > + object_unref(OBJECT(cpu)); /* Remove the last ref we're holding */ > > > > > > Is it OK to now be removing the CPU from the list after we've done > > > the futex to signal the child task rather than before? > > > > Ah.. not sure. I was thinking the object_unparent() would trigger an > > unrealize (which would do the list remove) even if there was a > > reference keeping the object in existence. I haven't confirmed that > > thought. > not every cpu->unrealize does list removal, doesn't it? Oh, sod. It's in cpu_exec_exit() but that's sometimes called from unrealize, sometimes from finalize depending on arch. Sigh. > > > It could obviously be fixed with an explicit unrealize before the > > futex op. > > > > > > > > > > > g_free(ts); > > > > rcu_unregister_thread(); > > > > pthread_exit(NULL); > > > > > > thanks > > > -- PMM > > > > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-14 7:57 [Qemu-devel] [RFC 0/2] Fix some bugs in usermode cpu tracking David Gibson 2016-07-14 7:57 ` [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit David Gibson @ 2016-07-14 7:57 ` David Gibson 2016-07-14 9:54 ` Peter Maydell 1 sibling, 1 reply; 16+ messages in thread From: David Gibson @ 2016-07-14 7:57 UTC (permalink / raw) To: riku.voipio, imammedo; +Cc: qemu-devel, David Gibson With CONFIG_USER_ONLY, generation of cpu_index values is done differently than for full system targets. This method turns out to be broken, since it can fairly easily result in duplicate cpu_index values for simultaneously active cpus (i.e. threads in the emulated process). Consider this sequence: Create thread 1 Create thread 2 Exit thread 1 Create thread 3 With the current logic thread 1 will get cpu_index 1, thread 2 will get cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 threads in the cpus list at the point of its creation). We mostly get away with this because cpu_index values aren't that important for userspace emulation. Still, it can't be good, so this patch fixes it by making CONFIG_USER_ONLY use the same bitmap based allocation that full system targets already use. 
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- exec.c | 19 ------------------- 1 file changed, 19 deletions(-) diff --git a/exec.c b/exec.c index 011babd..e410dab 100644 --- a/exec.c +++ b/exec.c @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) } #endif -#ifndef CONFIG_USER_ONLY static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); static int cpu_get_free_index(Error **errp) @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) { bitmap_clear(cpu_index_map, cpu->cpu_index, 1); } -#else - -static int cpu_get_free_index(Error **errp) -{ - CPUState *some_cpu; - int cpu_index = 0; - - CPU_FOREACH(some_cpu) { - cpu_index++; - } - return cpu_index; -} - -static void cpu_release_index(CPUState *cpu) -{ - return; -} -#endif void cpu_exec_exit(CPUState *cpu) { -- 2.7.4 ^ permalink raw reply related [flat|nested] 16+ messages in thread
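[Editorial note: the duplicate-index sequence from the commit message can be reproduced with a few lines modelling the removed allocator, which hands out the current length of the cpus list as the new index. Function names are invented; this is a sketch of the buggy scheme, not QEMU code.]

```c
#include <assert.h>

/* Model of the CONFIG_USER_ONLY scheme the patch removes: a new
 * thread's cpu_index is simply the number of cpus currently on the
 * list, so indices repeat as soon as any thread has exited. */
static int ncpus = 1;           /* the initial (main) thread is cpu 0 */

int thread_create(void)
{
    int cpu_index = ncpus;      /* index = current length of the list */
    ncpus++;                    /* the new cpu joins the list */
    return cpu_index;
}

void thread_exit(void)
{
    ncpus--;                    /* the exiting cpu leaves the list */
}
```

Running the create/create/exit/create sequence from the commit message against this model yields indices 1, 2 and then 2 again, the duplicate described above.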
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-14 7:57 ` [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation David Gibson @ 2016-07-14 9:54 ` Peter Maydell 2016-07-14 10:20 ` Bharata B Rao 0 siblings, 1 reply; 16+ messages in thread From: Peter Maydell @ 2016-07-14 9:54 UTC (permalink / raw) To: David Gibson; +Cc: Riku Voipio, Igor Mammedov, QEMU Developers On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > With CONFIG_USER_ONLY, generation of cpu_index values is done differently > than for full system targets. This method turns out to be broken, since > it can fairly easily result in duplicate cpu_index values for > simultaneously active cpus (i.e. threads in the emulated process). > > Consider this sequence: > Create thread 1 > Create thread 2 > Exit thread 1 > Create thread 3 > > With the current logic thread 1 will get cpu_index 1, thread 2 will get > cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > threads in the cpus list at the point of its creation). > > We mostly get away with this because cpu_index values aren't that important > for userspace emulation. Still, it can't be good, so this patch fixes it > by making CONFIG_USER_ONLY use the same bitmap based allocation that full > system targets already use. 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > exec.c | 19 ------------------- > 1 file changed, 19 deletions(-) > > diff --git a/exec.c b/exec.c > index 011babd..e410dab 100644 > --- a/exec.c > +++ b/exec.c > @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > } > #endif > > -#ifndef CONFIG_USER_ONLY > static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > static int cpu_get_free_index(Error **errp) > @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > { > bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > } > -#else > - > -static int cpu_get_free_index(Error **errp) > -{ > - CPUState *some_cpu; > - int cpu_index = 0; > - > - CPU_FOREACH(some_cpu) { > - cpu_index++; > - } > - return cpu_index; > -} > - > -static void cpu_release_index(CPUState *cpu) > -{ > - return; > -} > -#endif Won't this change impose a maximum limit of 256 simultaneous threads? That seems a little low for comfort. thanks -- PMM ^ permalink raw reply [flat|nested] 16+ messages in thread
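[Editorial note: the cap Peter points at follows directly from the bitmap scheme — once every bit up to MAX_CPUMASK_BITS is set, allocation has nowhere to go. A minimal sketch of such an allocator, with a deliberately tiny cap and invented names; not QEMU's implementation.]

```c
#include <assert.h>

#define TOY_MAX_CPUS 4          /* stands in for MAX_CPUMASK_BITS */

static unsigned char index_used[TOY_MAX_CPUS];

int get_free_index(void)
{
    for (int i = 0; i < TOY_MAX_CPUS; i++) {
        if (!index_used[i]) {
            index_used[i] = 1;  /* claim the lowest free index */
            return i;
        }
    }
    return -1;                  /* hard cap hit: no index available */
}

void release_index(int idx)
{
    index_used[idx] = 0;        /* exited threads free their index */
}
```

Exited threads return their index to the pool, so the cap limits only *simultaneous* threads, but it is still a hard ceiling on concurrency.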
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-14 9:54 ` Peter Maydell @ 2016-07-14 10:20 ` Bharata B Rao 2016-07-14 11:59 ` David Gibson 0 siblings, 1 reply; 16+ messages in thread From: Bharata B Rao @ 2016-07-14 10:20 UTC (permalink / raw) To: Peter Maydell; +Cc: David Gibson, Igor Mammedov, Riku Voipio, QEMU Developers On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently >> than for full system targets. This method turns out to be broken, since >> it can fairly easily result in duplicate cpu_index values for >> simultaneously active cpus (i.e. threads in the emulated process). >> >> Consider this sequence: >> Create thread 1 >> Create thread 2 >> Exit thread 1 >> Create thread 3 >> >> With the current logic thread 1 will get cpu_index 1, thread 2 will get >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 >> threads in the cpus list at the point of its creation). >> >> We mostly get away with this because cpu_index values aren't that important >> for userspace emulation. Still, it can't be good, so this patch fixes it >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full >> system targets already use. 
>> >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >> --- >> exec.c | 19 ------------------- >> 1 file changed, 19 deletions(-) >> >> diff --git a/exec.c b/exec.c >> index 011babd..e410dab 100644 >> --- a/exec.c >> +++ b/exec.c >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) >> } >> #endif >> >> -#ifndef CONFIG_USER_ONLY >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); >> >> static int cpu_get_free_index(Error **errp) >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) >> { >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); >> } >> -#else >> - >> -static int cpu_get_free_index(Error **errp) >> -{ >> - CPUState *some_cpu; >> - int cpu_index = 0; >> - >> - CPU_FOREACH(some_cpu) { >> - cpu_index++; >> - } >> - return cpu_index; >> -} >> - >> -static void cpu_release_index(CPUState *cpu) >> -{ >> - return; >> -} >> -#endif > > Won't this change impose a maximum limit of 256 simultaneous > threads? That seems a little low for comfort. This was the reason why the bitmap logic wasn't applied to CONFIG_USER_ONLY when it was introduced. https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html But then we didn't have actual removal, but we do now. Regards, Bharata. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-14 10:20 ` Bharata B Rao @ 2016-07-14 11:59 ` David Gibson 2016-07-15 22:11 ` Greg Kurz 0 siblings, 1 reply; 16+ messages in thread From: David Gibson @ 2016-07-14 11:59 UTC (permalink / raw) To: Bharata B Rao; +Cc: Peter Maydell, Igor Mammedov, Riku Voipio, QEMU Developers [-- Attachment #1: Type: text/plain, Size: 3191 bytes --] On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > >> than for full system targets. This method turns out to be broken, since > >> it can fairly easily result in duplicate cpu_index values for > >> simultaneously active cpus (i.e. threads in the emulated process). > >> > >> Consider this sequence: > >> Create thread 1 > >> Create thread 2 > >> Exit thread 1 > >> Create thread 3 > >> > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > >> threads in the cpus list at the point of its creation). > >> > >> We mostly get away with this because cpu_index values aren't that important > >> for userspace emulation. Still, it can't be good, so this patch fixes it > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > >> system targets already use. 
> >> > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > >> --- > >> exec.c | 19 ------------------- > >> 1 file changed, 19 deletions(-) > >> > >> diff --git a/exec.c b/exec.c > >> index 011babd..e410dab 100644 > >> --- a/exec.c > >> +++ b/exec.c > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > >> } > >> #endif > >> > >> -#ifndef CONFIG_USER_ONLY > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > >> > >> static int cpu_get_free_index(Error **errp) > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > >> { > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > >> } > >> -#else > >> - > >> -static int cpu_get_free_index(Error **errp) > >> -{ > >> - CPUState *some_cpu; > >> - int cpu_index = 0; > >> - > >> - CPU_FOREACH(some_cpu) { > >> - cpu_index++; > >> - } > >> - return cpu_index; > >> -} > >> - > >> -static void cpu_release_index(CPUState *cpu) > >> -{ > >> - return; > >> -} > >> -#endif > > > > Won't this change impose a maximum limit of 256 simultaneous > > threads? That seems a little low for comfort. > > This was the reason why the bitmap logic wasn't applied to > CONFIG_USER_ONLY when it was introduced. > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html Ah.. good point. Hrm, ok, my next idea would be to just (globally) sequentially allocate cpu_index values for CONFIG_USER, and never try to re-use them. Does that seem reasonable? > But then we didn't have actual removal, but we do now. You mean patch 1/2 in this set? Or something else? Even so, 256 does seem a bit low for a number of simultaneously active threads - there are some bug hairy multi-threaded programs out there. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
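[Editorial note: the never-reuse idea David floats above reduces to a global monotonic counter. A minimal sketch, assuming a 32-bit index; hypothetical code, not taken from any posted patch.]

```c
#include <assert.h>
#include <stdint.h>

/* Hand out cpu_index values sequentially and never recycle them. */
static uint32_t next_cpu_index;

uint32_t seq_alloc_index(void)
{
    /* Duplicates only become possible once the 32-bit counter wraps,
     * i.e. after ~4 billion thread creations. */
    return next_cpu_index++;
}
```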
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-14 11:59 ` David Gibson @ 2016-07-15 22:11 ` Greg Kurz 2016-07-18 1:17 ` David Gibson 0 siblings, 1 reply; 16+ messages in thread From: Greg Kurz @ 2016-07-15 22:11 UTC (permalink / raw) To: David Gibson Cc: Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers, Igor Mammedov [-- Attachment #1: Type: text/plain, Size: 3425 bytes --] On Thu, 14 Jul 2016 21:59:45 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > >> than for full system targets. This method turns out to be broken, since > > >> it can fairly easily result in duplicate cpu_index values for > > >> simultaneously active cpus (i.e. threads in the emulated process). > > >> > > >> Consider this sequence: > > >> Create thread 1 > > >> Create thread 2 > > >> Exit thread 1 > > >> Create thread 3 > > >> > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > >> threads in the cpus list at the point of its creation). > > >> > > >> We mostly get away with this because cpu_index values aren't that important > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > >> system targets already use. 
> > >> > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > >> --- > > >> exec.c | 19 ------------------- > > >> 1 file changed, 19 deletions(-) > > >> > > >> diff --git a/exec.c b/exec.c > > >> index 011babd..e410dab 100644 > > >> --- a/exec.c > > >> +++ b/exec.c > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > >> } > > >> #endif > > >> > > >> -#ifndef CONFIG_USER_ONLY > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > >> > > >> static int cpu_get_free_index(Error **errp) > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > >> { > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > >> } > > >> -#else > > >> - > > >> -static int cpu_get_free_index(Error **errp) > > >> -{ > > >> - CPUState *some_cpu; > > >> - int cpu_index = 0; > > >> - > > >> - CPU_FOREACH(some_cpu) { > > >> - cpu_index++; > > >> - } > > >> - return cpu_index; > > >> -} > > >> - > > >> -static void cpu_release_index(CPUState *cpu) > > >> -{ > > >> - return; > > >> -} > > >> -#endif > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > threads? That seems a little low for comfort. > > > > This was the reason why the bitmap logic wasn't applied to > > CONFIG_USER_ONLY when it was introduced. > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > Ah.. good point. > > Hrm, ok, my next idea would be to just (globally) sequentially > allocate cpu_index values for CONFIG_USER, and never try to re-use > them. Does that seem reasonable? > Isn't it only deferring the problem to later ? Maybe it is possible to define MAX_CPUMASK_BITS to a much higher value fo CONFIG_USER only instead ? > > But then we didn't have actual removal, but we do now. > > You mean patch 1/2 in this set? Or something else? > > Even so, 256 does seem a bit low for a number of simultaneously active > threads - there are some bug hairy multi-threaded programs out there. 
> [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-15 22:11 ` Greg Kurz @ 2016-07-18 1:17 ` David Gibson 2016-07-18 7:25 ` Igor Mammedov 2016-07-18 8:52 ` Greg Kurz 0 siblings, 2 replies; 16+ messages in thread From: David Gibson @ 2016-07-18 1:17 UTC (permalink / raw) To: Greg Kurz Cc: Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers, Igor Mammedov [-- Attachment #1: Type: text/plain, Size: 4514 bytes --] On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > On Thu, 14 Jul 2016 21:59:45 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > >> than for full system targets. This method turns out to be broken, since > > > >> it can fairly easily result in duplicate cpu_index values for > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > >> > > > >> Consider this sequence: > > > >> Create thread 1 > > > >> Create thread 2 > > > >> Exit thread 1 > > > >> Create thread 3 > > > >> > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > >> threads in the cpus list at the point of its creation). > > > >> > > > >> We mostly get away with this because cpu_index values aren't that important > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > >> system targets already use. 
> > > >> > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > >> --- > > > >> exec.c | 19 ------------------- > > > >> 1 file changed, 19 deletions(-) > > > >> > > > >> diff --git a/exec.c b/exec.c > > > >> index 011babd..e410dab 100644 > > > >> --- a/exec.c > > > >> +++ b/exec.c > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > >> } > > > >> #endif > > > >> > > > >> -#ifndef CONFIG_USER_ONLY > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > >> > > > >> static int cpu_get_free_index(Error **errp) > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > >> { > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > >> } > > > >> -#else > > > >> - > > > >> -static int cpu_get_free_index(Error **errp) > > > >> -{ > > > >> - CPUState *some_cpu; > > > >> - int cpu_index = 0; > > > >> - > > > >> - CPU_FOREACH(some_cpu) { > > > >> - cpu_index++; > > > >> - } > > > >> - return cpu_index; > > > >> -} > > > >> - > > > >> -static void cpu_release_index(CPUState *cpu) > > > >> -{ > > > >> - return; > > > >> -} > > > >> -#endif > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > threads? That seems a little low for comfort. > > > > > > This was the reason why the bitmap logic wasn't applied to > > > CONFIG_USER_ONLY when it was introduced. > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > Ah.. good point. > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > them. Does that seem reasonable? > > > > Isn't it only deferring the problem to later ? You mean that we could get duplicate indexes after the value wraps around? I suppose, but duplicates after spawning 4 billion threads seems like a substantial improvement over duplicates after spawning 3 in the wrong order.. 
> Maybe it is possible to define MAX_CPUMASK_BITS to a much higher > value for CONFIG_USER only instead ? Perhaps. It does mean carrying around a huge bitmap, though. Another option is to remove cpu_index entirely for the user only case. I have some patches for this, which are very ugly but it's possible they can be cleaned up to something reasonable (the biggest chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY for what I think are registers that aren't accessible in user mode). > > > But then we didn't have actual removal, but we do now. > > > > You mean patch 1/2 in this set? Or something else? > > > > Even so, 256 does seem a bit low for a number of simultaneously active > > threads - there are some big hairy multi-threaded programs out there. > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-18 1:17 ` David Gibson @ 2016-07-18 7:25 ` Igor Mammedov 2016-07-18 9:58 ` David Gibson 2016-07-18 8:52 ` Greg Kurz 1 sibling, 1 reply; 16+ messages in thread From: Igor Mammedov @ 2016-07-18 7:25 UTC (permalink / raw) To: David Gibson Cc: Greg Kurz, Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers On Mon, 18 Jul 2016 11:17:25 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > On Thu, 14 Jul 2016 21:59:45 +1000 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > >> than for full system targets. This method turns out to be broken, since > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > >> > > > > >> Consider this sequence: > > > > >> Create thread 1 > > > > >> Create thread 2 > > > > >> Exit thread 1 > > > > >> Create thread 3 > > > > >> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > >> threads in the cpus list at the point of its creation). > > > > >> > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > >> system targets already use. 
> > > > >> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > >> --- > > > > >> exec.c | 19 ------------------- > > > > >> 1 file changed, 19 deletions(-) > > > > >> > > > > >> diff --git a/exec.c b/exec.c > > > > >> index 011babd..e410dab 100644 > > > > >> --- a/exec.c > > > > >> +++ b/exec.c > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > >> } > > > > >> #endif > > > > >> > > > > >> -#ifndef CONFIG_USER_ONLY > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > >> > > > > >> static int cpu_get_free_index(Error **errp) > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > >> { > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > >> } > > > > >> -#else > > > > >> - > > > > >> -static int cpu_get_free_index(Error **errp) > > > > >> -{ > > > > >> - CPUState *some_cpu; > > > > >> - int cpu_index = 0; > > > > >> - > > > > >> - CPU_FOREACH(some_cpu) { > > > > >> - cpu_index++; > > > > >> - } > > > > >> - return cpu_index; > > > > >> -} > > > > >> - > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > >> -{ > > > > >> - return; > > > > >> -} > > > > >> -#endif > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > threads? That seems a little low for comfort. > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > Ah.. good point. > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > them. Does that seem reasonable? > > > > > > > Isn't it only deferring the problem to later ? > > You mean that we could get duplicate indexes after the value wraps > around? 
> > I suppose, but duplicates after spawning 4 billion threads seems like > a substantial improvement over duplicates after spawning 3 in the > wrong order.. > > > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher > > value for CONFIG_USER only instead ? > > Perhaps. It does mean carrying around a huge bitmap, though. > > Another option is to remove cpu_index entirely for the user only > case. I have some patches for this, which are very ugly but it's > possible they can be cleaned up to something reasonable (the biggest > chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY > for what I think are registers that aren't accessible in user mode). could we remove cpu_index altogether for both *-user and *-softmmu targets? > > > > > > But then we didn't have actual removal, but we do now. > > > > > > You mean patch 1/2 in this set? Or something else? > > > > > > Even so, 256 does seem a bit low for a number of simultaneously active > > > threads - there are some big hairy multi-threaded programs out there. > > > > > > > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-18 7:25 ` Igor Mammedov @ 2016-07-18 9:58 ` David Gibson 0 siblings, 0 replies; 16+ messages in thread From: David Gibson @ 2016-07-18 9:58 UTC (permalink / raw) To: Igor Mammedov Cc: Greg Kurz, Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers [-- Attachment #1: Type: text/plain, Size: 5612 bytes --] On Mon, Jul 18, 2016 at 09:25:58AM +0200, Igor Mammedov wrote: > On Mon, 18 Jul 2016 11:17:25 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > > On Thu, 14 Jul 2016 21:59:45 +1000 > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > > >> than for full system targets. This method turns out to be broken, since > > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > > >> > > > > > >> Consider this sequence: > > > > > >> Create thread 1 > > > > > >> Create thread 2 > > > > > >> Exit thread 1 > > > > > >> Create thread 3 > > > > > >> > > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > > >> threads in the cpus list at the point of its creation). > > > > > >> > > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > > >> for userspace emulation. 
Still, it can't be good, so this patch fixes it > > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > > >> system targets already use. > > > > > >> > > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > > >> --- > > > > > >> exec.c | 19 ------------------- > > > > > >> 1 file changed, 19 deletions(-) > > > > > >> > > > > > >> diff --git a/exec.c b/exec.c > > > > > >> index 011babd..e410dab 100644 > > > > > >> --- a/exec.c > > > > > >> +++ b/exec.c > > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > > >> } > > > > > >> #endif > > > > > >> > > > > > >> -#ifndef CONFIG_USER_ONLY > > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > > >> > > > > > >> static int cpu_get_free_index(Error **errp) > > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > > >> { > > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > > >> } > > > > > >> -#else > > > > > >> - > > > > > >> -static int cpu_get_free_index(Error **errp) > > > > > >> -{ > > > > > >> - CPUState *some_cpu; > > > > > >> - int cpu_index = 0; > > > > > >> - > > > > > >> - CPU_FOREACH(some_cpu) { > > > > > >> - cpu_index++; > > > > > >> - } > > > > > >> - return cpu_index; > > > > > >> -} > > > > > >> - > > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > > >> -{ > > > > > >> - return; > > > > > >> -} > > > > > >> -#endif > > > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > > threads? That seems a little low for comfort. > > > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > > > Ah.. good point. 
> > > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > > them. Does that seem reasonable? > > > > > > > > > > Isn't it only deferring the problem to later ? > > > > You mean that we could get duplicate indexes after the value wraps > > around? > > > > I suppose, but duplicates after spawning 4 billion threads seems like > > a substantial improvement over duplicates after spawning 3 in the > > wrong order.. > > > > > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher > > > value fo CONFIG_USER only instead ? > > > > Perhaps. It does mean carrying around a huge bitmap, though. > > > > Another option is to remove cpu_index entirely for the user only > > case. I have some patches for this, which are very ugly but it's > > possible they can be cleaned up to something reasonable (the biggest > > chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY > > for what I think are registers that aren't accessible in user mode). > could we remove cpu_index altogether for bot *-user and *-softmmu targets? Well.. not in the same way I'm looking at removing it for *-user, at any rate. From looking through all the users of cpu_index, nearly all of them are in two categories: 1) Labelling debug or error messages with a CPU # There's not something obvious to replace this with for *-softmmu. For *-user, however, we can use the host tid, which is probably more useful than an essentially arbitrary cpu index. 2) Initializing cpu specific registers That's "cpu specific" in both the sense of ISA specific and in the sense of specific to a particular CPU in an SMP system. These registers are generally privileged and so don't need to be emulated for *-user. Finding a substitute for *-softmmu is rather harder. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! 
http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-18 1:17 ` David Gibson 2016-07-18 7:25 ` Igor Mammedov @ 2016-07-18 8:52 ` Greg Kurz 2016-07-18 9:50 ` David Gibson 1 sibling, 1 reply; 16+ messages in thread From: Greg Kurz @ 2016-07-18 8:52 UTC (permalink / raw) To: David Gibson Cc: Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers, Igor Mammedov [-- Attachment #1: Type: text/plain, Size: 4822 bytes --] On Mon, 18 Jul 2016 11:17:25 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > On Thu, 14 Jul 2016 21:59:45 +1000 > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > >> than for full system targets. This method turns out to be broken, since > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > >> > > > > >> Consider this sequence: > > > > >> Create thread 1 > > > > >> Create thread 2 > > > > >> Exit thread 1 > > > > >> Create thread 3 > > > > >> > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > >> threads in the cpus list at the point of its creation). > > > > >> > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > >> for userspace emulation. Still, it can't be good, so this patch fixes it > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > >> system targets already use. 
> > > > >> > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > >> --- > > > > >> exec.c | 19 ------------------- > > > > >> 1 file changed, 19 deletions(-) > > > > >> > > > > >> diff --git a/exec.c b/exec.c > > > > >> index 011babd..e410dab 100644 > > > > >> --- a/exec.c > > > > >> +++ b/exec.c > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > >> } > > > > >> #endif > > > > >> > > > > >> -#ifndef CONFIG_USER_ONLY > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > >> > > > > >> static int cpu_get_free_index(Error **errp) > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > >> { > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > >> } > > > > >> -#else > > > > >> - > > > > >> -static int cpu_get_free_index(Error **errp) > > > > >> -{ > > > > >> - CPUState *some_cpu; > > > > >> - int cpu_index = 0; > > > > >> - > > > > >> - CPU_FOREACH(some_cpu) { > > > > >> - cpu_index++; > > > > >> - } > > > > >> - return cpu_index; > > > > >> -} > > > > >> - > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > >> -{ > > > > >> - return; > > > > >> -} > > > > >> -#endif > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > threads? That seems a little low for comfort. > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > Ah.. good point. > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > them. Does that seem reasonable? > > > > > > > Isn't it only deferring the problem to later ? > > You mean that we could get duplicate indexes after the value wraps > around? > Yes. 
> I suppose, but duplicates after spawning 4 billion threads seems like > a substantial improvement over duplicates after spawning 3 in the > wrong order.. > Agreed. It takes ~ 20 seconds for user QEMU to spawn 10000 threads on my palmetto box, so the wrap around would occur after ~ 100 days. :) > > Maybe it is possible to define MAX_CPUMASK_BITS to a much higher > > value for CONFIG_USER only instead ? > > Perhaps. It does mean carrying around a huge bitmap, though. > > Another option is to remove cpu_index entirely for the user only > case. I have some patches for this, which are very ugly but it's > possible they can be cleaned up to something reasonable (the biggest > chunk is moving a bunch of ARM stuff under #ifndef CONFIG_USER_ONLY > for what I think are registers that aren't accessible in user mode). > > > > > > But then we didn't have actual removal, but we do now. > > > > > > You mean patch 1/2 in this set? Or something else? > > > > > > Even so, 256 does seem a bit low for a number of simultaneously active > > > threads - there are some big hairy multi-threaded programs out there. > > > > > > > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 181 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation 2016-07-18 8:52 ` Greg Kurz @ 2016-07-18 9:50 ` David Gibson 0 siblings, 0 replies; 16+ messages in thread From: David Gibson @ 2016-07-18 9:50 UTC (permalink / raw) To: Greg Kurz Cc: Bharata B Rao, Peter Maydell, Riku Voipio, QEMU Developers, Igor Mammedov [-- Attachment #1: Type: text/plain, Size: 4559 bytes --] On Mon, Jul 18, 2016 at 10:52:39AM +0200, Greg Kurz wrote: > On Mon, 18 Jul 2016 11:17:25 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Sat, Jul 16, 2016 at 12:11:56AM +0200, Greg Kurz wrote: > > > On Thu, 14 Jul 2016 21:59:45 +1000 > > > David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > > > On Thu, Jul 14, 2016 at 03:50:56PM +0530, Bharata B Rao wrote: > > > > > On Thu, Jul 14, 2016 at 3:24 PM, Peter Maydell <peter.maydell@linaro.org> wrote: > > > > > > On 14 July 2016 at 08:57, David Gibson <david@gibson.dropbear.id.au> wrote: > > > > > >> With CONFIG_USER_ONLY, generation of cpu_index values is done differently > > > > > >> than for full system targets. This method turns out to be broken, since > > > > > >> it can fairly easily result in duplicate cpu_index values for > > > > > >> simultaneously active cpus (i.e. threads in the emulated process). > > > > > >> > > > > > >> Consider this sequence: > > > > > >> Create thread 1 > > > > > >> Create thread 2 > > > > > >> Exit thread 1 > > > > > >> Create thread 3 > > > > > >> > > > > > >> With the current logic thread 1 will get cpu_index 1, thread 2 will get > > > > > >> cpu_index 2 and thread 3 will also get cpu_index 2 (because there are 2 > > > > > >> threads in the cpus list at the point of its creation). > > > > > >> > > > > > >> We mostly get away with this because cpu_index values aren't that important > > > > > >> for userspace emulation. 
Still, it can't be good, so this patch fixes it > > > > > >> by making CONFIG_USER_ONLY use the same bitmap based allocation that full > > > > > >> system targets already use. > > > > > >> > > > > > >> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > > > > >> --- > > > > > >> exec.c | 19 ------------------- > > > > > >> 1 file changed, 19 deletions(-) > > > > > >> > > > > > >> diff --git a/exec.c b/exec.c > > > > > >> index 011babd..e410dab 100644 > > > > > >> --- a/exec.c > > > > > >> +++ b/exec.c > > > > > >> @@ -596,7 +596,6 @@ AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx) > > > > > >> } > > > > > >> #endif > > > > > >> > > > > > >> -#ifndef CONFIG_USER_ONLY > > > > > >> static DECLARE_BITMAP(cpu_index_map, MAX_CPUMASK_BITS); > > > > > >> > > > > > >> static int cpu_get_free_index(Error **errp) > > > > > >> @@ -617,24 +616,6 @@ static void cpu_release_index(CPUState *cpu) > > > > > >> { > > > > > >> bitmap_clear(cpu_index_map, cpu->cpu_index, 1); > > > > > >> } > > > > > >> -#else > > > > > >> - > > > > > >> -static int cpu_get_free_index(Error **errp) > > > > > >> -{ > > > > > >> - CPUState *some_cpu; > > > > > >> - int cpu_index = 0; > > > > > >> - > > > > > >> - CPU_FOREACH(some_cpu) { > > > > > >> - cpu_index++; > > > > > >> - } > > > > > >> - return cpu_index; > > > > > >> -} > > > > > >> - > > > > > >> -static void cpu_release_index(CPUState *cpu) > > > > > >> -{ > > > > > >> - return; > > > > > >> -} > > > > > >> -#endif > > > > > > > > > > > > Won't this change impose a maximum limit of 256 simultaneous > > > > > > threads? That seems a little low for comfort. > > > > > > > > > > This was the reason why the bitmap logic wasn't applied to > > > > > CONFIG_USER_ONLY when it was introduced. > > > > > > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg01980.html > > > > > > > > Ah.. good point. 
> > > > > > > > Hrm, ok, my next idea would be to just (globally) sequentially > > > > allocate cpu_index values for CONFIG_USER, and never try to re-use > > > > them. Does that seem reasonable? > > > > > > > > > > Isn't it only deferring the problem to later ? > > > > You mean that we could get duplicate indexes after the value wraps > > around? > > > > Yes. > > > I suppose, but duplicates after spawning 4 billion threads seems like > > a substantial improvement over duplicates after spawning 3 in the > > wrong order.. > > > > Agreed. > > It takes ~ 20 seconds to user QEMU to spawn 10000 threads on my palmetto > box, so the wrap around would occur after ~ 100 days. :) Yeah, I think we can live with that. Especially since the fact this hasn't come up before kind of indicates the duplication isn't that fatal anyway. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 16+ messages in thread
Thread overview: 16+ messages, newest 2016-07-18 9:59 UTC -- 2016-07-14 7:57 [Qemu-devel] [RFC 0/2] Fix some bugs in usermode cpu tracking David Gibson 2016-07-14 7:57 ` [Qemu-devel] [RFC 1/2] linux-user: Don't leak cpus on thread exit David Gibson 2016-07-14 9:52 ` Peter Maydell 2016-07-14 12:02 ` David Gibson 2016-07-14 13:05 ` Igor Mammedov 2016-07-15 2:53 ` David Gibson 2016-07-14 7:57 ` [Qemu-devel] [RFC 2/2] linux-user: Fix cpu_index generation David Gibson 2016-07-14 9:54 ` Peter Maydell 2016-07-14 10:20 ` Bharata B Rao 2016-07-14 11:59 ` David Gibson 2016-07-15 22:11 ` Greg Kurz 2016-07-18 1:17 ` David Gibson 2016-07-18 7:25 ` Igor Mammedov 2016-07-18 9:58 ` David Gibson 2016-07-18 8:52 ` Greg Kurz 2016-07-18 9:50 ` David Gibson