* Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-02-21 14:13 UTC
To: xenomai
Hi all,
I'm using Xenomai 3.2.2 at commit ec804ffbd, kernel 5.4.228, in Cobalt
mode, on an Atom x5-E8000 (x86-64).
Since I started using the commit named in the subject, I get a subtle
segfault when I close my application.
I managed to reduce the case to the simple code attached to this message.
If you run the script, xeno-test-session-segfault-mainproc almost always
segfaults at exit.
In every core dump I analyzed, the backtrace is the same:
#0  0x00007ff729498400 in __hash_key () from /usr/lib/libcobalt.so.2
#1  0x00007ff729498897 in hash_remove () from /usr/lib/libcobalt.so.2
#2  0x00007ff7294b49ac in syncluster_delobj () from /usr/lib/libcopperplate.so.0
#3  0x00007ff7294ce06d in ?? () from /usr/lib/libalchemy.so.0
#4  0x00007ff7294b6774 in semobj_destroy () from /usr/lib/libcopperplate.so.0
#5  0x00007ff7294ce216 in rt_sem_delete () from /usr/lib/libalchemy.so.0
#6  0x0000559ca0298651 in SubDummyFn (arg=<optimized out>) at main_proc.c:55
#7  0x00007ff7294cca0a in ?? () from /usr/lib/libalchemy.so.0
#8  0x00007ff7294b5ce9 in ?? () from /usr/lib/libcopperplate.so.0
#9  0x00007ff729494f7e in ?? () from /usr/lib/libcobalt.so.2
#10 0x00007ff729461ea4 in start_thread (arg=<optimized out>) at pthread_create.c:477
#11 0x00007ff72937859f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Reverting the commit named in the subject makes the segfault disappear.
Applying the commit to the 3.1 branch results in the same problem.
It seems to be a corruption of the session memory shared between
processes, caused by the atexit() callback introduced by the commit.
Some notes:
- it seems to depend on when xeno-test-session-segfault-secproc exits:
if it exits while the SubDummy tasks are starting, the segfault occurs,
otherwise it does not
- it seems to be related to the number of times
xeno-test-session-segfault-secproc is invoked: with only one invocation
(change i == 2 to i == 1 at main_proc.c:139 and remove one call from the
script), the problem does not happen
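One way such corruption could arise, offered here as an assumption and not
as a diagnosis, is that an object is released at exit while a registry in
the shared session memory still links to it. The toy model below sketches
that idea. It is not Xenomai code: toy_obj, toy_registry and every function
name are made up for illustration, and a flag stands in for free() so that
the demo never touches freed memory itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an object living in memory shared by several processes. */
struct toy_obj {
	struct toy_obj *prev, *next;	/* links in the toy registry */
	bool released;			/* set instead of calling free() */
};

/* Circular list head, standing in for a registry kept in session memory. */
struct toy_registry {
	struct toy_obj head;
};

static void registry_init(struct toy_registry *r)
{
	r->head.prev = r->head.next = &r->head;
	r->head.released = false;
}

/* Append o at the tail of the registry. */
static void registry_add(struct toy_registry *r, struct toy_obj *o)
{
	assert(o != &r->head);
	o->released = false;
	o->next = &r->head;
	o->prev = r->head.prev;
	r->head.prev->next = o;
	r->head.prev = o;
}

/* Unlink o from whatever registry it is on. */
static void registry_remove(struct toy_obj *o)
{
	o->prev->next = o->next;
	o->next->prev = o->prev;
	o->prev = o->next = o;
}

/* Mark the storage as given back, without touching the registry. */
static void toy_release(struct toy_obj *o)
{
	o->released = true;
}

/* Count registry entries whose storage has already been released. */
static int registry_count_stale(const struct toy_registry *r)
{
	const struct toy_obj *o;
	int stale = 0;

	for (o = r->head.next; o != &r->head; o = o->next)
		if (o->released)
			stale++;
	return stale;
}

/*
 * Run the scenario once and return the number of stale entries left.
 * unlink_first == false releases the object while it is still linked;
 * unlink_first == true removes it from the registry before releasing it.
 */
static int run_exit_scenario(bool unlink_first)
{
	struct toy_registry reg;
	struct toy_obj main_obj, other_obj;

	registry_init(&reg);
	registry_add(&reg, &main_obj);
	registry_add(&reg, &other_obj);

	if (unlink_first)
		registry_remove(&main_obj);
	toy_release(&main_obj);

	return registry_count_stale(&reg);
}
```

Calling run_exit_scenario(false) leaves one stale entry that any later
walker of the registry would follow, whereas run_exit_scenario(true) leaves
none. In a real program the stale entry would point at freed memory, so the
damage would show up only later and in an unrelated code path.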
Thanks in advance, regards.
--
Mauro S.
[-- Attachment #2: session_segfault_test.tar.xz --]
[-- Type: application/x-xz, Size: 2420 bytes --]
* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-02-28 12:10 UTC
To: xenomai
Hi all,
just a gentle ping.
Is there anything I can do to help solve this problem, or do you have
any pointers?
Thanks in advance, regards
--
Mauro S.
* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Jan Kiszka @ 2023-02-28 12:51 UTC
To: Mauro S., xenomai
On 28.02.23 13:10, Mauro S. wrote:
> Hi all,
>
> just a kind ping.
>
> Is there anything I can do to try to solve this problem / do you have
> some indications?
>
> Thanks in advance, regards
>
Sorry, the original email didn't make it to my inbox, but I see it now
in the archives. Will try to have a look but would also welcome if
someone else can debug deeper.
Jan
--
Siemens AG, Technology
Competence Center Embedded Linux
* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Jan Kiszka @ 2023-03-21 18:38 UTC
To: Mauro S., xenomai
On 28.02.23 13:51, Jan Kiszka wrote:
> Sorry, the original email didn't make it to my inbox, but I see it now
> in the archives. Will try to have a look but would also welcome if
> someone else can debug deeper.
>
Finally found time to debug. This seems to resolve the issue:
diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
index 6808bcf164..db18f4ffa3 100644
--- a/lib/copperplate/threadobj.c
+++ b/lib/copperplate/threadobj.c
@@ -1773,7 +1773,10 @@ int threadobj_set_schedprio(struct threadobj *thobj, int priority)
 #ifdef CONFIG_XENO_PSHARED
 static void main_exit(void)
 {
-	threadobj_free(threadobj_current());
+	struct threadobj *thobj = threadobj_current();
+
+	sysgroup_remove(thread, &thobj->memspec);
+	threadobj_free(thobj);
 }
 #endif
Can you confirm this? I hope we are not missing more things. Needs a
second check.
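
The ordering in the patch (drop the object from the group list first, free
it second) can be sketched in isolation as follows. This is a toy model
under stated assumptions, not the library code: toy_thread, toy_group and
the toy_* functions are hypothetical names, and the list is an ordinary
process-local one where the real group list sits in shared session memory.
Only atexit(), calloc() and free() are real calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical descriptor; not Xenomai's struct threadobj. */
struct toy_thread {
	struct toy_thread *next;	/* link in the toy group list */
};

/* Toy group list and the descriptor of the main thread. */
static struct toy_thread *toy_group;
static struct toy_thread *toy_main_thread;

static void toy_group_add(struct toy_thread *t)
{
	t->next = toy_group;
	toy_group = t;
}

/* Unlink t from the group; returns 0 on success, -1 if t was not listed. */
static int toy_group_remove(struct toy_thread *t)
{
	struct toy_thread **pp;

	for (pp = &toy_group; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->next = NULL;
			return 0;
		}
	}
	return -1;
}

static int toy_group_count(void)
{
	const struct toy_thread *t;
	int n = 0;

	for (t = toy_group; t != NULL; t = t->next)
		n++;
	return n;
}

/* Exit handler: unlink first, free second. Safe to run more than once. */
static void toy_main_exit(void)
{
	struct toy_thread *t = toy_main_thread;

	if (t == NULL)
		return;
	toy_group_remove(t);
	toy_main_thread = NULL;
	free(t);
}

/* Allocate the main descriptor, list it, and register the exit handler. */
static int toy_main_init(void)
{
	struct toy_thread *t = calloc(1, sizeof(*t));

	if (t == NULL)
		return -1;
	toy_group_add(t);
	toy_main_thread = t;
	return atexit(toy_main_exit);
}
```

After toy_main_init(), the group holds one entry; once toy_main_exit() has
run, the group is empty and nothing in it refers to the freed descriptor.
Swapping the two steps in the handler would leave a dangling entry behind
for whoever walks the list next.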
Jan
--
Siemens AG, Technology
Competence Center Embedded Linux
* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-03-22 18:41 UTC
To: xenomai; +Cc: Jan Kiszka
On 21/03/23 19:38, Jan Kiszka wrote:
> Finally found time to debug. This seems to resolve the issue:
>
> diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
> index 6808bcf164..db18f4ffa3 100644
> --- a/lib/copperplate/threadobj.c
> +++ b/lib/copperplate/threadobj.c
> @@ -1773,7 +1773,10 @@ int threadobj_set_schedprio(struct threadobj *thobj, int priority)
>  #ifdef CONFIG_XENO_PSHARED
>  static void main_exit(void)
>  {
> -	threadobj_free(threadobj_current());
> +	struct threadobj *thobj = threadobj_current();
> +
> +	sysgroup_remove(thread, &thobj->memspec);
> +	threadobj_free(thobj);
>  }
>  #endif
>
>
> Can you confirm this? I hope we are not missing more things. Needs a
> second check.
>
> Jan
>
Hi Jan,
thank you very much. I confirm that this patch fixes the problem.
Thanks again, best regards
--
Mauro S.