xenomai.lists.linux.dev archive mirror
* Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-02-21 14:13 UTC
  To: xenomai

Hi all,

I'm using Xenomai 3.2.2 at commit ec804ffbd, kernel 5.4.228, Cobalt
mode, on an Atom x5-E8000 (x86-64).

Since I started using the commit named in the subject, I get a subtle
segfault when I close my application.

I managed to reduce the case to the simple code attached to this message.
If you run the script, xeno-test-session-segfault-mainproc almost always
segfaults at exit.

Analyzing the core dump, the backtrace is always the same:

#0  0x00007ff729498400 in __hash_key () from /usr/lib/libcobalt.so.2
#1  0x00007ff729498897 in hash_remove () from /usr/lib/libcobalt.so.2
#2  0x00007ff7294b49ac in syncluster_delobj () from /usr/lib/libcopperplate.so.0
#3  0x00007ff7294ce06d in ?? () from /usr/lib/libalchemy.so.0
#4  0x00007ff7294b6774 in semobj_destroy () from /usr/lib/libcopperplate.so.0
#5  0x00007ff7294ce216 in rt_sem_delete () from /usr/lib/libalchemy.so.0
#6  0x0000559ca0298651 in SubDummyFn (arg=<optimized out>) at main_proc.c:55
#7  0x00007ff7294cca0a in ?? () from /usr/lib/libalchemy.so.0
#8  0x00007ff7294b5ce9 in ?? () from /usr/lib/libcopperplate.so.0
#9  0x00007ff729494f7e in ?? () from /usr/lib/libcobalt.so.2
#10 0x00007ff729461ea4 in start_thread (arg=<optimized out>) at pthread_create.c:477
#11 0x00007ff72937859f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reverting the commit named in the subject makes the segfault disappear.

Applying the commit to the 3.1 branch results in the same problem.

This seems to be a corruption of the session memory shared between
processes, caused by the atexit() callback introduced by the commit.

Some notes:
- it seems to depend on when xeno-test-session-segfault-secproc exits:
if it exits while the SubDummy tasks are starting, the segfault occurs,
otherwise it does not
- it seems to be related to the number of times
xeno-test-session-segfault-secproc is run: with only one run (change
i == 2 to i == 1 at main_proc.c:139 and remove one call from the
script), the problem does not happen
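
To illustrate the kind of corruption suspected here, the following is a
toy model only. It is not Xenomai code: the session layout and every
toy_* name are made up for illustration. It assumes a session area that
holds both an allocation pool and a registry visible to every process,
and shows that releasing an object without unregistering it leaves a
stale registry entry behind for other processes to trip over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a session area shared by several processes. */
struct toy_session {
	bool in_use[TOY_SLOTS];   /* allocation state of each slot */
	int registry[TOY_SLOTS];  /* slot indices published to all processes */
	size_t nr_registered;     /* number of valid registry entries */
};

/* Allocate a slot and publish it in the registry; returns -1 when full. */
int toy_alloc_registered(struct toy_session *s)
{
	if (s->nr_registered == TOY_SLOTS)
		return -1;

	for (int slot = 0; slot < TOY_SLOTS; slot++) {
		if (!s->in_use[slot]) {
			s->in_use[slot] = true;
			s->registry[s->nr_registered++] = slot;
			return slot;
		}
	}

	return -1;
}

/* Release the slot but leave its registry entry in place (suspected bug). */
void toy_free_only(struct toy_session *s, int slot)
{
	assert(slot >= 0 && slot < TOY_SLOTS);
	s->in_use[slot] = false;
}

/* Count registry entries that refer to a slot that is no longer allocated. */
size_t toy_stale_entries(const struct toy_session *s)
{
	size_t stale = 0;

	for (size_t i = 0; i < s->nr_registered; i++) {
		if (!s->in_use[s->registry[i]])
			stale++;
	}

	return stale;
}
```

In this model, a process that exits through toy_free_only() leaves one
stale entry, and any later registry walk by another process would reach
released memory. Whether this is what actually happens in the real
session heap is only my guess.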

Thanks in advance, regards.

-- 
Mauro S.

[-- Attachment #2: session_segfault_test.tar.xz --]
[-- Type: application/x-xz, Size: 2420 bytes --]

* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-02-28 12:10 UTC
  To: xenomai

Hi all,

just a kind ping.

Is there anything I can do to help solve this problem, or do you have
any pointers?

Thanks in advance, regards

-- 
Mauro S.


* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Jan Kiszka @ 2023-02-28 12:51 UTC
  To: Mauro S., xenomai

Sorry, the original email didn't make it to my inbox, but I see it now
in the archives. I will try to have a look, but would also welcome it if
someone else could debug deeper.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux


* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Jan Kiszka @ 2023-03-21 18:38 UTC
  To: Mauro S., xenomai

Finally found time to debug. This seems to resolve the issue:

diff --git a/lib/copperplate/threadobj.c b/lib/copperplate/threadobj.c
index 6808bcf164..db18f4ffa3 100644
--- a/lib/copperplate/threadobj.c
+++ b/lib/copperplate/threadobj.c
@@ -1773,7 +1773,10 @@ int threadobj_set_schedprio(struct threadobj *thobj, int priority)
 #ifdef CONFIG_XENO_PSHARED
 static void main_exit(void)
 {
-	threadobj_free(threadobj_current());
+	struct threadobj *thobj = threadobj_current();
+
+	sysgroup_remove(thread, &thobj->memspec);
+	threadobj_free(thobj);
 }
 #endif
 

Can you confirm this? I hope we are not missing more things. Needs a 
second check.
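
The ordering the patch relies on, unlink from the shared group first and
only then free, can be sketched outside of Xenomai as follows. This is a
minimal illustration under assumptions: the toy_* types and functions are
hypothetical, a plain circular list stands in for the real sysgroup, and
malloc()/free() stand in for the session heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical intrusive list link, standing in for a group membership. */
struct toy_link {
	struct toy_link *prev, *next;
};

/* Hypothetical thread object that is a member of a group. */
struct toy_thread {
	struct toy_link link;
	int id;
};

/* Initialize an empty circular group list. */
void toy_group_init(struct toy_link *head)
{
	head->prev = head->next = head;
}

/* Insert the thread at the front of the group. */
void toy_group_add(struct toy_link *head, struct toy_thread *t)
{
	t->link.next = head->next;
	t->link.prev = head;
	head->next->prev = &t->link;
	head->next = &t->link;
}

/* Unlink the thread so that no neighbour points at it any more. */
void toy_group_remove(struct toy_thread *t)
{
	t->link.prev->next = t->link.next;
	t->link.next->prev = t->link.prev;
	t->link.prev = t->link.next = &t->link;
}

/* Walk the group and count its members. */
size_t toy_group_count(const struct toy_link *head)
{
	size_t n = 0;

	for (const struct toy_link *p = head->next; p != head; p = p->next)
		n++;

	return n;
}

/* Allocate a thread object and add it to the group; NULL on failure. */
struct toy_thread *toy_thread_new(struct toy_link *head, int id)
{
	struct toy_thread *t = malloc(sizeof(*t));

	if (t == NULL)
		return NULL;

	t->id = id;
	toy_group_add(head, t);

	return t;
}

/* Same ordering as in the patch: leave the group first, then free. */
void toy_release(struct toy_thread *t)
{
	assert(t != NULL);
	toy_group_remove(t);
	free(t);
}
```

Calling free() without toy_group_remove() would leave the neighbours
pointing at released memory, which is the situation the patch is meant
to avoid for the main thread object.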

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux


* Re: Segfault on exit after commit ec804ffbd "lib/copperplate: Release main threadobj on exit"
From: Mauro S. @ 2023-03-22 18:41 UTC
  To: xenomai; +Cc: Jan Kiszka

Hi Jan,

thank you very much. I confirm that this patch fixes the problem.

Thanks again, best regards

-- 
Mauro S.

