* [lttng-dev] shm leak in traced application?
From: zhenyu.ren via lttng-dev @ 2022-02-23 14:38 UTC
  To: lttng-dev

Hi,
   There are many entries such as "/dev/shm/ust-shm-consumer-81132 (deleted)" in the fd table of lttng-sessiond. I know they are the result of shm_open() and shm_unlink() in create_posix_shm().
   However, today I found that these entries also exist in a traced application, which is a long-running daemon. More importantly, there seems to be no reliable way to release the shared memory.
   I tried killing lttng-sessiond, but that does not always release the shared memory. Sometimes I need to kill the traced application to free it, but killing these applications is not a good idea.
   My questions are:
   1. Is there any way to release the shared memory without killing any traced application?
   2. Is it normal that many entries such as "/dev/shm/ust-shm-consumer-81132 (deleted)" exist in the traced application?

Thanks
zhenyu.ren
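
The "(deleted)" entries described above can be reproduced outside of LTTng. The following is a minimal sketch, not LTTng-UST code: the helper names create_unlinked_shm and fd_link_target are hypothetical, and it assumes Linux with /proc mounted. It shows that after shm_unlink() the name disappears from /dev/shm, while the memory stays allocated for as long as some process holds the fd (or a mapping of it).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a POSIX shm object, size it, then unlink its name right away.
 * Returns an open fd on success, -1 on error. The memory is released
 * only once every fd and every mapping referring to it is gone.
 */
int create_unlinked_shm(const char *name, size_t size)
{
    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) < 0) {
        close(fd);
        shm_unlink(name);
        return -1;
    }
    if (shm_unlink(name) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Read the target of /proc/self/fd/<fd> into buf (NUL-terminated).
 * Returns 0 on success, -1 on error.
 */
int fd_link_target(int fd, char *buf, size_t len)
{
    char path[64];
    ssize_t n;

    if (len == 0)
        return -1;
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

Calling fd_link_target() on a descriptor returned by create_unlinked_shm() yields the /dev/shm path followed by " (deleted)", the same form as the entries quoted in the message. Closing the descriptor (and unmapping any mapping) is what actually frees the memory.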

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

* [lttng-dev] Re: shm leak in traced application?
From: zhenyu.ren via lttng-dev @ 2022-02-23 15:08 UTC
  To: lttng-dev

> "I found these items also exist in a traced application which is a long-time running daemon"
This is the case even after lttng-sessiond has been killed!

Thanks
zhenyu.ren

* [lttng-dev] Re: Re: shm leak in traced application?
From: zhenyu.ren via lttng-dev @ 2022-02-25  4:47 UTC
  To: lttng-dev

Hi lttng-dev team,
   When lttng-sessiond exits, the UST applications should call lttng_ust_objd_table_owner_cleanup() and clean up all shm resources (unmap and close). However, I find that the UST applications keep "all" of the shm fds ("/dev/shm/ust-shm-consumer-81132 (deleted)") open and do NOT free the shm.
   If we run lttng-sessiond again, the UST applications get a new piece of shm and a new list of shm fds, so shm usage doubles. If we then kill lttng-sessiond, what most likely happens is that the UST applications close the new list of shm fds and free the new shm, but still keep the old shm. In other words, we cannot free this piece of shm unless we kill the UST applications!
   So is it possible that the UST applications failed to call lttng_ust_objd_table_owner_cleanup()? Have you ever seen this problem? Do you have any advice on freeing the shm without killing the UST applications? (I tried digging into the kernel side of shm_open and /dev/shm, but did not find any ideas.)

Thanks in advance
zhenyu.ren 




* Re: [lttng-dev] Re: Re: shm leak in traced application?
From: Jonathan Rajotte-Julien via lttng-dev @ 2022-02-25 14:21 UTC
  To: zhenyu.ren; +Cc: lttng-dev

Hi zhenyu.ren,

Please open a bug on our bug tracker and provide a reproducer against the latest
stable version (2.13.x).

https://bugs.lttng.org/

Please follow the guidelines: https://bugs.lttng.org/#Bug-reporting-guidelines

Cheers

-- 
Jonathan Rajotte-Julien
EfficiOS

* [lttng-dev] Re: Re: Re: shm leak in traced application?
From: zhenyu.ren via lttng-dev @ 2022-03-08  5:18 UTC
  To: Jonathan Rajotte-Julien; +Cc: lttng-dev

Hi,
   In shm_object_table_append_shm()/alloc_shm(), why is FD_CLOEXEC not set on the shm fds with fcntl()? I guess this omission leads to the shm fd leak.

Thanks
zhenyu.ren 

* Re: [lttng-dev] Re: Re: Re: shm leak in traced application?
From: Mathieu Desnoyers via lttng-dev @ 2022-03-08 15:17 UTC
  To: zhenyu.ren; +Cc: lttng-dev



----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

> Hi,
> In shm_object_table_append_shm()/alloc_shm(), why not calling FD_CLOEXEC fcntl()
> to shmfds? I guess this omission leads to shm fds leak.

Those file descriptors are created when received by ustcomm_recv_fds_unix_sock, and
immediately after creation they are set as FD_CLOEXEC.
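
For reference, marking a descriptor close-on-exec after it has been created is a two-step fcntl() sequence. The following is a generic sketch of that pattern, not the actual lttng-ust code; set_cloexec is a hypothetical helper name.

```c
#include <fcntl.h>

/*
 * Set FD_CLOEXEC on an already-open descriptor while preserving any
 * other descriptor flags. Returns 0 on success, -1 on error.
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

Because the flag is applied by a separate system call, the descriptor exists without FD_CLOEXEC for a short time between its creation and the fcntl() call.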

We should continue this discussion in the bug tracker as suggested by Jonathan.
It would greatly help if you can provide a small reproducer.

Thanks,

Mathieu


-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* [lttng-dev] Re: Re: Re: Re: shm leak in traced application?
From: zhenyu.ren via lttng-dev @ 2022-03-09  1:29 UTC
  To: Mathieu Desnoyers; +Cc: lttng-dev

Thanks a lot for the reply. I have not replied in the bug tracker because I have not yet found a reliable way to reproduce the leak.

* Re: [lttng-dev] Re: Re: Re: Re: shm leak in traced application?
From: Mathieu Desnoyers via lttng-dev @ 2022-03-09 16:37 UTC
  To: zhenyu.ren; +Cc: lttng-dev

When this happens, is the process holding a single (or very few) shm file
references, or references to many shm files?
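
One way to answer this question from outside the process is to scan /proc/<pid>/fd and count the link targets that start with a given prefix. The following is a minimal sketch, assuming Linux with /proc mounted and sufficient permission to read the target process's fd directory; count_fds_with_prefix is a hypothetical helper name, not part of LTTng.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Count the open descriptors of process `pid` whose /proc link target
 * starts with `prefix`. Returns the count, or -1 if /proc/<pid>/fd
 * cannot be opened (for example, for lack of permission).
 */
int count_fds_with_prefix(pid_t pid, const char *prefix)
{
    char dir_path[64], link_path[384], target[4096];
    struct dirent *ent;
    int count = 0;
    DIR *dir;

    snprintf(dir_path, sizeof(dir_path), "/proc/%ld/fd", (long) pid);
    dir = opendir(dir_path);
    if (!dir)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        ssize_t n;

        if (ent->d_name[0] == '.')
            continue;   /* skip "." and ".." */
        snprintf(link_path, sizeof(link_path), "%s/%s",
                 dir_path, ent->d_name);
        n = readlink(link_path, target, sizeof(target) - 1);
        if (n < 0)
            continue;   /* fd was closed while we were scanning */
        target[n] = '\0';
        if (strncmp(target, prefix, strlen(prefix)) == 0)
            count++;
    }
    closedir(dir);
    return count;
}
```

For the case discussed in this thread, the prefix would be "/dev/shm/ust-shm-consumer-", so that both live and "(deleted)" entries are counted.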

I wonder if you end up in a scenario where an application very frequently
performs exec(), so that the exec() sometimes happens in the window between
the reception of the file descriptor over the unix socket and the call to
fcntl() that sets FD_CLOEXEC.

Thanks, 

Mathieu 

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 


* Re: [lttng-dev] Re: Re: Re: Re: shm leak in traced application?
  2022-03-09 16:37             ` Mathieu Desnoyers via lttng-dev
@ 2022-03-09 17:07               ` Mathieu Desnoyers via lttng-dev
  2022-03-10  3:19               ` [lttng-dev] Re: Re: " zhenyu.ren via lttng-dev
  1 sibling, 0 replies; 13+ messages in thread
From: Mathieu Desnoyers via lttng-dev @ 2022-03-09 17:07 UTC (permalink / raw)
  To: zhenyu.ren; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 6427 bytes --]

Hi Zhenyu, 

Can you please try this fix?

https://review.lttng.org/c/lttng-ust/+/7530

Let me know how it goes.

Thanks, 

Mathieu 

----- On Mar 9, 2022, at 11:37 AM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: 

> When this happens, is the process holding a single shm file reference (or
> very few), or references to many shm files?

> I wonder if you end up in a scenario where an application very frequently
> performs exec(), so that the exec() sometimes happens in the window between
> the reception of the file descriptors over the unix socket and the call to
> fcntl() that sets FD_CLOEXEC.
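
One way to close the window described above, sketched here as an illustration and not as the content of the linked fix: on Linux, recvmsg() accepts the MSG_CMSG_CLOEXEC flag, which makes the kernel set close-on-exec on descriptors received through SCM_RIGHTS before the process can see them. The helper names send_fd() and recv_fd_cloexec() are hypothetical and are not LTTng-UST functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one descriptor over a connected Unix socket. */
int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment of buf */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one descriptor. MSG_CMSG_CLOEXEC (Linux)
 * sets FD_CLOEXEC atomically, so no exec() can slip in between the
 * reception and a later fcntl() call.
 * Returns the new descriptor, or -1 on error.
 */
int recv_fd_cloexec(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, MSG_CMSG_CLOEXEC) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
			cmsg->cmsg_type != SCM_RIGHTS ||
			cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

With a plain recvmsg() followed by fcntl(F_SETFD), another thread calling fork() and exec() between the two calls would pass the descriptor on to the new program; the flag removes that gap on kernels that support it.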

> Thanks,

> Mathieu

> ----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:

>> Thanks a lot for the reply. I did not reply in the bug tracker because I have
>> not yet found a reliable way to reproduce the leak.

>>> ------------------------------------------------------------------
>>> From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>>> Sent: Tuesday, March 8, 2022 23:26
>>> To: zhenyu.ren <zhenyu.ren@aliyun.com>
>>> Cc: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev
>>> <lttng-dev@lists.lttng.org>
>>> Subject: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?

>>> ----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

>>> > Hi,
>>> > In shm_object_table_append_shm()/alloc_shm(), why is fcntl() not called to
>>> > set FD_CLOEXEC on the shm fds? I guess this omission leads to the shm fd leak.

>>> Those file descriptors are created when they are received by
>>> ustcomm_recv_fds_unix_sock(), and they are set FD_CLOEXEC immediately after
>>> creation.
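
As an illustration of the "set FD_CLOEXEC right after reception" step, here is a minimal sketch. It is not the actual body of ustcomm_recv_fds_unix_sock(); set_cloexec() and set_cloexec_all() are hypothetical helper names, and the fds/nb_fd parameters stand in for whatever array the receiving code fills.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Set FD_CLOEXEC on one descriptor while preserving its other fd flags. */
int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/*
 * Hypothetical helper: mark every descriptor of a freshly received batch.
 * Returns 0 on success, -1 at the first descriptor that cannot be marked.
 */
int set_cloexec_all(const int *fds, size_t nb_fd)
{
	size_t i;

	assert(fds != NULL || nb_fd == 0);
	for (i = 0; i < nb_fd; i++) {
		if (set_cloexec(fds[i]) < 0)
			return -1;
	}
	return 0;
}
```

Doing this once at the reception point covers every descriptor type that arrives over the socket (shm fds and wakeup pipes alike), so later code does not need its own fcntl() calls. It still leaves a short window between reception and the fcntl() call.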

>>> We should continue this discussion in the bug tracker as suggested by Jonathan.
>>> It would greatly help if you can provide a small reproducer.

>>> Thanks,

>>> Mathieu

>>> > Thanks
>>> > zhenyu.ren


-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 

[-- Attachment #1.2: Type: text/html, Size: 11943 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?
  2022-03-09 16:37             ` Mathieu Desnoyers via lttng-dev
  2022-03-09 17:07               ` Mathieu Desnoyers via lttng-dev
@ 2022-03-10  3:19               ` zhenyu.ren via lttng-dev
  2022-03-10  4:24                 ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
  1 sibling, 1 reply; 13+ messages in thread
From: zhenyu.ren via lttng-dev @ 2022-03-10  3:19 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 7622 bytes --]

> When this happens, is the process holding a single shm file reference (or very few), or references to many shm files?

It is holding references to "all" of the shm files, not just one or a few.
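
For context on why a leaked descriptor pins the memory: once a file has been unlinked, its storage is only released when the last descriptor and mapping referring to it are gone. The sketch below shows this with an ordinary temporary file. On Linux, glibc backs shm_open() with files under /dev/shm, so the same rule applies to the "(deleted)" ust-shm entries. create_unlinked_file() is a hypothetical name, not an LTTng function.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a file from a mkstemp() template, unlink it right away, then
 * size it. The name disappears immediately, but the storage stays
 * allocated until every descriptor (and mapping) referring to it is
 * released.
 * Returns the open descriptor, or -1 on error.
 */
int create_unlinked_file(char *path_template, off_t size)
{
	int fd;

	assert(path_template != NULL);
	fd = mkstemp(path_template);	/* fills in the XXXXXX suffix */
	if (fd < 0)
		return -1;
	if (unlink(path_template) < 0 || ftruncate(fd, size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A process that inherits such a descriptor across exec() and never closes it keeps the whole buffer alive, which matches the symptom of memory that is only freed when the application exits.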

In fact, I tried the following fix yesterday, and it seems to work.

--- a/lttng-ust/libringbuffer/shm.c
+++ b/lttng-ust/libringbuffer/shm.c
@@ -32,7 +32,6 @@
 #include <lttng/align.h>
 #include <limits.h>
 #include <helper.h>
-
 /*
  * Ensure we have the required amount of space available by writing 0
  * into the entire buffer. Not doing so can trigger SIGBUS when going
@@ -122,6 +121,12 @@ struct shm_object *_shm_object_table_alloc_shm(struct shm_object_table *table,
        /* create shm */

        shmfd = stream_fd;
+       if (shmfd >= 0) {
+               ret = fcntl(shmfd, F_SETFD, FD_CLOEXEC);
+               if (ret < 0) {
+                       PERROR("fcntl shmfd FD_CLOEXEC");
+               }
+       }
        ret = zero_file(shmfd, memory_map_size);
        if (ret) {
                PERROR("zero_file");
@@ -272,15 +277,22 @@ struct shm_object *shm_object_table_append_shm(struct shm_object_table *table,
        obj->shm_fd = shm_fd;
        obj->shm_fd_ownership = 1;

+       if (shm_fd >= 0) {
+               ret = fcntl(shm_fd, F_SETFD, FD_CLOEXEC);
+               if (ret < 0) {
+                       PERROR("fcntl shmfd FD_CLOEXEC");
+                       //goto error_fcntl;
+               }
+       }
        ret = fcntl(obj->wait_fd[1], F_SETFD, FD_CLOEXEC);
        if (ret < 0) {

As the context above shows, wait_fd[1] is set FD_CLOEXEC by fcntl(), but shm_fd is not. Why does your patch handle wait_fd but not shm_fd? As far as I know, wait_fd is just a pipe and is not related to the shm resource.








[-- Attachment #1.2: Type: text/html, Size: 25976 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [lttng-dev] Re: Re: Re: Re: Re: Re: shm leak in traced application?
  2022-03-10  3:19               ` [lttng-dev] Re: Re: " zhenyu.ren via lttng-dev
@ 2022-03-10  4:24                 ` zhenyu.ren via lttng-dev
  2022-03-10 14:31                   ` Mathieu Desnoyers via lttng-dev
  0 siblings, 1 reply; 13+ messages in thread
From: zhenyu.ren via lttng-dev @ 2022-03-10  4:24 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 8146 bytes --]

Oh, I see. I am running an old UST (2.7), so my ustcomm_recv_fds_unix_sock() does not set FD_CLOEXEC.
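
A quick way to check whether a given build still leaves descriptors inheritable is to audit the process's own descriptor table. The following is a Linux-only sketch that walks /proc/self/fd; count_inheritable_fds() is a hypothetical helper and is not part of LTTng-UST.

```c
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Count the descriptors of the calling process that would survive an
 * exec(), i.e. those without FD_CLOEXEC. Linux-specific: relies on
 * /proc/self/fd. Returns the count, or -1 if the directory is unreadable.
 */
int count_inheritable_fds(void)
{
	DIR *dir = opendir("/proc/self/fd");
	struct dirent *entry;
	int count = 0;

	if (!dir)
		return -1;
	while ((entry = readdir(dir)) != NULL) {
		char *end;
		long fd = strtol(entry->d_name, &end, 10);
		int flags;

		if (end == entry->d_name || *end != '\0')
			continue;	/* skip "." and ".." */
		assert(fd >= 0);	/* entry names are descriptor numbers */
		if (fd == dirfd(dir))
			continue;	/* skip the directory stream's own fd */
		flags = fcntl((int) fd, F_GETFD);
		if (flags >= 0 && !(flags & FD_CLOEXEC))
			count++;
	}
	closedir(dir);
	return count;
}
```

Calling it before and after the tracer registers (for instance from a debug hook in the application) should show no increase once the descriptors received from the session daemon are marked close-on-exec.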

Thanks very much!!!
zhenyu.ren

[-- Attachment #1.2: Type: text/html, Size: 31231 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [lttng-dev] Re: Re: Re: Re: Re: Re: shm leak in traced application?
  2022-03-10  4:24                 ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
@ 2022-03-10 14:31                   ` Mathieu Desnoyers via lttng-dev
  2022-03-11  2:08                     ` [lttng-dev] Re: Re: " zhenyu.ren via lttng-dev
  0 siblings, 1 reply; 13+ messages in thread
From: Mathieu Desnoyers via lttng-dev @ 2022-03-10 14:31 UTC (permalink / raw)
  To: zhenyu.ren; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 9547 bytes --]

Hi Zhenyu, 

This is exactly why Jonathan and I asked you to file a bug report on the bug tracker
and follow the bug reporting guidelines (https://lttng.org/community/#bug-reporting-guidelines).

This saves time for everyone. 

Thanks, 

Mathieu 

----- On Mar 9, 2022, at 11:24 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote: 

> Oh, I see. I have an old ust(2.7). So I have no FD_CLOEXEC in
> ustcomm_recv_fds_unix_sock().

> Thanks very much!!!
> zhenyu.ren

>> ------------------------------------------------------------------
>> 发件人:zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> 发送时间:2022年3月10日(星期四) 11:24
>> 收件人:Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> 抄 送:lttng-dev <lttng-dev@lists.lttng.org>
>> 主 题:[lttng-dev] 回复:回复: 回复: 回复: 回复: shm leak in traced application?

>>> When this happpens, is the process holding a single (or very few) shm file
>> > references, or references to many shm files ?

>> It is holding "all" of shm files' reference , neither a single one nor some few
>> ones.

>> In fact, yesterday, I tried to fix it as the following and it seems work.

>> --- a/lttng-ust/libringbuffer/shm.c

>> +++ b/lttng-ust/libringbuffer/shm.c

>> @@ -32,7 +32,6 @@

>> #include <lttng/align.h>

>> #include <limits.h>

>> #include <helper.h>

>> -

>> /*

>> * Ensure we have the required amount of space available by writing 0

>> * into the entire buffer. Not doing so can trigger SIGBUS when going

>> @@ -122,6 +121,12 @@ struct shm_object *_shm_object_table_alloc_shm(struct
>> shm_object_table *table,

>> /* create shm */

>> shmfd = stream_fd;

>> + if (shmfd >= 0) {

>> + ret = fcntl(shmfd, F_SETFD, FD_CLOEXEC);

>> + if (ret < 0) {

>> + PERROR("fcntl shmfd FD_CLOEXEC");

>> + }

>> + }

>> ret = zero_file(shmfd, memory_map_size);

>> if (ret) {

>> PERROR("zero_file");

>> @@ -272,15 +277,22 @@ struct shm_object *shm_object_table_append_shm(struct
>> shm_object_table *table,

>> obj->shm_fd = shm_fd;

>> obj->shm_fd_ownership = 1;

>> + if (shm_fd >= 0) {

>> + ret = fcntl(shm_fd, F_SETFD, FD_CLOEXEC);

>> + if (ret < 0) {

>> + PERROR("fcntl shmfd FD_CLOEXEC");

>> + //goto error_fcntl;

>> + }

>> + }

>> ret = fcntl(obj->wait_fd[1], F_SETFD, FD_CLOEXEC);

>> if (ret < 0) {

>> As it shows, wait_fd[1] has been set FD_CLOEXEC by fcntl() but not shm_fd. Why
>> your patch do with wait_fd but not shm_fd? As far as I know, wait_fd is just a
>> pipe and it seems not related to shm resource.

>> ------------------------------------------------------------------
>> 发件人:Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> 发送时间:2022年3月10日(星期四) 00:46
>> 收件人:zhenyu.ren <zhenyu.ren@aliyun.com>
>> 抄 送:Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev
>> <lttng-dev@lists.lttng.org>
>> 主 题:Re: 回复:[lttng-dev] 回复: 回复: 回复: shm leak in traced application?

>> When this happpens, is the process holding a single (or very few) shm file
>> references, or references to many
>> shm files ?

>> I wonder if you end up in a scenario where an application very frequently
>> performs exec(), and therefore
>> sometimes the exec() will happen in the window between the unix socket file
>> descriptor reception and
>> call to fcntl FD_CLOEXEC.

>> Thanks,

>> Mathieu

>> ----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:
>> Thanks a lot for reply. I do not reply it in bug tracker since I have not gotten
>> a reliable way to reproduce the leak case.
>> ------------------------------------------------------------------
>> 发件人:Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> 发送时间:2022年3月8日(星期二) 23:26
>> 收件人:zhenyu.ren <zhenyu.ren@aliyun.com>
>> 抄 送:Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev
>> <lttng-dev@lists.lttng.org>
>> 主 题:Re: [lttng-dev] 回复: 回复: 回复: shm leak in traced application?

>> ----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

>> > Hi,
>> > In shm_object_table_append_shm()/alloc_shm(), why not calling FD_CLOEXEC fcntl()
>> > to shmfds? I guess this omission leads to shm fds leak.

>> Those file descriptors are created when received by ustcomm_recv_fds_unix_sock,
>> and
>> immediately after creation they are set as FD_CLOEXEC.

>> We should continue this discussion in the bug tracker as suggested by Jonathan.
>> It would greatly help if you can provide a small reproducer.

>> Thanks,

>> Mathieu

>> > Thanks
>> > zhenyu.ren

>> >> ------------------------------------------------------------------
>> >> 发件人:Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com>
>> >> 发送时间:2022年2月25日(星期五) 22:31
>> >> 收件人:zhenyu.ren <zhenyu.ren@aliyun.com>
>> >> 抄 送:lttng-dev <lttng-dev@lists.lttng.org>
>> >> 主 题:Re: [lttng-dev] 回复: 回复: shm leak in traced application?

>> >> Hi zhenyu.ren,

>> >> Please open a bug on our bug tracker and provide a reproducer against the latest
>> >> stable version (2.13.x).

>> >> [ https://bugs.lttng.org/ | https://bugs.lttng.org/ ]

>>>> Please follow the guidelines: [ https://bugs.lttng.org/#Bug-reporting-guidelines
>> >> | https://bugs.lttng.org/#Bug-reporting-guidelines ]

>> >> Cheers

>> >> On Fri, Feb 25, 2022 at 12:47:34PM +0800, zhenyu.ren via lttng-dev wrote:
>> >> > Hi, lttng-dev team
>> >>> When lttng-sessiond exits, the ust applications should call
>> >>> lttng_ust_objd_table_owner_cleanup() and clean up all shm resource(unmap and
>> >>> close). Howerver I do find that the ust applications keep opening "all" of the
>> >> > shm fds("/dev/shm/ust-shm-consumer-81132 (deleted)") and do NOT free shm.
>> >>> If we run lttng-sessiond again, ust applications can get a new piece of shm and
>> >>> a new list of shm fds so double shm usages. Then if we kill lttng-sessiond,
>> >>> what the mostlikely happened is ust applications close the new list of shm fds
>> >>> and free new shm resource but keeping old shm still. In other word, we can not
>> >> > free this piece of shm unless we killing ust applications!!!
>> >>> So Is there any possilbe that ust applications failed calling
>> >>> lttng_ust_objd_table_owner_cleanup()? Do you have ever see this problem? Do you
>> >>> have any advice to free the shm without killling ust applications(I tried to
>> >> > dig into kernel shm_open and /dev/shm, but not found any ideas)?

>> >> > Thanks in advance
>> >> > zhenyu.ren

>> >> > ------------------------------------------------------------------
>> >> > 发件人:zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> >> > 发送时间:2022年2月23日(星期三) 23:09
>> >> > 收件人:lttng-dev <lttng-dev@lists.lttng.org>
>> >> > 主 题:[lttng-dev] 回复: shm leak in traced application?

>> >>> >"I found these items also exist in a traced application which is a long-time
>> >> > >running daemon"
>> >> > Even if lttng-sessiond has been killed!!

>> >> > Thanks
>> >> > zhenyu.ren
>> >> > ------------------------------------------------------------------
>> >> > 发件人:zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> >> > 发送时间:2022年2月23日(星期三) 22:44
>> >> > 收件人:lttng-dev <lttng-dev@lists.lttng.org>
>> >> > 主 题:[lttng-dev] shm leak in traced application?

>> >> > Hi,
>> >> > There are many entries such as "/dev/shm/ust-shm-consumer-81132 (deleted)"
>> >> > in the lttng-sessiond fd table. I know they are the result of shm_open() and
>> >> > shm_unlink() in create_posix_shm().
>> >> > However, today I found that these entries also exist in a traced application, which is
>> >> > a long-running daemon. Most importantly, there
>> >> > seems to be no reliable way to release the shared memory.
>> >> > I tried killing lttng-sessiond, but that does not always release the shared memory. Sometimes I
>> >> > need to kill the traced application to free the shared memory, but it is not a
>> >> > good idea to kill these applications.
>> >> > My questions are:
>> >> > 1. Is there any way to release the shared memory without killing any traced
>> >> > application?
>> >> > 2. Is it normal that many entries such as "/dev/shm/ust-shm-consumer-81132
>> >> > (deleted)" exist in the traced application?

>> >> > Thanks
>> >> > zhenyu.ren


>> >> --
>> >> Jonathan Rajotte-Julien
>> >> EfficiOS

-- 
Mathieu Desnoyers 
EfficiOS Inc. 
http://www.efficios.com 



* [lttng-dev] Re: Re: Re: Re: Re: Re: Re: shm leak in traced application?
  2022-03-10 14:31                   ` Mathieu Desnoyers via lttng-dev
@ 2022-03-11  2:08                     ` zhenyu.ren via lttng-dev
  0 siblings, 0 replies; 13+ messages in thread
From: zhenyu.ren via lttng-dev @ 2022-03-11  2:08 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 9758 bytes --]

Hi, Mathieu and Jonathan

    I am sorry about that; I should have given you the UST version in the first place. We chose LTTng for tracing a long time ago, so we are still on a very old version, 2.7. On a few occasions we reported issues and stated that version, but the answer was always similar: upgrade to the latest release (I know you no longer maintain the old versions). Upgrading UST is very difficult for us because it is linked into so many production applications. I think UST 2.7 is robust enough and only needs a few small fixes; this time, for example, I need a single patch to ustcomm_recv_fds_unix_sock(). Again, I am very sorry that you spent so much time thinking about our case. LTTng is the best tracing toolset in the world.

Thanks
zhenyu.ren
------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Thursday, March 10, 2022, 22:41
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: Re: [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?

Hi Zhenyu,

This is exactly why Jonathan and I asked you to file a bug report on the bug tracker
and follow the bug reporting guidelines (https://lttng.org/community/#bug-reporting-guidelines).

This saves time for everyone.

Thanks,

Mathieu

----- On Mar 9, 2022, at 11:24 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:

Oh, I see. I have an old UST (2.7), so ustcomm_recv_fds_unix_sock() does not set FD_CLOEXEC in my version.

Thanks very much!!!
zhenyu.ren
------------------------------------------------------------------
From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
Sent: Thursday, March 10, 2022, 11:24
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>
Subject: [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?

>When this happens, is the process holding a single (or very few) shm file references, or references to many shm files?

It is holding references to "all" of the shm files, not just one or a few.
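
One way to check how many unlinked shm files a process still references is to walk its /proc fd directory and count the links that end in " (deleted)". The following is only a rough sketch for Linux; count_deleted_fds() and make_unlinked_file() are hypothetical helper names and are not part of lttng-ust. The second helper merely imitates the create-then-unlink pattern with a temporary file under /tmp, so the counter can be tried without a tracing session.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the entries of fd_dir (for example
 * "/proc/self/fd" or "/proc/<pid>/fd") whose link target starts with
 * prefix and ends with " (deleted)". Returns -1 if fd_dir cannot be opened.
 */
static int count_deleted_fds(const char *fd_dir, const char *prefix)
{
	static const char suffix[] = " (deleted)";
	const size_t suffix_len = sizeof(suffix) - 1;
	char link_path[4096], target[4096];
	struct dirent *entry;
	int count = 0;
	DIR *dir;

	assert(fd_dir != NULL && prefix != NULL);
	dir = opendir(fd_dir);
	if (!dir)
		return -1;
	while ((entry = readdir(dir)) != NULL) {
		ssize_t len;

		if (entry->d_name[0] == '.')
			continue;	/* Skip "." and "..". */
		snprintf(link_path, sizeof(link_path), "%s/%s",
			fd_dir, entry->d_name);
		len = readlink(link_path, target, sizeof(target) - 1);
		if (len < 0)
			continue;	/* The fd may have been closed meanwhile. */
		target[len] = '\0';
		if ((size_t) len >= suffix_len
				&& strncmp(target, prefix, strlen(prefix)) == 0
				&& strcmp(target + len - suffix_len, suffix) == 0)
			count++;
	}
	closedir(dir);
	return count;
}

/*
 * Hypothetical helper: create a file and unlink it right away while
 * keeping the fd open. The storage stays allocated until the last
 * descriptor referring to it is closed.
 */
static int make_unlinked_file(void)
{
	char path[] = "/tmp/unlinked-demo-XXXXXX";
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (unlink(path) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

For the case discussed here, the call would presumably look like count_deleted_fds("/proc/<pid>/fd", "/dev/shm/ust-shm-consumer-"), with <pid> replaced by the pid of the traced application.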

In fact, yesterday I tried the following fix, and it seems to work.

--- a/lttng-ust/libringbuffer/shm.c
+++ b/lttng-ust/libringbuffer/shm.c
@@ -32,7 +32,6 @@
 #include <lttng/align.h>
 #include <limits.h>
 #include <helper.h>
-
 /*
  * Ensure we have the required amount of space available by writing 0
  * into the entire buffer. Not doing so can trigger SIGBUS when going
@@ -122,6 +121,12 @@ struct shm_object *_shm_object_table_alloc_shm(struct shm_object_table *table,
        /* create shm */

        shmfd = stream_fd;
+    if (shmfd >= 0) {
+     ret = fcntl(shmfd, F_SETFD, FD_CLOEXEC);
+     if (ret < 0) {
+   PERROR("fcntl shmfd FD_CLOEXEC");
+     }
+    }
        ret = zero_file(shmfd, memory_map_size);
        if (ret) {
                PERROR("zero_file");
@@ -272,15 +277,22 @@ struct shm_object *shm_object_table_append_shm(struct shm_object_table *table,
        obj->shm_fd = shm_fd;
        obj->shm_fd_ownership = 1;

+    if (shm_fd >= 0) {
+     ret = fcntl(shm_fd, F_SETFD, FD_CLOEXEC);
+     if (ret < 0) {
+   PERROR("fcntl shmfd FD_CLOEXEC");
+   //goto error_fcntl;
+     }
+    }
        ret = fcntl(obj->wait_fd[1], F_SETFD, FD_CLOEXEC);
        if (ret < 0) {

    As the context lines show, wait_fd[1] is already set to FD_CLOEXEC with fcntl(), but shm_fd is not. Why does your code do this for wait_fd but not for shm_fd? As far as I know, wait_fd is just a pipe and does not seem to be related to the shm resource.





------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Thursday, March 10, 2022, 00:46
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?

When this happens, is the process holding a single (or very few) shm file references, or references to many
shm files?

I wonder if you end up in a scenario where an application very frequently performs exec(), and therefore
sometimes the exec() will happen in the window between the unix socket file descriptor reception and
the call to fcntl() that sets FD_CLOEXEC.
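
On Linux, a window of this kind between receiving a descriptor and marking it close-on-exec can be avoided by passing MSG_CMSG_CLOEXEC to recvmsg(), which sets the flag atomically at reception. The following is a minimal sketch under that assumption and is not the lttng-ust implementation; send_one_fd(), recv_one_fd_cloexec() and fd_has_cloexec() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: pass one fd over a unix socket with SCM_RIGHTS. */
static int send_one_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* Forces correct alignment. */
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(sock >= 0 && fd >= 0);
	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one fd. MSG_CMSG_CLOEXEC (Linux-specific)
 * makes the kernel install the new descriptor with close-on-exec already
 * set, so no exec() can slip in before a later fcntl() call.
 */
static int recv_one_fd_cloexec(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	assert(sock >= 0);
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);
	if (recvmsg(sock, &msg, MSG_CMSG_CLOEXEC) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET
			|| cmsg->cmsg_type != SCM_RIGHTS
			|| cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Returns nonzero if fd has the close-on-exec flag set. */
static int fd_has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	return flags >= 0 && (flags & FD_CLOEXEC);
}
```

A descriptor obtained through recv_one_fd_cloexec() should report the flag through fd_has_cloexec() immediately, without any separate fcntl(F_SETFD) step.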

Thanks,

Mathieu

----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:
Thanks a lot for the reply. I have not replied in the bug tracker because I do not yet have a reliable way to reproduce the leak.
------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Tuesday, March 8, 2022, 23:26
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?



----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

> Hi,
> In shm_object_table_append_shm()/alloc_shm(), why is fcntl() with FD_CLOEXEC not
> called on the shm fds? I guess this omission leads to the shm fd leak.

Those file descriptors are created when received by ustcomm_recv_fds_unix_sock, and
immediately after creation they are set as FD_CLOEXEC.
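
As a rough illustration of marking a batch of received descriptors close-on-exec right after reception, consider the sketch below. It is not the body of ustcomm_recv_fds_unix_sock; set_cloexec() and set_cloexec_all() are hypothetical helper names. The read-modify-write with F_GETFD keeps any other descriptor flags intact.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: add FD_CLOEXEC to the flags of one descriptor. */
static int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/*
 * Hypothetical helper: apply set_cloexec() to every descriptor in fds.
 * Stops at the first failure and returns -1, otherwise returns 0.
 */
static int set_cloexec_all(const int *fds, size_t nr_fds)
{
	size_t i;

	assert(fds != NULL || nr_fds == 0);
	for (i = 0; i < nr_fds; i++) {
		if (set_cloexec(fds[i]) < 0)
			return -1;
	}
	return 0;
}
```

A caller would run set_cloexec_all() on the array filled in from the SCM_RIGHTS control message before handing the descriptors to any other code.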

We should continue this discussion in the bug tracker as suggested by Jonathan.
It would greatly help if you can provide a small reproducer.

Thanks,

Mathieu


> Thanks
> zhenyu.ren

>> ------------------------------------------------------------------
>> From: Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com>
>> Sent: Friday, February 25, 2022, 22:31
>> To: zhenyu.ren <zhenyu.ren@aliyun.com>
>> Cc: lttng-dev <lttng-dev@lists.lttng.org>
>> Subject: Re: [lttng-dev] Re: Re: shm leak in traced application?

>> Hi zhenyu.ren,

>> Please open a bug on our bug tracker and provide a reproducer against the latest
>> stable version (2.13.x).

>> https://bugs.lttng.org/

>> Please follow the guidelines: https://bugs.lttng.org/#Bug-reporting-guidelines

>> Cheers

>> On Fri, Feb 25, 2022 at 12:47:34PM +0800, zhenyu.ren via lttng-dev wrote:
>> > Hi, lttng-dev team
>> > When lttng-sessiond exits, the ust applications should call
>> > lttng_ust_objd_table_owner_cleanup() and clean up all shm resources (unmap and
>> > close). However, I find that the ust applications keep "all" of the
>> > shm fds ("/dev/shm/ust-shm-consumer-81132 (deleted)") open and do NOT free the shm.
>> > If we run lttng-sessiond again, the ust applications get a new piece of shm and
>> > a new list of shm fds, doubling the shm usage. Then, if we kill lttng-sessiond,
>> > what most likely happens is that the ust applications close the new list of shm fds
>> > and free the new shm resources but still keep the old shm. In other words, we cannot
>> > free this piece of shm unless we kill the ust applications!
>> > So is it possible that the ust applications failed to call
>> > lttng_ust_objd_table_owner_cleanup()? Have you ever seen this problem? Do you
>> > have any advice on freeing the shm without killing the ust applications (I tried to
>> > dig into kernel shm_open and /dev/shm, but did not find any ideas)?

>> > Thanks in advance
>> > zhenyu.ren



>> > ------------------------------------------------------------------
>> > From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> > Sent: Wednesday, February 23, 2022, 23:09
>> > To: lttng-dev <lttng-dev@lists.lttng.org>
>> > Subject: [lttng-dev] Re: shm leak in traced application?

>> > >"I found these items also exist in a traced application which is a long-time
>> > >running daemon"
>> > Even if lttng-sessiond has been killed!!

>> > Thanks
>> > zhenyu.ren
>> > ------------------------------------------------------------------
>> > From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> > Sent: Wednesday, February 23, 2022, 22:44
>> > To: lttng-dev <lttng-dev@lists.lttng.org>
>> > Subject: [lttng-dev] shm leak in traced application?

>> > Hi,
>> > There are many entries such as "/dev/shm/ust-shm-consumer-81132 (deleted)"
>> > in the lttng-sessiond fd table. I know they are the result of shm_open() and
>> > shm_unlink() in create_posix_shm().
>> > However, today I found that these entries also exist in a traced application, which is
>> > a long-running daemon. Most importantly, there
>> > seems to be no reliable way to release the shared memory.
>> > I tried killing lttng-sessiond, but that does not always release the shared memory. Sometimes I
>> > need to kill the traced application to free the shared memory, but it is not a
>> > good idea to kill these applications.
>> > My questions are:
>> > 1. Is there any way to release the shared memory without killing any traced
>> > application?
>> > 2. Is it normal that many entries such as "/dev/shm/ust-shm-consumer-81132
>> > (deleted)" exist in the traced application?

>> > Thanks
>> > zhenyu.ren




>> --
>> Jonathan Rajotte-Julien
>> EfficiOS

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com



end of thread, other threads:[~2022-03-11  2:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-23 14:38 [lttng-dev] shm leak in traced application? zhenyu.ren via lttng-dev
2022-02-23 15:08 ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
2022-02-25  4:47   ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
2022-02-25 14:21     ` Jonathan Rajotte-Julien via lttng-dev
2022-03-08  5:18       ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
2022-03-08 15:17         ` Mathieu Desnoyers via lttng-dev
2022-03-09  1:29           ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
2022-03-09 16:37             ` Mathieu Desnoyers via lttng-dev
2022-03-09 17:07               ` Mathieu Desnoyers via lttng-dev
2022-03-10  3:19               ` [lttng-dev] Re: Re: " zhenyu.ren via lttng-dev
2022-03-10  4:24                 ` [lttng-dev] Re: " zhenyu.ren via lttng-dev
2022-03-10 14:31                   ` Mathieu Desnoyers via lttng-dev
2022-03-11  2:08                     ` [lttng-dev] Re: Re: " zhenyu.ren via lttng-dev
