From: "zhenyu.ren via lttng-dev" <lttng-dev@lists.lttng.org>
To: "Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>
Cc: "lttng-dev" <lttng-dev@lists.lttng.org>
Subject: [lttng-dev] Re: Re: Re: Re: Re: Re: Re: shm leak in traced application?
Date: Fri, 11 Mar 2022 10:08:05 +0800
Message-ID: <09fecb83-300d-4941-9316-fd3b71b9b807.zhenyu.ren@aliyun.com>
In-Reply-To: <955901820.138250.1646922697445.JavaMail.zimbra@efficios.com>


Hi, Mathieu and Jonathan

    I am sorry about that; I should have given you the ust version in the first place. We chose lttng for tracing a long time ago, so we are stuck on a very old version, namely 2.7. There were occasions when we reported issues together with the version, but we got similar answers telling us to upgrade to the latest release (I know you no longer maintain the old versions). It is very difficult for us to upgrade ust to a new version, since it is linked into so many production apps. I think ust 2.7 is robust enough and only needs some small fixes; this time, for example, I need a single patch to ustcomm_recv_fds_unix_sock(). Again, I am very sorry that you spent so much time thinking about our case. Lttng is the best tracing toolset in the world.

Thanks
zhenyu.ren
------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Thursday, March 10, 2022 22:41
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: Re: [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?

Hi Zhenyu,

This is exactly why Jonathan and I asked you to file a bug report on the bug tracker
and follow the bug reporting guidelines (https://lttng.org/community/#bug-reporting-guidelines).

This saves time for everyone.

Thanks,

Mathieu

----- On Mar 9, 2022, at 11:24 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:

Oh, I see. I have an old ust (2.7), so there is no FD_CLOEXEC handling in ustcomm_recv_fds_unix_sock().

Thanks very much!!!
zhenyu.ren
------------------------------------------------------------------
From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
Sent: Thursday, March 10, 2022 11:24
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: lttng-dev <lttng-dev@lists.lttng.org>
Subject: [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?

>When this happens, is the process holding a single (or very few) shm file references, or references to many shm files?

It is holding references to "all" of the shm files, not just one or a few.

In fact, I tried the following fix yesterday, and it seems to work.

--- a/lttng-ust/libringbuffer/shm.c
+++ b/lttng-ust/libringbuffer/shm.c
@@ -32,7 +32,6 @@
 #include <lttng/align.h>
 #include <limits.h>
 #include <helper.h>
-
 /*
  * Ensure we have the required amount of space available by writing 0
  * into the entire buffer. Not doing so can trigger SIGBUS when going
@@ -122,6 +121,12 @@ struct shm_object *_shm_object_table_alloc_shm(struct shm_object_table *table,
        /* create shm */

        shmfd = stream_fd;
+       if (shmfd >= 0) {
+               ret = fcntl(shmfd, F_SETFD, FD_CLOEXEC);
+               if (ret < 0) {
+                       PERROR("fcntl shmfd FD_CLOEXEC");
+               }
+       }
        ret = zero_file(shmfd, memory_map_size);
        if (ret) {
                PERROR("zero_file");
@@ -272,15 +277,22 @@ struct shm_object *shm_object_table_append_shm(struct shm_object_table *table,
        obj->shm_fd = shm_fd;
        obj->shm_fd_ownership = 1;

+       if (shm_fd >= 0) {
+               ret = fcntl(shm_fd, F_SETFD, FD_CLOEXEC);
+               if (ret < 0) {
+                       PERROR("fcntl shmfd FD_CLOEXEC");
+                       //goto error_fcntl;
+               }
+       }
        ret = fcntl(obj->wait_fd[1], F_SETFD, FD_CLOEXEC);
        if (ret < 0) {

    As the context shows, wait_fd[1] is set to FD_CLOEXEC by fcntl(), but shm_fd is not. Why does your patch handle wait_fd but not shm_fd? As far as I know, wait_fd is just a pipe, and it does not seem to be related to the shm resource.





------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Thursday, March 10, 2022 00:46
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?

When this happens, is the process holding a single (or very few) shm file references, or references to many
shm files?

I wonder if you end up in a scenario where an application very frequently performs exec(), and therefore
sometimes the exec() will happen in the window between the reception of the file descriptor over the unix
socket and the call to fcntl() that sets FD_CLOEXEC.
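
To illustrate the window described above, here is a minimal sketch of one general Linux technique for avoiding it: passing MSG_CMSG_CLOEXEC to recvmsg() so that the kernel sets FD_CLOEXEC at the moment the descriptor is installed, rather than in a later fcntl() call. This is not taken from the lttng-ust sources; send_one_fd() and recv_one_fd_cloexec() are hypothetical helper names, and MSG_CMSG_CLOEXEC is Linux-specific.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ancillary-data buffer sized and aligned for exactly one fd. */
union fd_cmsg_buf {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper (not lttng-ust code): send one fd with SCM_RIGHTS. */
static int send_one_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	/* One payload byte travels with the ancillary data. */
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one fd. MSG_CMSG_CLOEXEC asks the kernel to
 * set FD_CLOEXEC while installing the fd, so no separate fcntl() is needed.
 * Returns the new fd, or -1 on error.
 */
static int recv_one_fd_cloexec(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg_buf u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, MSG_CMSG_CLOEXEC) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

With a receive path like this, a descriptor obtained over the socket is never visible without FD_CLOEXEC, so an exec() started from another thread at any point would not inherit it.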

Thanks,

Mathieu

----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren <zhenyu.ren@aliyun.com> wrote:
Thanks a lot for the reply. I did not reply in the bug tracker because I have not yet found a reliable way to reproduce the leak.
------------------------------------------------------------------
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Sent: Tuesday, March 8, 2022 23:26
To: zhenyu.ren <zhenyu.ren@aliyun.com>
Cc: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>; lttng-dev <lttng-dev@lists.lttng.org>
Subject: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?



----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:

> Hi,
> In shm_object_table_append_shm()/alloc_shm(), why is fcntl() with FD_CLOEXEC not
> called on the shm fds? I guess this omission leads to the shm fd leak.

Those file descriptors are created when received by ustcomm_recv_fds_unix_sock, and
immediately after creation they are set as FD_CLOEXEC.
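
As a rough illustration of what "set as FD_CLOEXEC" means for a single descriptor, the sketch below sets the flag with fcntl() and reads it back. It is a generic POSIX example, not lttng-ust code; set_cloexec() and has_cloexec() are hypothetical helper names introduced only for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an lttng-ust function): set FD_CLOEXEC on fd
 * while preserving any other descriptor flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl()).
 */
static int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: report whether FD_CLOEXEC is set on fd.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
static int has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A check like has_cloexec() can help confirm, on a given build, whether the shm descriptors held by a traced process actually carry the flag.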

We should continue this discussion in the bug tracker as suggested by Jonathan.
It would greatly help if you can provide a small reproducer.

Thanks,

Mathieu


> Thanks
> zhenyu.ren

>> ------------------------------------------------------------------
>> From: Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com>
>> Sent: Friday, February 25, 2022 22:31
>> To: zhenyu.ren <zhenyu.ren@aliyun.com>
>> Cc: lttng-dev <lttng-dev@lists.lttng.org>
>> Subject: Re: [lttng-dev] Re: Re: shm leak in traced application?

>> Hi zhenyu.ren,

>> Please open a bug on our bug tracker and provide a reproducer against the latest
>> stable version (2.13.x).

>> https://bugs.lttng.org/

>> Please follow the guidelines: https://bugs.lttng.org/#Bug-reporting-guidelines

>> Cheers

>> On Fri, Feb 25, 2022 at 12:47:34PM +0800, zhenyu.ren via lttng-dev wrote:
>> > Hi, lttng-dev team
>> > When lttng-sessiond exits, the ust applications should call
>> > lttng_ust_objd_table_owner_cleanup() and clean up all shm resources (unmap and
>> > close). However, I find that the ust applications keep "all" of the
>> > shm fds ("/dev/shm/ust-shm-consumer-81132 (deleted)") open and do NOT free the shm.
>> > If we run lttng-sessiond again, the ust applications get a new piece of shm and
>> > a new list of shm fds, so shm usage doubles. Then, if we kill lttng-sessiond,
>> > what most likely happens is that the ust applications close the new list of shm fds
>> > and free the new shm resource but still keep the old shm. In other words, we cannot
>> > free this piece of shm unless we kill the ust applications!!!
>> > So is it possible that the ust applications failed to call
>> > lttng_ust_objd_table_owner_cleanup()? Have you ever seen this problem? Do you
>> > have any advice on freeing the shm without killing the ust applications (I tried to
>> > dig into the kernel's shm_open and /dev/shm, but did not find any ideas)?

>> > Thanks in advance
>> > zhenyu.ren



>> > ------------------------------------------------------------------
>> > From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> > Sent: Wednesday, February 23, 2022 23:09
>> > To: lttng-dev <lttng-dev@lists.lttng.org>
>> > Subject: [lttng-dev] Re: shm leak in traced application?

>> > > "I found these items also exist in a traced application which is a long-time
>> > > running daemon"
>> > Even if lttng-sessiond has been killed!!

>> > Thanks
>> > zhenyu.ren
>> > ------------------------------------------------------------------
>> > From: zhenyu.ren via lttng-dev <lttng-dev@lists.lttng.org>
>> > Sent: Wednesday, February 23, 2022 22:44
>> > To: lttng-dev <lttng-dev@lists.lttng.org>
>> > Subject: [lttng-dev] shm leak in traced application?

>> > Hi,
>> > There are many entries such as "/dev/shm/ust-shm-consumer-81132 (deleted)"
>> > among the fds of lttng-sessiond. I know they are the result of shm_open() and
>> > shm_unlink() in create_posix_shm().
>> > However, today I found that these entries also exist in a traced application, which is
>> > a long-running daemon. The most important thing I found is that there
>> > seems to be no reliable way to release the shared memory.
>> > I tried killing lttng-sessiond, but that does not always release the shared memory. Sometimes I
>> > need to kill the traced application to free the shared memory... But it is not a
>> > good idea to kill these applications.
>> > My questions are:
>> > 1. Is there any way to release the shared memory without killing any traced
>> > application?
>> > 2. Is it normal that many entries such as "/dev/shm/ust-shm-consumer-81132
>> > (deleted)" exist in the traced application?

>> > Thanks
>> > zhenyu.ren



>> > _______________________________________________
>> > lttng-dev mailing list
>> > lttng-dev@lists.lttng.org
>> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

>> --
>> Jonathan Rajotte-Julien
>> EfficiOS
-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com



