From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 10 Mar 2022 09:31:37 -0500 (EST)
From: Mathieu Desnoyers via lttng-dev
Reply-To: Mathieu Desnoyers
To: "zhenyu.ren"
Cc: lttng-dev
Message-ID: <955901820.138250.1646922697445.JavaMail.zimbra@efficios.com>
In-Reply-To: <401d796b-8f3c-453f-82f3-bf79e01a25d5.zhenyu.ren@aliyun.com>
References: <20220225142111.GC1861057@x>
 <26341add-b962-4027-8c5e-28d940e8f4dc.zhenyu.ren@aliyun.com>
 <2119663162.129405.1646752637020.JavaMail.zimbra@efficios.com>
 <1a87b3ee-9983-4db6-b569-e6e6c1ab8411.zhenyu.ren@aliyun.com>
 <816104861.134170.1646843822208.JavaMail.zimbra@efficios.com>
 <401d796b-8f3c-453f-82f3-bf79e01a25d5.zhenyu.ren@aliyun.com>
Subject: Re: [lttng-dev] Re: Re: Re: Re: Re: Re: shm leak in traced application?
List-Id: LTTng development list
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============8356484169375359530=="

--===============8356484169375359530==
Content-Type: multipart/alternative; boundary="=_16bd5163-5882-4279-83e8-efe4a14e4cbe"

--=_16bd5163-5882-4279-83e8-efe4a14e4cbe
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Hi Zhenyu,

This is exactly why Jonathan and I asked you to file a bug report on the bug tracker
and follow the bug reporting guidelines (https://lttng.org/community/#bug-reporting-guidelines).

This saves time for everyone.

Thanks,

Mathieu

----- On Mar 9, 2022, at 11:24 PM, zhenyu.ren wrote:

> Oh, I see. I have an old ust (2.7), so I have no FD_CLOEXEC in
> ustcomm_recv_fds_unix_sock().
> Thanks very much!!!
> zhenyu.ren
>> ------------------------------------------------------------------
>> From: zhenyu.ren via lttng-dev
>> Sent: Thursday, March 10, 2022 11:24
>> To: Mathieu Desnoyers
>> Cc: lttng-dev
>> Subject: [lttng-dev] Re: Re: Re: Re: Re: shm leak in traced application?
>> > When this happens, is the process holding a single (or very few) shm file
>> > references, or references to many shm files?
>> It is holding references to "all" of the shm files, not just a single one or a
>> few.
>> In fact, yesterday I tried to fix it as follows, and it seems to work.
>> --- a/lttng-ust/libringbuffer/shm.c
>> +++ b/lttng-ust/libringbuffer/shm.c
>> @@ -32,7 +32,6 @@
>>  #include <lttng/align.h>
>>  #include <limits.h>
>>  #include <helper.h>
>> -
>>  /*
>>   * Ensure we have the required amount of space available by writing 0
>>   * into the entire buffer. Not doing so can trigger SIGBUS when going
>> @@ -122,6 +121,12 @@ struct shm_object *_shm_object_table_alloc_shm(struct shm_object_table *table,
>>  	/* create shm */
>>  
>>  	shmfd = stream_fd;
>> +	if (shmfd >= 0) {
>> +		ret = fcntl(shmfd, F_SETFD, FD_CLOEXEC);
>> +		if (ret < 0) {
>> +			PERROR("fcntl shmfd FD_CLOEXEC");
>> +		}
>> +	}
>>  	ret = zero_file(shmfd, memory_map_size);
>>  	if (ret) {
>>  		PERROR("zero_file");
>> @@ -272,15 +277,22 @@ struct shm_object *shm_object_table_append_shm(struct shm_object_table *table,
>>  	obj->shm_fd = shm_fd;
>>  	obj->shm_fd_ownership = 1;
>>  
>> +	if (shm_fd >= 0) {
>> +		ret = fcntl(shm_fd, F_SETFD, FD_CLOEXEC);
>> +		if (ret < 0) {
>> +			PERROR("fcntl shmfd FD_CLOEXEC");
>> +			//goto error_fcntl;
>> +		}
>> +	}
>>  	ret = fcntl(obj->wait_fd[1], F_SETFD, FD_CLOEXEC);
>>  	if (ret < 0) {
>> As it shows, wait_fd[1] has been set FD_CLOEXEC by fcntl(), but shm_fd has not. Why
>> does your patch handle wait_fd but not shm_fd? As far as I know, wait_fd is just a
>> pipe, and it does not seem related to the shm resource.
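For reference, here is a minimal C sketch of marking a descriptor close-on-exec, which is what the quoted patch does for shmfd and shm_fd. The helper names set_cloexec() and is_cloexec() are hypothetical and are not part of lttng-ust. Unlike the patch, the sketch reads the current descriptor flags first so that any other flag is preserved. This is only an illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an lttng-ust function): mark fd as
 * close-on-exec while preserving any other descriptor flags.
 * Returns 0 on success, -1 on error (errno is set by fcntl()).
 */
int set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: returns 1 if fd is close-on-exec, 0 if it is
 * not, and -1 on error.
 */
int is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A descriptor marked this way is closed automatically in the new program image after a successful exec(), so a long-running daemon that re-executes itself would not carry the shm descriptors along.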
>> ------------------------------------------------------------------
>> From: Mathieu Desnoyers
>> Sent: Thursday, March 10, 2022 00:46
>> To: zhenyu.ren
>> Cc: Jonathan Rajotte; lttng-dev
>> Subject: Re: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?
>> When this happens, is the process holding a single (or very few) shm file
>> references, or references to many
>> shm files?
>> I wonder if you end up in a scenario where an application very frequently
>> performs exec(), and therefore
>> sometimes the exec() will happen in the window between the unix socket file
>> descriptor reception and the
>> call to fcntl FD_CLOEXEC.
>> Thanks,
>> Mathieu
>> ----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren wrote:
>> Thanks a lot for the reply. I did not reply in the bug tracker since I have not yet
>> found a reliable way to reproduce the leak.
>> ------------------------------------------------------------------
>> From: Mathieu Desnoyers
>> Sent: Tuesday, March 8, 2022 23:26
>> To: zhenyu.ren
>> Cc: Jonathan Rajotte; lttng-dev
>> Subject: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?
>> ----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:
>> > Hi,
>> > In shm_object_table_append_shm()/alloc_shm(), why not call fcntl() with FD_CLOEXEC
>> > on the shm fds? I guess this omission leads to the shm fd leak.
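The window mentioned earlier in this thread (an exec() landing between reception of the descriptor on the unix socket and the fcntl FD_CLOEXEC call) can be avoided on Linux by passing MSG_CMSG_CLOEXEC to recvmsg(), which makes the kernel set close-on-exec on the received descriptor as it installs it. The following is a sketch under that assumption only. It is not a description of what ustcomm_recv_fds_unix_sock() does in any lttng-ust version, and send_fd() and recv_fd_cloexec() are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send one descriptor with SCM_RIGHTS. 0 on success. */
int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: receive one descriptor. MSG_CMSG_CLOEXEC is
 * Linux-specific; with it there is no moment at which the new
 * descriptor exists without FD_CLOEXEC. Returns the fd, or -1.
 */
int recv_fd_cloexec(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, MSG_CMSG_CLOEXEC) != 1)
		return -1;
	if (msg.msg_flags & MSG_CTRUNC)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

With a plain recvmsg() followed by fcntl(fd, F_SETFD, FD_CLOEXEC), another thread calling exec() between the two calls would leak the descriptor into the new program, which matches the scenario hypothesized in the thread.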
>> Those file descriptors are created when received by ustcomm_recv_fds_unix_sock,
>> and
>> immediately after creation they are set as FD_CLOEXEC.
>> We should continue this discussion in the bug tracker as suggested by Jonathan.
>> It would greatly help if you can provide a small reproducer.
>> Thanks,
>> Mathieu
>> > Thanks
>> > zhenyu.ren
>> >> ------------------------------------------------------------------
>> >> From: Jonathan Rajotte-Julien
>> >> Sent: Friday, February 25, 2022 22:31
>> >> To: zhenyu.ren
>> >> Cc: lttng-dev
>> >> Subject: Re: [lttng-dev] Re: Re: shm leak in traced application?
>> >> Hi zhenyu.ren,
>> >> Please open a bug on our bug tracker and provide a reproducer against the latest
>> >> stable version (2.13.x).
>> >> https://bugs.lttng.org/
>> >> Please follow the guidelines: https://bugs.lttng.org/#Bug-reporting-guidelines
>> >> Cheers
>> >> On Fri, Feb 25, 2022 at 12:47:34PM +0800, zhenyu.ren via lttng-dev wrote:
>> >> > Hi, lttng-dev team
>> >> > When lttng-sessiond exits, the ust applications should call
>> >> > lttng_ust_objd_table_owner_cleanup() and clean up all shm resources (unmap and
>> >> > close). However, I find that the ust applications keep "all" of the
>> >> > shm fds ("/dev/shm/ust-shm-consumer-81132 (deleted)") open and do NOT free the shm.
>> >> > If we run lttng-sessiond again, the ust applications get a new piece of shm and
>> >> > a new list of shm fds, so shm usage doubles. Then, if we kill lttng-sessiond,
>> >> > what most likely happens is that the ust applications close the new list of shm fds
>> >> > and free the new shm resources while still keeping the old shm.
>> >> > In other words, we cannot
>> >> > free this piece of shm unless we kill the ust applications!!!
>> >> > So is it possible that the ust applications failed to call
>> >> > lttng_ust_objd_table_owner_cleanup()? Have you ever seen this problem? Do you
>> >> > have any advice on how to free the shm without killing the ust applications (I tried to
>> >> > dig into kernel shm_open and /dev/shm, but did not find any ideas)?
>> >> > Thanks in advance
>> >> > zhenyu.ren
>> >> > ------------------------------------------------------------------
>> >> > From: zhenyu.ren via lttng-dev
>> >> > Sent: Wednesday, February 23, 2022 23:09
>> >> > To: lttng-dev
>> >> > Subject: [lttng-dev] Re: shm leak in traced application?
>> >> > >"I found these items also exist in a traced application which is a long-time
>> >> > >running daemon"
>> >> > Even if lttng-sessiond has been killed!!
>> >> > Thanks
>> >> > zhenyu.ren
>> >> > ------------------------------------------------------------------
>> >> > From: zhenyu.ren via lttng-dev
>> >> > Sent: Wednesday, February 23, 2022 22:44
>> >> > To: lttng-dev
>> >> > Subject: [lttng-dev] shm leak in traced application?
>> >> > Hi,
>> >> > There are many items such as "/dev/shm/ust-shm-consumer-81132 (deleted)"
>> >> > in the lttng-sessiond fd space. I know they are the result of shm_open() and
>> >> > shm_unlink() in create_posix_shm().
>> >> > However, today I found that these items also exist in a traced application, which is
>> >> > a long-running daemon. The most important thing I found is that there
>> >> > seems to be no reliable way to release the shared memory.
>> >> > I tried killing lttng-sessiond, but that does not always release the shared memory. Sometimes I
>> >> > need to kill the traced application to free the shared memory. But it is not a
>> >> > good idea to kill these applications.
>> >> > My questions are:
>> >> > 1. Is there any way to release the shared memory without killing any traced
>> >> > application?
>> >> > 2. Is it normal that many items such as "/dev/shm/ust-shm-consumer-81132
>> >> > (deleted)" exist in the traced application?
>> >> > Thanks
>> >> > zhenyu.ren
>> >> > _______________________________________________
>> >> > lttng-dev mailing list
>> >> > lttng-dev@lists.lttng.org
>> >> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>> >> --
>> >> Jonathan Rajotte-Julien
>> >> EfficiOS
>> > _______________________________________________
>> > lttng-dev mailing list
>> > lttng-dev@lists.lttng.org
>> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

--=_16bd5163-5882-4279-83e8-efe4a14e4cbe
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable
--=_16bd5163-5882-4279-83e8-efe4a14e4cbe--

--===============8356484169375359530==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

--===============8356484169375359530==--