From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 9 Mar 2022 12:07:11 -0500 (EST)
To: "zhenyu.ren"
Message-ID: <1773574115.134463.1646845631405.JavaMail.zimbra@efficios.com>
In-Reply-To: <816104861.134170.1646843822208.JavaMail.zimbra@efficios.com>
References: <20220225142111.GC1861057@x>
 <26341add-b962-4027-8c5e-28d940e8f4dc.zhenyu.ren@aliyun.com>
 <2119663162.129405.1646752637020.JavaMail.zimbra@efficios.com>
 <1a87b3ee-9983-4db6-b569-e6e6c1ab8411.zhenyu.ren@aliyun.com>
 <816104861.134170.1646843822208.JavaMail.zimbra@efficios.com>
MIME-Version: 1.0
Subject: Re: [lttng-dev] Re: Re: Re: Re: shm leak in traced application?
List-Id: LTTng development list
From: Mathieu Desnoyers via lttng-dev
Reply-To: Mathieu Desnoyers
Cc: lttng-dev
Content-Type: multipart/mixed; boundary="===============5146134758815724167=="

--===============5146134758815724167==
Content-Type: multipart/alternative; boundary="=_33316504-2f92-49a7-a1f5-21db56df6daf"

--=_33316504-2f92-49a7-a1f5-21db56df6daf
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi Zhenyu,

Can you try this fix, please?

https://review.lttng.org/c/lttng-ust/+/7530

And let me know how it goes.

Thanks,

Mathieu

----- On Mar 9, 2022, at 11:37 AM, Mathieu Desnoyers wrote:

> When this happens, is the process holding a single (or very few) shm file
> references, or references to many shm files?
>
> I wonder if you end up in a scenario where an application very frequently
> performs exec(), and therefore sometimes the exec() will happen in the
> window between the unix socket file descriptor reception and the call to
> fcntl FD_CLOEXEC.
>
> Thanks,
> Mathieu
>
> ----- On Mar 8, 2022, at 8:29 PM, zhenyu.ren wrote:
>
>> Thanks a lot for the reply. I did not reply in the bug tracker since I
>> have not yet found a reliable way to reproduce the leak.
>>
>>> ------------------------------------------------------------------
>>> From: Mathieu Desnoyers
>>> Sent: Tuesday, March 8, 2022, 23:26
>>> To: zhenyu.ren
>>> Cc: Jonathan Rajotte; lttng-dev
>>> Subject: Re: [lttng-dev] Re: Re: Re: shm leak in traced application?
>>>
>>> ----- On Mar 8, 2022, at 12:18 AM, lttng-dev lttng-dev@lists.lttng.org wrote:
>>>
>>> > Hi,
>>> > In shm_object_table_append_shm()/alloc_shm(), why not call fcntl()
>>> > FD_CLOEXEC on the shm fds? I guess this omission leads to the shm
>>> > fd leak.
>>>
>>> Those file descriptors are created when received by
>>> ustcomm_recv_fds_unix_sock, and immediately after creation they are
>>> set as FD_CLOEXEC.
>>>
>>> We should continue this discussion in the bug tracker as suggested by
>>> Jonathan. It would greatly help if you can provide a small reproducer.
>>>
>>> Thanks,
>>> Mathieu
>>>
>>> > Thanks
>>> > zhenyu.ren
>>> >
>>> >> ------------------------------------------------------------------
>>> >> From: Jonathan Rajotte-Julien
>>> >> Sent: Friday, February 25, 2022, 22:31
>>> >> To: zhenyu.ren
>>> >> Cc: lttng-dev
>>> >> Subject: Re: [lttng-dev] Re: Re: shm leak in traced application?
>>> >>
>>> >> Hi zhenyu.ren,
>>> >>
>>> >> Please open a bug on our bug tracker and provide a reproducer
>>> >> against the latest stable version (2.13.x).
>>> >>
>>> >> https://bugs.lttng.org/
>>> >>
>>> >> Please follow the guidelines:
>>> >> https://bugs.lttng.org/#Bug-reporting-guidelines
>>> >>
>>> >> Cheers
>>> >>
>>> >> On Fri, Feb 25, 2022 at 12:47:34PM +0800, zhenyu.ren via lttng-dev wrote:
>>> >> > Hi, lttng-dev team
>>> >> > When lttng-sessiond exits, the ust applications should call
>>> >> > lttng_ust_objd_table_owner_cleanup() and clean up all shm
>>> >> > resources (unmap and close). However, I do find that the ust
>>> >> > applications keep "all" of the shm fds
>>> >> > ("/dev/shm/ust-shm-consumer-81132 (deleted)") open and do NOT
>>> >> > free the shm.
>>> >> > If we run lttng-sessiond again, the ust applications get a new
>>> >> > piece of shm and a new list of shm fds, so shm usage doubles.
>>> >> > Then, if we kill lttng-sessiond, what most likely happens is that
>>> >> > the ust applications close the new list of shm fds and free the
>>> >> > new shm resources, but still keep the old shm. In other words, we
>>> >> > cannot free this piece of shm unless we kill the ust
>>> >> > applications!!!
>>> >> > So is it possible that the ust applications failed to call
>>> >> > lttng_ust_objd_table_owner_cleanup()? Have you ever seen this
>>> >> > problem? Do you have any advice on how to free the shm without
>>> >> > killing the ust applications (I tried to dig into the kernel
>>> >> > shm_open and /dev/shm, but did not find any ideas)?
>>> >> >
>>> >> > Thanks in advance
>>> >> > zhenyu.ren
>>> >> > ------------------------------------------------------------------
>>> >> > From: zhenyu.ren via lttng-dev
>>> >> > Sent: Wednesday, February 23, 2022, 23:09
>>> >> > To: lttng-dev
>>> >> > Subject: [lttng-dev] Re: shm leak in traced application?
>>> >> >
>>> >> > >"I found these items also exist in a traced application which
>>> >> > >is a long-time running daemon"
>>> >> > Even if lttng-sessiond has been killed!!
>>> >> >
>>> >> > Thanks
>>> >> > zhenyu.ren
>>> >> > ------------------------------------------------------------------
>>> >> > From: zhenyu.ren via lttng-dev
>>> >> > Sent: Wednesday, February 23, 2022, 22:44
>>> >> > To: lttng-dev
>>> >> > Subject: [lttng-dev] shm leak in traced application?
>>> >> >
>>> >> > Hi,
>>> >> > There are many items such as "/dev/shm/ust-shm-consumer-81132
>>> >> > (deleted)" in the lttng-sessiond fd space. I know they are the
>>> >> > result of shm_open() and shm_unlink() in create_posix_shm().
>>> >> > However, today I found that these items also exist in a traced
>>> >> > application which is a long-running daemon. The most important
>>> >> > thing I found is that there seems to be no reliable way to
>>> >> > release the shared memory.
>>> >> > I tried killing lttng-sessiond, but that does not always release
>>> >> > the shared memory. Sometimes I need to kill the traced
>>> >> > application to free the shared memory... But it is not a good
>>> >> > idea to kill these applications.
>>> >> > My questions are:
>>> >> > 1. Is there any way to release the shared memory without killing
>>> >> > any traced application?
>>> >> > 2. Is it normal that many items such as
>>> >> > "/dev/shm/ust-shm-consumer-81132 (deleted)" exist in the traced
>>> >> > application?
>>> >> >
>>> >> > Thanks
>>> >> > zhenyu.ren
>>> >> > _______________________________________________
>>> >> > lttng-dev mailing list
>>> >> > lttng-dev@lists.lttng.org
>>> >> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>> >>
>>> >> --
>>> >> Jonathan Rajotte-Julien
>>> >> EfficiOS
>>> > _______________________________________________
>>> > lttng-dev mailing list
>>> > lttng-dev@lists.lttng.org
>>> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>>
>>> --
>>> Mathieu Desnoyers
>>> EfficiOS Inc.
>>> http://www.efficios.com
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

--=_33316504-2f92-49a7-a1f5-21db56df6daf
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable
--=_33316504-2f92-49a7-a1f5-21db56df6daf--

--===============5146134758815724167==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

--===============5146134758815724167==--
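
The window discussed in this thread (an exec() landing between the reception of a file descriptor over the unix socket and the later fcntl FD_CLOEXEC call) can be illustrated with a small C sketch. On Linux, recvmsg() accepts the MSG_CMSG_CLOEXEC flag, which sets close-on-exec on received descriptors as part of the reception itself, so there is no gap for an exec() to fall into. This is only an illustration under assumptions: the helpers send_fd(), recv_fd() and cloexec_set_after_recv() are hypothetical names, this is not lttng-ust code, and it is not meant to describe what the change under review actually does.

```c
#define _GNU_SOURCE /* MSG_CMSG_CLOEXEC is Linux-specific */
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one fd over a connected unix socket (SCM_RIGHTS). */
static int send_fd(int sock, int fd)
{
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd; recv_flags is handed to recvmsg(). */
static int recv_fd(int sock, int recv_flags)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, recv_flags) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Pass a pipe end across a socketpair and report whether FD_CLOEXEC is
 * already set on the received descriptor, before any fcntl() call.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
int cloexec_set_after_recv(int recv_flags)
{
    int sv[2], p[2], fd, flags, ret = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(p) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (send_fd(sv[0], p[0]) == 0) {
        fd = recv_fd(sv[1], recv_flags);
        if (fd >= 0) {
            /* With recv_flags == 0, an exec() here would inherit fd. */
            flags = fcntl(fd, F_GETFD);
            if (flags >= 0)
                ret = (flags & FD_CLOEXEC) ? 1 : 0;
            close(fd);
        }
    }
    close(p[0]);
    close(p[1]);
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

Called from a small main(), cloexec_set_after_recv(0) is expected to return 0 (the descriptor arrives without close-on-exec, so an exec() before the fcntl() call would leak it), while cloexec_set_after_recv(MSG_CMSG_CLOEXEC) is expected to return 1.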