From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.0 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1340FC433ED for ; Thu, 29 Apr 2021 17:31:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 45E8B61456 for ; Thu, 29 Apr 2021 17:31:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 45E8B61456 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45286 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lcAVP-0006Rd-3F for qemu-devel@archiver.kernel.org; Thu, 29 Apr 2021 13:31:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35618) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lcASX-0004j8-Gf for qemu-devel@nongnu.org; Thu, 29 Apr 2021 13:28:17 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:49663) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lcASU-0000Ha-FF for qemu-devel@nongnu.org; Thu, 29 Apr 2021 13:28:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1619717293; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qEj9rTbRYL/4FuC3vNeTZtjh+C+YHb6HJW/7DPiM/h8=; b=RbDAuJdnLZ8EG/LsQ8YTLeS/Jdav2/6p31I9qipr5AL06TqB6H+wbgQj5mzh2iiVd+libS 7lAQn13O9COx4jQaXdFT24W9IXJs2Mu7MG3mMaEqcNtFf8UO0S9WhtqusmbpqhYu4kZdXl Zzehjm6s/S9+vf0tMxUGb5pJaRndHkk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-370-ZEuEL4v1PRi4B51YOxUsow-1; Thu, 29 Apr 2021 13:28:03 -0400 X-MC-Unique: ZEuEL4v1PRi4B51YOxUsow-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4B27F1922036; Thu, 29 Apr 2021 17:28:02 +0000 (UTC) Received: from redhat.com (ovpn-115-232.ams2.redhat.com [10.36.115.232]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9F89B6E407; Thu, 29 Apr 2021 17:27:51 +0000 (UTC) Date: Thu, 29 Apr 2021 18:27:48 +0100 From: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= To: "Richard W.M. Jones" Subject: Re: libnbd thread-local errors and dlclose (was: Re: [ANNOUNCE] libblkio v0.1.0 preview release) Message-ID: References: <20210429142259.GR26415@redhat.com> <20210429150038.GT26415@redhat.com> <20210429171732.GX26415@redhat.com> MIME-Version: 1.0 In-Reply-To: <20210429171732.GX26415@redhat.com> User-Agent: Mutt/2.0.6 (2021-03-06) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=berrange@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=170.10.133.124; envelope-from=berrange@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -29 X-Spam_score: -3.0 X-Spam_bar: --- X-Spam_report: (-3.0 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.22, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Daniel =?utf-8?B?UC4gQmVycmFuZ8Op?= Cc: Kevin Wolf , pkrempa@redhat.com, Alberto Garcia , slp@redhat.com, qemu-block@nongnu.org, Markus Armbruster , qemu-devel@nongnu.org, mreitz@redhat.com, Vladimir Sementsov-Ogievskiy , Stefan Hajnoczi , Klaus Jensen , philmd@redhat.com, sgarzare@redhat.com Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On Thu, Apr 29, 2021 at 06:17:32PM +0100, Richard W.M. Jones wrote: > On Thu, Apr 29, 2021 at 04:08:22PM +0100, Daniel P. Berrangé wrote: > > On Thu, Apr 29, 2021 at 04:00:38PM +0100, Richard W.M. Jones wrote: > > > On Thu, Apr 29, 2021 at 03:41:29PM +0100, Stefan Hajnoczi wrote: > > > > On Thu, Apr 29, 2021 at 03:22:59PM +0100, Richard W.M. Jones wrote: > > > > > libvirt originally, and now libnbd, keep a per-thread error message > > > > > (stored in thread-local storage). It's a lot nicer than having to > > > > > pass &errmsg to every function. You can just write: > > > > > > > > > > if (nbd_connect_tcp (nbd, "remote", "nbd") == -1) { > > > > > fprintf (stderr, > > > > > "failed to connect to remote server: %s (errno = %d)\n", > > > > > nbd_get_error (), nbd_get_errno ()); > > > > > exit (EXIT_FAILURE); > > > > > } > > > > > > > > > > (https://libguestfs.org/libnbd.3.html#ERROR-HANDLING) > > > > > > > > > > It means you can extend the range of error information available in > > > > > future. Also you can return a 'const char *' and the application > > > > > doesn't have to worry about lifetimes, at least in the common case. > > > > > > > > Thanks for sharing the idea, I think it would work well for libblkio > > > > too. > > > > > > > > Do you ignore the dlclose(3) memory leak? > > > > > > I believe this mechanism in libnbd ensures there is no leak in the > > > ordinary shared library (not dlopen/dlclose) case: > > > > > > https://gitlab.com/nbdkit/libnbd/-/blob/f9257a9fdc68706a4079deb4ced61e1d468f28d6/lib/errors.c#L35 > > > > > > However I'm not sure what happens in the dlopen case, so I'm going to > > > test that out now ... > > > > pthread_key destructors are a disaster waiting to happen with > > dlclose, if the dlclose happens while non-main threads are > > still running. When those threads exit, they'll run the > > destructor which points to a function that is no longer > > resident in memory. > > While libnbd had a memory leak there, now fixed by: > > https://gitlab.com/nbdkit/libnbd/-/commit/026d281c57dd95485cc9bf829918b5efd9e32ddb So on dlclose(), the errors_free() method will be run by the linker. The pthread_key_delete method doesn't run the cleanup callbacks, so your fix here frees the thread local error string in the thread that called dlclose explicitly. If any other thread had an error string associated with the thread local, that will still be leaked. I don't see any attractive way to avoid that leak. IOW, I think the fix is incomplete, but its better than nothing, and I don't have suggestion to improve it. > I don't actually think we have the bug you're describing. We have a > destructor (errors_free()) which runs on dlclose, which deletes the > thread-local storage key, so the destructor will not run after the > library has been unloaded. > > I added a test for this which works fine for me: > > https://gitlab.com/nbdkit/libnbd/-/commit/831d142787aba4f5b638418e02cf7e0f2a051565 Yes, I think your using of ELF destructor function is enough in this scenario. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|