From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3953FC433EF for ; Tue, 3 May 2022 04:29:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8AA5A10F868; Tue, 3 May 2022 04:29:13 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id E55C310F868 for ; Tue, 3 May 2022 04:29:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651552151; x=1683088151; h=date:message-id:from:to:cc:subject:in-reply-to: references:mime-version:content-transfer-encoding; bh=xndV76CUyD6/BAF9jkAkEvXoiE4IZMkSjzdoWT5p9LY=; b=Sg2h1G4dwBFpC4xjjqucWAQcLIp9Js2bYggCvzeYhWG23OXuO4+IaWbG 7sQ1tdUjz3WQtWREQeDv67muZS5oyz5OriH3FdJOUr6JssNvLdDr6OSe4 qNMVSwuayn5xjzRPMynFokkxyRbxKwkOV8EtatxOcpeN/mdRwYlf6cuqq Za0wEibl5nOQ7sCwA0pfR1Tsue0ebAFKU8NA65xjGhnP1jFaiinMwxSFt JMdHX0XBShFBs08stD7l+KDSJpYHF5cemcXjxzuRsa40jE9iOOHppnX16 I2GO46fYGw4vA3LFdZELfHHdSFlur6GmlsjEFBzIMSgSTKoqFiAho2fON g==; X-IronPort-AV: E=McAfee;i="6400,9594,10335"; a="247303625" X-IronPort-AV: E=Sophos;i="5.91,194,1647327600"; d="scan'208";a="247303625" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2022 21:29:11 -0700 X-IronPort-AV: E=Sophos;i="5.91,194,1647327600"; d="scan'208";a="536189510" Received: from adixit-mobl1.amr.corp.intel.com (HELO adixit-arch.intel.com) ([10.209.43.60]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2022 21:29:10 -0700 Date: Mon, 02 May 2022 21:29:10 -0700 Message-ID: <87sfpr44tl.wl-ashutosh.dixit@intel.com> From: "Dixit, Ashutosh" To: Andrzej Hajda In-Reply-To: References: <9ed5af1177ad08c7c2d9c5d9b32ab0154dbd950f.1650430271.git.ashutosh.dixit@intel.com> <1339a2be-5fd0-cf65-d361-06c60d938ce5@intel.com> <87levzag3a.wl-ashutosh.dixit@intel.com> <87ee1i5k58.wl-ashutosh.dixit@intel.com> <871qxg5xda.wl-ashutosh.dixit@intel.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.2 (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Subject: Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: intel-gfx@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On Sun, 01 May 2022 23:22:02 -0700, Andrzej Hajda wrote: > On 29.04.2022 06:25, Dixit, Ashutosh wrote: > > On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote: > >> See [1], it is quite old, so maybe it is not valid anymore, but I see = no > >> code proving sth has changed. > > Hi Andrzej, > > > > A lot has changed since that article from 2003 (for 2.5 kernel). For > > instance there is kernfs (as I mention above): > > > > https://lwn.net/Articles/571590/ > > > > A process having a sysfs file open today in my view will result in the > > following: > > * It will take a reference on kernfs_node (not on kobject as was the ca= se > > in kernel 2.5 in [1]) > > * An open file will prevent the module from being unloaded (not the ker= nel > > crashing as in 2.5 in [1]) > > Thats nice, but kernfs_node->priv still points to kobject so their > lifetimes are bounded. Yes, of course there has to be some connection between the kernfs and kobje= ct. > > So this is what I would expect with today's kernel. I am not seeing > > anything we've done here which violates anything in [1] or [2]. > >> Also current doc says also [2] similar things, especially: > >> "Once you registered your kobject via kobject_add(), you must never use > >> kfree() to free it directly" > > Correct, we are using kobject_put(), not kfree'ing the kobject. > > That I wouldn't agree. kobject_put is called, then the object in which > kobject is embedded is kfree'd somewhere later on driver removal, without > awareness of this kobject. According to your analysis it should have 0 > refs, but this is analysis of the current code, even if it is true now it > could change in the future. Yes but we cannot anticipate all changes which can happen in the future, (though we should handle and make any changes which we can anticipate at present). This is also basically what the kernel philosophy is, don't make unnecessary generalizations and try to handle unforseen situations which can happen in the future. Let me add some explanations about the patch before addressing your next point. 1. We are adding 'struct kobject sysfs_gt' to 'struct intel_gt'. We are adding the kobject directly, not pointer to kobject. This allows us to "reach" 'struct intel_gt' from the kobject using a simple container_of: see kobj_to_gt(). 2. Because the kobject is not kmalloc'd it cannot be kfree'd so the release method has to be empty (or NULL). 'struct intel_gt' is kmalloc'd separately elsewhere and memory for the kobject will be freed as part of intel_gt. 3. To provide a NULL or empty release method we need to provide a 'struct kobj_type kobj_gt_type' associated with the sysfs_gt kobject. This works nicely because we were anyway need one for .default_groups (we may add other attributes to 'id_groups' in the future). Note that the kobject is initialized and added to sysfs using kobject_init_and_add(). 4. The only reason for providing an empty release method rather than a NULL release method is the following pr_debug in kobject_cleanup(): if (t && !t->release) pr_debug("kobject: '%s' (%p): does not have a release() fun= ction, it is broken and must be fixed. See Documentation/core-api/kobject.r= st.\n", kobject_name(kobj), kobj); This statement could possibly be removed because the release method is not needed in the case I just described above, maybe I'll send a patch to suggest removing it. Though I think what they will say is that since NULL release methods are uncommon maybe just provide an empty release method when you need a NULL release method (which is what I have done in the patch). Also note that, as described below, there are several other cases in the kernel which either have NULL or an empty release methods. See below. > And IMO it is against docs[2]: > - "One important point cannot be overstated: every kobject must have a > release() method, and the kobject must persist (in a consistent state) > until that method is called. If these constraints are not met, the code is > flawed." - empty release method means clearly it is against the docs. > -"The end result is that a structure protected by a kobject cannot be fre= ed > before its reference count goes to zero. The reference count is not under > the direct control of the code which created the kobject.". > > So either docs and part of kobject code were not updated to reflect > changes you are assuming, either your assumption is incorrect. In my view the doc is a general introduction to kobjects and simplifies things. As shown below there are numerous examples in the kernel of both NULL and empty release methods. I just went with the empty method because of the reason mentioned above. > Looking at other users of kobject it seems they follow docs, their > release method either frees memory directly either kref_put on containing > struct, it was just quick scan so I could overlooked sth. > > >> [1]: https://lwn.net/Articles/36850/ > >> [2]: > >> https://elixir.bootlin.com/linux/v5.18-rc4/source/Documentation/core-a= pi/kobject.rst#L246 =46rom my comments above, the trick to finding NULL or empty methods is to search for the following construct: kobject_init_and_add(&xyz, ...) If we do this we find the following: NULL release method: procfs_queue_type integrity_ktype cppc_ktype acpi_hotplug_profile_ktype rnbd_dev_ktype ioat_ktype ab8500_fg_ktype Empty release method: blk_ia_range_sysfs_nop_release() I hope these examples should be sufficient to show that the release method can be both NULL or empty. So I still haven't found any reason to make changes to the v2 patch which I have previously shown works correctly and without issues. Thanks. -- Ashutosh