linux-kernel.vger.kernel.org archive mirror
From: Marc Gonzalez <marc.w.gonzalez@free.fr>
To: Robin Murphy <robin.murphy@arm.com>,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>,
	Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>,
	Stephen Boyd <sboyd@kernel.org>,
	Michael Turquette <mturquette@baylibre.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Sudip Mukherjee <sudipm.mukherjee@gmail.com>,
	Russell King <rmk+kernel@armlinux.org.uk>,
	Guenter Roeck <linux@roeck-us.net>,
	linux-clk <linux-clk@vger.kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v1] clk: Convert managed get functions to devm_add_action API
Date: Wed, 11 Dec 2019 17:17:28 +0100
Message-ID: <ba630966-5479-c831-d0e2-bc2eb12bc317@free.fr>
In-Reply-To: <c0ccca86-b7b1-b587-60c1-4794376fa789@arm.com>

On 02/12/2019 14:51, Robin Murphy wrote:

> On 02/12/2019 9:25 am, Marc Gonzalez wrote:
>
>> On 02/12/2019 02:42, Dmitry Torokhov wrote:
>>
>>> On Thu, Nov 28, 2019 at 10:56:30AM -0800, Bjorn Andersson wrote:
>>>
>>>> On Tue 26 Nov 08:13 PST 2019, Marc Gonzalez wrote:
>>>>
>>>>> Date: Tue, 26 Nov 2019 13:56:53 +0100
>>>>>
>>>>> Using devm_add_action_or_reset() produces simpler code and smaller
>>>>> object size:
>>>>>
>>>>> 1 file changed, 16 insertions(+), 46 deletions(-)
>>>>>
>>>>>      text	   data	    bss	    dec	    hex	filename
>>>>> -   1797	     80	      0	   1877	    755	drivers/clk/clk-devres.o
>>>>> +   1499	     56	      0	   1555	    613	drivers/clk/clk-devres.o
>>>>>
>>>>> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
>>>>
>>>> Looks neat
>>>>
>>>> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
>>>
>>> This however increases the runtime costs as each custom action cost us
>>> an extra pointer. Given that in a system we likely have many clocks
>>> managed by devres, I am not sure that this code savings is actually
>>> gives us overall win. It might still, I just want to understand how we
>>> are allocating/packing devres structures.
>>
>> I'm not 100% sure what you are saying.
> 
> You reduce the text size by a constant amount, at the cost of allocating 
> twice as much runtime data per clock (struct action_devres  vs. void*). 
> Assuming 64-bit pointers, that means that in principle your ~320-byte 
> saving would be cancelled out at ~40 managed clocks. However, that's 
> also assuming that the minimum allocation granularity is no larger than 
> a single pointer, which generally isn't true, so in reality it depends 
> on whether the difference in data pushes the total struct devres 
> allocation over the next ARCH_KMALLOC_MINALIGN boundary - if it doesn't, 
> the difference comes entirely for free; if it does, the memory cost 
> tradeoff gets even worse.

Aaah... memory overhead. Thanks for pointing it out.
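
To put rough numbers on the break-even point, here is a minimal user-space
sketch of the arithmetic. It assumes struct action_devres is just one data
pointer plus one callback pointer, and it ignores allocator rounding
entirely; the *_model name and both helpers are made up for illustration.

```c
#include <stddef.h>

/*
 * User-space stand-in (assumption) for the kernel's struct action_devres:
 * one data pointer plus one callback pointer.
 */
struct action_devres_model {
	void *data;
	void (*action)(void *);
};

/* Extra payload per managed clock: action_devres vs. a bare pointer. */
static size_t extra_bytes_per_clock(void)
{
	return sizeof(struct action_devres_model) - sizeof(void *);
}

/* Number of managed clocks at which a one-off saving is used up. */
static size_t break_even_clocks(size_t bytes_saved)
{
	return bytes_saved / extra_bytes_per_clock();
}
```

With 64-bit pointers the extra payload is 8 bytes per clock, so calling
break_even_clocks(1877 - 1555) with the "dec" totals from the size output
lands at roughly the 40 clocks mentioned above.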

BEFORE

devm_clk_get()
  -> devres_alloc(devm_clk_release, sizeof(*ptr), GFP_KERNEL);
     allocates space for a struct devres + a pointer

struct devres {
	struct devres_node		node;
	/*
	 * Some archs want to perform DMA into kmalloc caches
	 * and need a guaranteed alignment larger than
	 * the alignment of a 64-bit integer.
	 * Thus we use ARCH_KMALLOC_MINALIGN here and get exactly the same
	 * buffer alignment as if it was allocated by plain kmalloc().
	 */
	u8 __aligned(ARCH_KMALLOC_MINALIGN) data[];
};

I'm not sure what it means for a flexible array member to be X-aligned,
since the member's address depends on the start address of the allocation,
which is only determined at run time.

For example, on arm64, ARCH_KMALLOC_MINALIGN appears to be 128 in some
configurations, via ARCH_DMA_MINALIGN:

/*
 * Memory returned by kmalloc() may be used for DMA, so we must make
 * sure that all such allocations are cache aligned. Otherwise,
 * unrelated code may cause parts of the buffer to be read into the
 * cache before the transfer is done, causing old data to be seen by
 * the CPU.
 */
#define ARCH_DMA_MINALIGN	(128)


Unless the same strict alignment is also imposed on what kmalloc() returns?

So basically, a struct devres starts on a multiple-of-128 address:
first the devres_node member, then padding up to the next multiple of 128,
then the data member?
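
One way to probe the layout question is a user-space model. The sketch
below assumes GCC or Clang (for the aligned attribute), assumes a minimum
alignment of 128 as in the arm64 snippet, and replaces devres_node with a
stand-in of two list pointers plus one release callback. All the *_model
names are made up for illustration, and the size-class rounding is only a
crude guess at kmalloc behaviour when KMALLOC_MIN_SIZE is 128.

```c
#include <stddef.h>

#define MODEL_MINALIGN 128	/* assumption: arm64's ARCH_DMA_MINALIGN */

/* Stand-in for struct list_head: two object pointers. */
struct list_head_model {
	struct list_head_model *next, *prev;
};

/* Stand-in for struct devres_node: list linkage plus a release callback. */
struct devres_node_model {
	struct list_head_model entry;
	void (*release)(void *dev, void *res);
};

/* Same shape as struct devres: node, then an over-aligned flexible array. */
struct devres_model {
	struct devres_node_model node;
	unsigned char __attribute__((aligned(MODEL_MINALIGN))) data[];
};

/* Bytes requested from the allocator for a given payload size. */
static size_t devres_alloc_size(size_t payload)
{
	return sizeof(struct devres_model) + payload;
}

/* Crude model (assumption): power-of-two size classes starting at 128. */
static size_t kmalloc_size_model(size_t n)
{
	size_t s = MODEL_MINALIGN;

	while (s < n)
		s <<= 1;
	return s;
}
```

In this model the attribute pads the struct so that data sits at offset
128, so data stays aligned whenever the start address is itself a multiple
of 128. With 64-bit pointers, a one-pointer payload (136 bytes requested)
and a two-pointer payload (144 bytes requested) round up to the same size
under the assumed size classes.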


/*
 * Some archs want to perform DMA into kmalloc caches and need a guaranteed
 * alignment larger than the alignment of a 64-bit integer.
 * Setting ARCH_KMALLOC_MINALIGN in arch headers allows that.
 */
#if defined(ARCH_DMA_MINALIGN) && ARCH_DMA_MINALIGN > 8
#define ARCH_KMALLOC_MINALIGN ARCH_DMA_MINALIGN
#define KMALLOC_MIN_SIZE ARCH_DMA_MINALIGN
#define KMALLOC_SHIFT_LOW ilog2(ARCH_DMA_MINALIGN)
#else
#define ARCH_KMALLOC_MINALIGN __alignof__(unsigned long long)
#endif


A devres_node boils down to 2 object pointers + 1 function pointer.

Are there architectures supported by Linux where a function pointer
is not the same size as an object pointer? (ia64 maybe?)
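
For any given target, the question can at least be checked directly. This
is a small sketch, not kernel code; the helper and the struct name are
hypothetical, and the struct only mirrors the "2 object pointers + 1
function pointer" description above.

```c
#include <stddef.h>

/* Stand-in for a devres_node-like layout: two links and one callback. */
struct node_model {
	void *next;
	void *prev;
	void (*release)(void *dev, void *res);
};

/* Returns 1 if object and function pointers have the same size here. */
static int pointer_sizes_match(void)
{
	return sizeof(void *) == sizeof(void (*)(void));
}
```

Where pointer_sizes_match() returns 1, sizeof(struct node_model) should
come out as three pointers with no padding.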



OK, I will give this patch some more thought.

But I need to ask: what is the rationale for the devm_add_action API?

Regards.
