From: Frank Rowand <frowand.list@gmail.com>
To: Rob Herring <robh+dt@kernel.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
devicetree@vger.kernel.org,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC] Efficiency of the phandle_cache on ppc64/SLOF
Date: Tue, 10 Dec 2019 06:46:08 -0600 [thread overview]
Message-ID: <174936b9-bd2c-8850-aa8f-f4c605b0f1d5@gmail.com> (raw)
In-Reply-To: <c2334575-fa38-eb73-bb56-21a530e773bf@gmail.com>
On 12/10/19 2:17 AM, Frank Rowand wrote:
> On 12/9/19 7:51 PM, Rob Herring wrote:
>> On Mon, Dec 9, 2019 at 7:35 AM Sebastian Andrzej Siewior
>> <bigeasy@linutronix.de> wrote:
>>>
>>> On 2019-12-05 20:01:41 [-0600], Frank Rowand wrote:
>>>> Is there a memory usage issue for the systems that led to this thread?
>>>
>>> No, no memory issue led to this thread. I was just testing my patch and
>>> I assumed that I did something wrong in the counting/lock drop/lock
>>> acquire/allocate path because the array was hardly used. So I started to
>>> look deeper…
>>> Once I figured out everything was fine, I was curious if everyone is
>>> aware of the different phandle creation by dtc vs POWER. And I posted
>>> the mail in the thread.
>>> Once you confirmed that everything is "known / not an issue" I was ready
>>> to take off [0].
>>>
>>> Later more replies came in such as one mail [1] from Rob describing the
>>> original reason with 814 phandles. _Here_ I was just surprised that 1024
>>> were used over 64 entries for a benefit of 60ms. I understand that this
>>> is low concern for you because that memory is released if modules are
>>> not enabled. I usually see that module support is left enabled.
>>>
>>> However, Rob suggested / asked about the fixed size array (this is how I
>>> understood it):
>>> |And yes, as mentioned earlier I don't like the complexity. I didn't
>>> |from the start and I'm I'm still of the opinion we should have a
>>> |fixed or 1 time sized true cache (i.e. smaller than total # of
>>> |phandles). That would solve the RT memory allocation and locking issue
>>> |too.
>>>
>>> so I attempted to ask if we should have the fixed size array maybe
>>> with the hash_32() instead the mask. This would make my other patch
>>> obsolete because the fixed size array should not have a RT issue. The
>>> hash_32() part here would address the POWER issue where the cache is
>>> currently not used efficiently.
>>>
>>> If you want instead to keep things as-is then this is okay from my side.
>>> If you want to keep this cache off on POWER then I could contribute a
>>> patch doing so.
>>
>> It turns out there's actually a bug in the current implementation. If
>> we have multiple phandles with the same mask, then we leak node
>> references if we miss in the cache and re-assign the cache entry.
>
> Aaargh. Patch sent.
>
>> Easily fixed I suppose, but holding a ref count for a cached entry
>> seems wrong. That means we never have a ref count of 0 on every node
>> with a phandle.
>
> It will go to zero when the cache is freed.
>
> My memory is that we free the cache as part of removing an overlay. I'll
> verify whether my memory is correct.
And I'll look at non-overlay use of dynamic devicetree too.
-Frank
>
> -Frank
>
>
>>
>> I've done some more experiments with the performance. I've come to the
>> conclusion that just measuring boot time is not a good way at least if
>> the time is not a significant percentage of the total. I couldn't get
>> any measurable data. I'm using a RK3399 Rock960 board. It has 190
>> phandles. I get about 1500 calls to of_find_node_by_phandle() during
>> boot. Note that about the first 300 are before we have any timekeeping
>> (the prior measurements also would not account for this). Those calls
>> have no cache in the current implementation and are cached in my
>> implementation.
>>
>> no cache: 20124 us
>> current cache: 819 us
>>
>> new cache (64 entry): 4342 us
>> new cache (128 entry): 2875 us
>> new cache (256 entry): 1235 us
>>
>> To get some idea on the time spent before timekeeping is up, if we
>> take the avg miss time is ~17us (20124/1200), then we're spending
>> about ~5ms on lookups before the cache is enabled. I'd estimate the
>> new cache takes ~400us before timekeeping is up as there's 11 misses
>> early.
>>
>> >From these numbers, it seems the miss rate has a significant impact on
>> performance for the different sizes. But taken from the original 20+
>> ms, they all look like good improvement.
>>
>> Rob
>>
>
>
next prev parent reply other threads:[~2019-12-10 12:46 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-29 15:10 [RFC] Efficiency of the phandle_cache on ppc64/SLOF Sebastian Andrzej Siewior
2019-11-30 2:14 ` Frank Rowand
2019-12-02 11:07 ` Sebastian Andrzej Siewior
2019-12-03 4:12 ` Michael Ellerman
2019-12-03 4:28 ` Frank Rowand
2019-12-03 16:56 ` Rob Herring
2019-12-05 16:35 ` Sebastian Andrzej Siewior
2019-12-06 2:01 ` Frank Rowand
2019-12-09 13:35 ` Sebastian Andrzej Siewior
2019-12-10 1:51 ` Rob Herring
2019-12-10 8:17 ` Frank Rowand
2019-12-10 12:46 ` Frank Rowand [this message]
2019-12-11 14:42 ` Rob Herring
2019-12-06 1:52 ` Frank Rowand
2019-12-08 6:59 ` Frank Rowand
2019-12-03 4:03 ` Michael Ellerman
2019-12-03 18:35 ` Segher Boessenkool
2019-12-06 1:37 ` Frank Rowand
2019-12-06 23:40 ` Segher Boessenkool
2019-12-08 4:30 ` Frank Rowand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=174936b9-bd2c-8850-aa8f-f4c605b0f1d5@gmail.com \
--to=frowand.list@gmail.com \
--cc=benh@kernel.crashing.org \
--cc=bigeasy@linutronix.de \
--cc=devicetree@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=paulus@samba.org \
--cc=robh+dt@kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).