From: Qian Cai <cai@gmx.us>
To: Waiman Long <longman@redhat.com>, Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Yang Shi <yang.shi@linux.alibaba.com>,
arnd@arndb.de, linux kernel <linux-kernel@vger.kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: [PATCH v4] debugobjects: scale the static pool size
Date: Sun, 25 Nov 2018 23:52:24 -0500 [thread overview]
Message-ID: <5abb31e1-b5f2-718d-3a48-b0d8a73d6e5c@gmx.us> (raw)
In-Reply-To: <473f6a6e-1a14-d07c-b0f0-4d96e3232d1a@redhat.com>
On 11/25/18 8:31 PM, Waiman Long wrote:
> On 11/25/2018 03:42 PM, Qian Cai wrote:
>>
>>
>> On 11/23/18 10:01 PM, Qian Cai wrote:
>>>
>>>
>>>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <tglx@linutronix.de>
>>>> wrote:
>>>>
>>>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>>>
>>>> Looking deeper at that.
>>>>
>>>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>>>> index 70935ed91125..140571aa483c 100644
>>>>> --- a/lib/debugobjects.c
>>>>> +++ b/lib/debugobjects.c
>>>>> @@ -23,9 +23,81 @@
>>>>> #define ODEBUG_HASH_BITS 14
>>>>> #define ODEBUG_HASH_SIZE (1 << ODEBUG_HASH_BITS)
>>>>>
>>>>> -#define ODEBUG_POOL_SIZE 1024
>>>>> +#define ODEBUG_DEFAULT_POOL 512
>>>>> #define ODEBUG_POOL_MIN_LEVEL 256
>>>>>
>>>>> +/*
>>>>> + * Some debug objects are allocated during the early boot.
>>>>> Enabling some options
>>>>> + * like timers or workqueue objects may increase the size required
>>>>> significantly
>>>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>>>> + *
>>>>> + * No. CPUs x 2 (worker pool) objects:
>>>>> + *
>>>>> + * start_kernel
>>>>> + * workqueue_init_early
>>>>> + * init_worker_pool
>>>>> + * init_timer_key
>>>>> + * debug_object_init
>>>>> + *
>>>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>>>> + *
>>>>> + * sched_init
>>>>> + * hrtick_rq_init
>>>>> + * hrtimer_init
>>>>> + *
>>>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>>>> + * No. CPUs x 6 (workqueue) objects:
>>>>> + *
>>>>> + * workqueue_init_early
>>>>> + * alloc_workqueue
>>>>> + * __alloc_workqueue_key
>>>>> + * alloc_and_link_pwqs
>>>>> + * init_pwq
>>>>> + *
>>>>> + * Also, plus No. CPUs objects:
>>>>> + *
>>>>> + * perf_event_init
>>>>> + * __init_srcu_struct
>>>>> + * init_srcu_struct_fields
>>>>> + * init_srcu_struct_nodes
>>>>> + * __init_work
>>>>
>>>> None of the things are actually used or required _BEFORE_
>>>> debug_objects_mem_init() is invoked.
>>>>
>>>> The reason why the call is at this place in start_kernel() is
>>>> historical. It's because back in the days when debugobjects were
>>>> added the
>>>> memory allocator was enabled way later than today. So we can just
>>>> move the
>>>> debug_objects_mem_init() call right before sched_init() I think.
>>>
>>> Well, now that kmemleak_init() seems complains that
>>> debug_objects_mem_init()
>>> is called before it.
>>>
>>> [ 0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the
>>> object search tree (overlaps existing)
>>> [ 0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
>>> [ 0.078883] Call Trace:
>>> [ 0.078904] [c000000001c8fcd0] [c000000000c96b34]
>>> dump_stack+0xe8/0x164 (unreliable)
>>> [ 0.078935] [c000000001c8fd20] [c000000000486e84]
>>> create_object+0x344/0x380
>>> [ 0.078962] [c000000001c8fde0] [c000000000489544]
>>> early_alloc+0x108/0x1f8
>>> [ 0.078989] [c000000001c8fe20] [c00000000109738c]
>>> kmemleak_init+0x1d8/0x3d4
>>> [ 0.079016] [c000000001c8ff00] [c000000001054028]
>>> start_kernel+0x5c0/0x6f8
>>> [ 0.079043] [c000000001c8ff90] [c00000000000ae7c]
>>> start_here_common+0x1c/0x520
>>> [ 0.079070] kmemleak: Kernel memory leak detector disabled
>>> [ 0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
>>> [ 0.079112] kmemleak: comm "swapper/0", pid 0, jiffies 4294937299
>>> [ 0.079135] kmemleak: min_count = -1
>>> [ 0.079153] kmemleak: count = 0
>>> [ 0.079170] kmemleak: flags = 0x5
>>> [ 0.079188] kmemleak: checksum = 0
>>> [ 0.079206] kmemleak: backtrace:
>>> [ 0.079227] __debug_object_init+0x688/0x700
>>> [ 0.079250] debug_object_activate+0x1e0/0x350
>>> [ 0.079272] __call_rcu+0x60/0x430
>>> [ 0.079292] put_object+0x60/0x80
>>> [ 0.079311] kmemleak_init+0x2cc/0x3d4
>>> [ 0.079331] start_kernel+0x5c0/0x6f8
>>> [ 0.079351] start_here_common+0x1c/0x520
>>> [ 0.079380] kmemleak: Early log backtrace:
>>> [ 0.079399] memblock_alloc_try_nid_raw+0x90/0xcc
>>> [ 0.079421] sparse_init_nid+0x144/0x51c
>>> [ 0.079440] sparse_init+0x1a0/0x238
>>> [ 0.079459] initmem_init+0x1d8/0x25c
>>> [ 0.079498] setup_arch+0x3e0/0x464
>>> [ 0.079517] start_kernel+0xa4/0x6f8
>>> [ 0.079536] start_here_common+0x1c/0x520
>>>
>>
>> So this is an chicken-egg problem. Debug objects need kmemleak_init()
>> first, so it can make use of kmemleak_ignore() for all debug objects
>> in order to avoid the overlapping like the above.
>>
>> while (obj_pool_free < debug_objects_pool_min_level) {
>>
>> new = kmem_cache_zalloc(obj_cache, gfp);
>> if (!new)
>> return;
>>
>> kmemleak_ignore(new);
>>
>> However, there seems no way to move kmemleak_init() together this
>> early in start_kernel() just before vmalloc_init() [1] because it
>> looks like it depends on things like workqueue
>> (schedule_work(&cleanup_work)) and rcu. Hence, it needs to be after
>> workqueue_init_early() and rcu_init()
>>
>> Given that, maybe the best outcome is to stick to the alternative
>> approach that works [1] rather messing up with the order of
>> debug_objects_mem_init() in start_kernel() which seems tricky. What do
>> you think?
>>
>> [1] https://goo.gl/18N78g
>> [2] https://goo.gl/My6ig6
>
> Could you move kmemleak_init() and debug_objects_mem_init() as far up as
> possible, like before the hrtimer_init() to at least make static count
> calculation as simple as possible?
>
Well, there is only 2 x NR_CPUS difference after moved both calls just after
rcu_init().
Before After
64-CPU: 1114 974
160-CPU: 2774 2429
256-CPU: 3853 4378
I suppose it is possible that the timers only need the scale factor 5 instead of
10. However, it needs to be retested for all the configurations to be sure, and
likely need to remove all irqs calls in kmemleak_init() and subroutines because
it is now called with irq disabled. Given the initdata will be freed anyway,
does it really worth doing?
BTW, calling debug_objects_mem_init() before kmemleak_init() actually could
trigger a loop on machines with 160+ CPUs until the pool is filled up,
debug_objects_pool_min_level += num_possible_cpus() * 4;
[1] while (obj_pool_free < debug_objects_pool_min_level)
kmemleak_init
kmemleak_ignore (from replaced static debug objects)
make_black_object
put_object
__call_rcu (kernel/rcu/tree.c)
debug_rcu_head_queue
debug_object_activate
debug_object_init
fill_pool
kmemleak_ignore (looping in [1])
make_black_object
...
I think until this is resolved, there is no way to move debug_objects_mem_init()
before kmemleak_init().
next prev parent reply other threads:[~2018-11-26 4:54 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-20 20:14 [PATCH v2] debugobjects: scale the static pool size Qian Cai
2018-11-20 20:54 ` Waiman Long
2018-11-20 20:56 ` Thomas Gleixner
2018-11-20 23:28 ` [PATCH v3] " Qian Cai
2018-11-20 23:38 ` Waiman Long
2018-11-20 23:54 ` Qian Cai
2018-11-20 23:58 ` Joe Perches
2018-11-21 2:11 ` [PATCH v4] " Qian Cai
2018-11-21 14:45 ` Waiman Long
2018-11-22 21:56 ` Thomas Gleixner
2018-11-23 4:31 ` [PATCH] debugobjects: call debug_objects_mem_init eariler Qian Cai
2018-11-23 4:59 ` Waiman Long
2018-11-23 21:46 ` Thomas Gleixner
2018-11-24 2:54 ` Qian Cai
2018-11-24 3:01 ` [PATCH v4] debugobjects: scale the static pool size Qian Cai
2018-11-25 20:42 ` Qian Cai
2018-11-26 1:31 ` Waiman Long
2018-11-26 4:52 ` Qian Cai [this message]
2018-11-26 9:45 ` Qian Cai
2018-11-26 10:50 ` Catalin Marinas
2018-11-25 20:48 ` Qian Cai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5abb31e1-b5f2-718d-3a48-b0d8a73d6e5c@gmx.us \
--to=cai@gmx.us \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=catalin.marinas@arm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=tglx@linutronix.de \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.