From: Mike Galbraith <efault@gmx.de>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Pekka Enberg <penberg@kernel.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Thomas Gleixner <tglx@linutronix.de>,
Mel Gorman <mgorman@techsingularity.net>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Jann Horn <jannh@google.com>
Subject: Re: [RFC v2 00/34] SLUB: reduce irq disabled scope and make it RT compatible
Date: Sat, 03 Jul 2021 17:47:01 +0200 [thread overview]
Message-ID: <ca0474137c1e5a16a1215693298c9cd93218e24c.camel@gmx.de> (raw)
In-Reply-To: <891dc24e38106f8542f4c72831d52dc1a1863ae8.camel@gmx.de>
On Sat, 2021-07-03 at 09:24 +0200, Mike Galbraith wrote:
>
> It also appears to be saying that there's something RT specific to
> stare at in addition to the list_lock business.
The "what" is ___slab_alloc() consuming 3.9% CPU in tip-rt-slub, whereas
it consumes < 1% in both tip-rt (sans slub patches) and tip-slub.
The "why" remains to be pondered.
5.13.0.g60ab3ed-tip-rt 5.13.0.g60ab3ed-tip-rt-slub 5.13.0.g60ab3ed-tip-slub
25.18% copy_user_enhanced_fast_string copy_user_enhanced_fast_string copy_user_enhanced_fast_string
5.08% unix_stream_read_generic unix_stream_read_generic unix_stream_read_generic
3.39% rt_spin_lock *** ___slab_alloc *** __skb_datagram_iter
2.80% __skb_datagram_iter rt_spin_lock _raw_spin_lock
2.11% get_page_from_freelist __skb_datagram_iter __alloc_skb
2.01% skb_release_data rt_spin_unlock skb_release_data
1.94% rt_spin_unlock get_page_from_freelist __alloc_pages
1.85% __alloc_skb migrate_enable unix_stream_sendmsg
1.68% __schedule skb_release_data _raw_spin_lock_irqsave
1.67% unix_stream_sendmsg __schedule free_pcppages_bulk
1.50% free_pcppages_bulk unix_stream_sendmsg __slab_free
1.38% migrate_enable free_pcppages_bulk __fget_light
1.24% __fget_light __alloc_pages vfs_write
1.16% __slab_free migrate_disable __schedule
1.14% __alloc_pages __fget_light get_page_from_freelist
1.10% fsnotify __slab_free new_sync_write
1.07% kfree fsnotify fsnotify
5.13.0.g60ab3ed-tip-rt-slub ___slab_alloc() consumes 3.90%
0.40 │ mov 0x28(%r13),%edx
0.42 │ add %r15,%rdx
│ __swab():
│ #endif
│
│ static __always_inline unsigned long __swab(const unsigned long y)
│ {
│ #if __BITS_PER_LONG == 64
│ return __swab64(y);
0.05 │ mov %rdx,%rax
1.14 │ bswap %rax
│ freelist_ptr():
│ return (void *)((unsigned long)ptr ^ s->random ^ <== CONFIG_SLAB_FREELIST_HARDENED
0.72 │ xor 0xb0(%r13),%rax
65.41 │ xor (%rdx),%rax <== huh? miss = 65% of that 3.9% kernel util?
│ next_tid():
│ return tid + TID_STEP;
0.09 │ addq $0x200,0x48(%r12)
│ ___slab_alloc():
│ * freelist is pointing to the list of objects to be used.
│ * page is pointing to the page from which the objects are obtained.
│ * That page must be frozen for per cpu allocations to work.
│ */
│ VM_BUG_ON(!c->page->frozen);
│ c->freelist = get_freepointer(s, freelist);
0.05 │ mov %rax,0x40(%r12)
│ c->tid = next_tid(c->tid);
│ local_unlock_irqrestore(&s->cpu_slab->lock, flags);
5.13.0.g60ab3ed-tip-rt ___slab_alloc() consumes < 1%
Percent│ }
│
│ /* must check again c->freelist in case of cpu migration or IRQ */
│ freelist = c->freelist;
0.02 │ a1: mov (%r14),%r13
│ if (freelist)
│ test %r13,%r13
0.02 │ ↓ je 460
│ get_freepointer():
│ return freelist_dereference(s, object + s->offset);
0.23 │ ad: mov 0x28(%r12),%edx
0.18 │ add %r13,%rdx
│ __swab():
│ #endif
│
│ static __always_inline unsigned long __swab(const unsigned long y)
│ {
│ #if __BITS_PER_LONG == 64
│ return __swab64(y);
0.06 │ mov %rdx,%rax
1.16 │ bswap %rax
│ freelist_ptr():
│ return (void *)((unsigned long)ptr ^ s->random ^
0.23 │ xor 0xb0(%r12),%rax
35.25 │ xor (%rdx),%rax <== 35% of < 1% kernel util
│ next_tid():
│ return tid + TID_STEP;
0.28 │ addq $0x200,0x8(%r14)
│ ___slab_alloc():
│ * freelist is pointing to the list of objects to be used.
│ * page is pointing to the page from which the objects are obtained.
│ * That page must be frozen for per cpu allocations to work.
│ */
│ VM_BUG_ON(!c->page->frozen);
│ c->freelist = get_freepointer(s, freelist);
5.13.0.g60ab3ed-tip-slub ___slab_alloc() also consumes < 1%
Percent│ load_freelist:
│
│ lockdep_assert_held(this_cpu_ptr(&s->cpu_slab->lock));
0.28 │ 84: add this_cpu_off,%rax
│ get_freepointer():
│ return freelist_dereference(s, object + s->offset);
0.14 │ mov 0x28(%r14),%eax
│ ___slab_alloc():
│ * freelist is pointing to the list of objects to be used.
│ * page is pointing to the page from which the objects are obtained.
│ * That page must be frozen for per cpu allocations to work.
│ */
│ VM_BUG_ON(!c->page->frozen);
│ c->freelist = get_freepointer(s, freelist);
34.36 │ mov 0x0(%r13,%rax,1),%rax
│ next_tid():
│ return tid + TID_STEP;
0.10 │ addq $0x1,0x8(%r12)
│ ___slab_alloc():
│ c->freelist = get_freepointer(s, freelist);
0.04 │ mov %rax,(%r12)
│ c->tid = next_tid(c->tid);
│ local_unlock_irqrestore(&s->cpu_slab->lock, flags);
0.12 │ mov (%r14),%rax
0.03 │ add this_cpu_off,%rax
│ arch_local_irq_restore():
│ return arch_irqs_disabled_flags(flags);
│ }