* [patch] x86, perf: avoid spamming kernel log for bts buffer failure @ 2014-06-30 23:04 David Rientjes 2014-07-01 9:34 ` Peter Zijlstra ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: David Rientjes @ 2014-06-30 23:04 UTC (permalink / raw) To: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo Cc: Stephane Eranian, x86, linux-kernel It's unnecessary to excessively spam the kernel log anytime the BTS buffer cannot be allocated, so make this allocation __GFP_NOWARN. The user probably will want to at least find some artifact that the allocation has failed in the past, probably due to fragmentation because of its large size, when it's not allocated at bootstrap. Thus, add a WARN_ONCE() so something is left behind for them to understand why perf commnads that require PEBS is not working properly. Signed-off-by: David Rientjes <rientjes@google.com> --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -311,9 +311,11 @@ static int alloc_bts_buffer(int cpu) if (!x86_pmu.bts) return 0; - buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL, node); - if (unlikely(!buffer)) + buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_NOWARN, node); + if (unlikely(!buffer)) { + WARN_ONCE(1, "%s: BTS buffer allocation failure\n", __func__); return -ENOMEM; + } max = BTS_BUFFER_SIZE / BTS_RECORD_SIZE; thresh = max / 16; ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-06-30 23:04 [patch] x86, perf: avoid spamming kernel log for bts buffer failure David Rientjes @ 2014-07-01 9:34 ` Peter Zijlstra 2014-07-02 13:16 ` Stephane Eranian 2014-07-14 23:33 ` David Rientjes 2014-07-16 19:21 ` [tip:perf/urgent] perf/x86/intel: Avoid spamming kernel log for BTS " tip-bot for David Rientjes 2 siblings, 1 reply; 10+ messages in thread From: Peter Zijlstra @ 2014-07-01 9:34 UTC (permalink / raw) To: David Rientjes Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian, x86, linux-kernel [-- Attachment #1: Type: text/plain, Size: 960 bytes --] On Mon, Jun 30, 2014 at 04:04:08PM -0700, David Rientjes wrote: > It's unnecessary to excessively spam the kernel log anytime the BTS buffer > cannot be allocated, so make this allocation __GFP_NOWARN. > > The user probably will want to at least find some artifact that the > allocation has failed in the past, probably due to fragmentation because > of its large size, when it's not allocated at bootstrap. Thus, add a > WARN_ONCE() so something is left behind for them to understand why perf > commnads that require PEBS is not working properly. Can you elaborate a bit under which conditions this triggered? Typically we should be doing fairly well allocating such buffers with GFP_KERNEL, that should allow things like compaction to run and create higher order pages. And the BTS (branch trace store) isn't _that_ large. That said, the patch is reasonable; although arguably we should maybe do the same to alloc_pebs_buffer(). [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-01 9:34 ` Peter Zijlstra @ 2014-07-02 13:16 ` Stephane Eranian 2014-07-02 13:31 ` Peter Zijlstra 0 siblings, 1 reply; 10+ messages in thread From: Stephane Eranian @ 2014-07-02 13:16 UTC (permalink / raw) To: Peter Zijlstra Cc: David Rientjes, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, x86, LKML On Tue, Jul 1, 2014 at 11:34 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Mon, Jun 30, 2014 at 04:04:08PM -0700, David Rientjes wrote: >> It's unnecessary to excessively spam the kernel log anytime the BTS buffer >> cannot be allocated, so make this allocation __GFP_NOWARN. >> >> The user probably will want to at least find some artifact that the >> allocation has failed in the past, probably due to fragmentation because >> of its large size, when it's not allocated at bootstrap. Thus, add a >> WARN_ONCE() so something is left behind for them to understand why perf >> commnads that require PEBS is not working properly. > > Can you elaborate a bit under which conditions this triggered? Typically > we should be doing fairly well allocating such buffers with GFP_KERNEL, > that should allow things like compaction to run and create higher order > pages. > I think this triggers when you have fragmented memory and you have perf_events active and inactive (i.e., 0 users = no nmi watchdog) frequently. Each first user invokes the reserve_ds() function to reserve DS, PEBS, BTS. The reason for BTS rather then PEBS is the size of the allocation. PEBS allocates one page, i.e., less likely to get a failure than BTS which allocates 4 pages, I think. David and I discussed this. He can probably add more background info, if needed. > And the BTS (branch trace store) isn't _that_ large. > > That said, the patch is reasonable; although arguably we should maybe do > the same to alloc_pebs_buffer(). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-02 13:16 ` Stephane Eranian @ 2014-07-02 13:31 ` Peter Zijlstra 2014-07-02 13:38 ` Stephane Eranian 2014-07-02 23:30 ` David Rientjes 0 siblings, 2 replies; 10+ messages in thread From: Peter Zijlstra @ 2014-07-02 13:31 UTC (permalink / raw) To: Stephane Eranian Cc: David Rientjes, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, x86, LKML [-- Attachment #1: Type: text/plain, Size: 1758 bytes --] On Wed, Jul 02, 2014 at 03:16:40PM +0200, Stephane Eranian wrote: > On Tue, Jul 1, 2014 at 11:34 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Jun 30, 2014 at 04:04:08PM -0700, David Rientjes wrote: > >> It's unnecessary to excessively spam the kernel log anytime the BTS buffer > >> cannot be allocated, so make this allocation __GFP_NOWARN. > >> > >> The user probably will want to at least find some artifact that the > >> allocation has failed in the past, probably due to fragmentation because > >> of its large size, when it's not allocated at bootstrap. Thus, add a > >> WARN_ONCE() so something is left behind for them to understand why perf > >> commnads that require PEBS is not working properly. > > > > Can you elaborate a bit under which conditions this triggered? Typically > > we should be doing fairly well allocating such buffers with GFP_KERNEL, > > that should allow things like compaction to run and create higher order > > pages. > > > I think this triggers when you have fragmented memory and you have > perf_events active and inactive (i.e., 0 users = no nmi watchdog) frequently. > Each first user invokes the reserve_ds() function to reserve DS, PEBS, BTS. Right, that'd suck. I suppose we could also change that to allocate the DS resources on first demand and never free them again. So only allocate the PEBS buffer when we create the first PEBS event, and idem for the BTS muck. > The reason for BTS rather then PEBS is the size of the allocation. > PEBS allocates one page, i.e., less likely to get a failure than BTS > which allocates 4 pages, I think. Sure.. > David and I discussed this. He can probably add more background > info, if needed. It would still be good to see why compaction etc is failing. [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-02 13:31 ` Peter Zijlstra @ 2014-07-02 13:38 ` Stephane Eranian 2014-07-02 14:54 ` Peter Zijlstra 2014-07-02 23:30 ` David Rientjes 1 sibling, 1 reply; 10+ messages in thread From: Stephane Eranian @ 2014-07-02 13:38 UTC (permalink / raw) To: Peter Zijlstra Cc: David Rientjes, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, x86, LKML On Wed, Jul 2, 2014 at 3:31 PM, Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Jul 02, 2014 at 03:16:40PM +0200, Stephane Eranian wrote: >> On Tue, Jul 1, 2014 at 11:34 AM, Peter Zijlstra <peterz@infradead.org> wrote: >> > On Mon, Jun 30, 2014 at 04:04:08PM -0700, David Rientjes wrote: >> >> It's unnecessary to excessively spam the kernel log anytime the BTS buffer >> >> cannot be allocated, so make this allocation __GFP_NOWARN. >> >> >> >> The user probably will want to at least find some artifact that the >> >> allocation has failed in the past, probably due to fragmentation because >> >> of its large size, when it's not allocated at bootstrap. Thus, add a >> >> WARN_ONCE() so something is left behind for them to understand why perf >> >> commnads that require PEBS is not working properly. >> > >> > Can you elaborate a bit under which conditions this triggered? Typically >> > we should be doing fairly well allocating such buffers with GFP_KERNEL, >> > that should allow things like compaction to run and create higher order >> > pages. >> > >> I think this triggers when you have fragmented memory and you have >> perf_events active and inactive (i.e., 0 users = no nmi watchdog) frequently. >> Each first user invokes the reserve_ds() function to reserve DS, PEBS, BTS. > > Right, that'd suck. I suppose we could also change that to allocate the > DS resources on first demand and never free them again. > Some may argue that if you never use perf_event again, you are wasting (1 + 1 + 4) pages per CPU. That may not be okay on some systems. But yes, it would avoid this problem and also take the penalty for the allocs only once. > So only allocate the PEBS buffer when we create the first PEBS event, > and idem for the BTS muck. > >> The reason for BTS rather then PEBS is the size of the allocation. >> PEBS allocates one page, i.e., less likely to get a failure than BTS >> which allocates 4 pages, I think. > > Sure.. > >> David and I discussed this. He can probably add more background >> info, if needed. > > It would still be good to see why compaction etc is failing. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-02 13:38 ` Stephane Eranian @ 2014-07-02 14:54 ` Peter Zijlstra 0 siblings, 0 replies; 10+ messages in thread From: Peter Zijlstra @ 2014-07-02 14:54 UTC (permalink / raw) To: Stephane Eranian Cc: David Rientjes, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, x86, LKML [-- Attachment #1: Type: text/plain, Size: 700 bytes --] On Wed, Jul 02, 2014 at 03:38:50PM +0200, Stephane Eranian wrote: > On Wed, Jul 2, 2014 at 3:31 PM, Peter Zijlstra <peterz@infradead.org> wrote: > > Right, that'd suck. I suppose we could also change that to allocate the > > DS resources on first demand and never free them again. > > > Some may argue that if you never use perf_event again, you are wasting > (1 + 1 + 4) pages per CPU. That may not be okay on some systems. > > But yes, it would avoid this problem and also take the penalty for the allocs > only once. We could of course over engineer this and put a timer on them to free after 5 minutes to avoid the alloc/free cycle for workloads that create/destroy events a lot. [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-02 13:31 ` Peter Zijlstra 2014-07-02 13:38 ` Stephane Eranian @ 2014-07-02 23:30 ` David Rientjes 1 sibling, 0 replies; 10+ messages in thread From: David Rientjes @ 2014-07-02 23:30 UTC (permalink / raw) To: Peter Zijlstra Cc: Stephane Eranian, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, x86, LKML On Wed, 2 Jul 2014, Peter Zijlstra wrote: > > David and I discussed this. He can probably add more background > > info, if needed. > > It would still be good to see why compaction etc is failing. > "Why compaction is failing" has been the story of my life for the past few weeks, unfortunately. One person ran into this and here is the breakdown at the time of the page allocation failure: Node 0 DMA32: 12478*4kB 7595*8kB 2087*16kB 26*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 144896kB Node 0 Normal: 55037*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 222196kB Node 1 Normal: 165860*4kB 18*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 663600kB So we really don't have any order-4 or higher memory and if compaction and reclaim fails, then the allocation failure is printed every time. This is what the patch is trying to address and we can't guarantee order-4 is always allocatable even with GFP_KERNEL. This allocation is just outside the PAGE_ALLOC_COSTLY_ORDER == 3 threshold that would have oom killed something and retried; the oom killer is deferred for this case because there's no guarantee killing a process would have resulted in order-4 memory being available (and nobody wants killing when they are 902MB above the per-zone min watermarks like this poor guy). ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-06-30 23:04 [patch] x86, perf: avoid spamming kernel log for bts buffer failure David Rientjes 2014-07-01 9:34 ` Peter Zijlstra @ 2014-07-14 23:33 ` David Rientjes 2014-07-15 8:53 ` Peter Zijlstra 2014-07-16 19:21 ` [tip:perf/urgent] perf/x86/intel: Avoid spamming kernel log for BTS " tip-bot for David Rientjes 2 siblings, 1 reply; 10+ messages in thread From: David Rientjes @ 2014-07-14 23:33 UTC (permalink / raw) To: Peter Zijlstra, Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo Cc: Stephane Eranian, x86, linux-kernel On Mon, 30 Jun 2014, David Rientjes wrote: > It's unnecessary to excessively spam the kernel log anytime the BTS buffer > cannot be allocated, so make this allocation __GFP_NOWARN. > > The user probably will want to at least find some artifact that the > allocation has failed in the past, probably due to fragmentation because > of its large size, when it's not allocated at bootstrap. Thus, add a > WARN_ONCE() so something is left behind for them to understand why perf > commnads that require PEBS is not working properly. > > Signed-off-by: David Rientjes <rientjes@google.com> Peter, would you like to ack this? Trying to get this merged in time for the next merge window. > --- > arch/x86/kernel/cpu/perf_event_intel_ds.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c > --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c > +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c > @@ -311,9 +311,11 @@ static int alloc_bts_buffer(int cpu) > if (!x86_pmu.bts) > return 0; > > - buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL, node); > - if (unlikely(!buffer)) > + buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_NOWARN, node); > + if (unlikely(!buffer)) { > + WARN_ONCE(1, "%s: BTS buffer allocation failure\n", __func__); > return -ENOMEM; > + } > > max = BTS_BUFFER_SIZE / BTS_RECORD_SIZE; > thresh = max / 16; ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [patch] x86, perf: avoid spamming kernel log for bts buffer failure 2014-07-14 23:33 ` David Rientjes @ 2014-07-15 8:53 ` Peter Zijlstra 0 siblings, 0 replies; 10+ messages in thread From: Peter Zijlstra @ 2014-07-15 8:53 UTC (permalink / raw) To: David Rientjes Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo, Stephane Eranian, x86, linux-kernel [-- Attachment #1: Type: text/plain, Size: 848 bytes --] On Mon, Jul 14, 2014 at 04:33:47PM -0700, David Rientjes wrote: > On Mon, 30 Jun 2014, David Rientjes wrote: > > > It's unnecessary to excessively spam the kernel log anytime the BTS buffer > > cannot be allocated, so make this allocation __GFP_NOWARN. > > > > The user probably will want to at least find some artifact that the > > allocation has failed in the past, probably due to fragmentation because > > of its large size, when it's not allocated at bootstrap. Thus, add a > > WARN_ONCE() so something is left behind for them to understand why perf > > commnads that require PEBS is not working properly. > > > > Signed-off-by: David Rientjes <rientjes@google.com> > > Peter, would you like to ack this? Trying to get this merged in time for > the next merge window. I've queued it; still don't like it though. [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* [tip:perf/urgent] perf/x86/intel: Avoid spamming kernel log for BTS buffer failure 2014-06-30 23:04 [patch] x86, perf: avoid spamming kernel log for bts buffer failure David Rientjes 2014-07-01 9:34 ` Peter Zijlstra 2014-07-14 23:33 ` David Rientjes @ 2014-07-16 19:21 ` tip-bot for David Rientjes 2 siblings, 0 replies; 10+ messages in thread From: tip-bot for David Rientjes @ 2014-07-16 19:21 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, torvalds, peterz, acme, tglx, rientjes Commit-ID: 4485154138f6ffa5b252cb490aba3e8eb30124e4 Gitweb: http://git.kernel.org/tip/4485154138f6ffa5b252cb490aba3e8eb30124e4 Author: David Rientjes <rientjes@google.com> AuthorDate: Mon, 30 Jun 2014 16:04:08 -0700 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 16 Jul 2014 13:31:30 +0200 perf/x86/intel: Avoid spamming kernel log for BTS buffer failure It's unnecessary to excessively spam the kernel log anytime the BTS buffer cannot be allocated, so make this allocation __GFP_NOWARN. The user probably will want to at least find some artifact that the allocation has failed in the past, probably due to fragmentation because of its large size, when it's not allocated at bootstrap. Thus, add a WARN_ONCE() so something is left behind for them to understand why perf commnads that require PEBS is not working properly. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1406301600460.26302@chino.kir.corp.google.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 980970c..696ade3 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -311,9 +311,11 @@ static int alloc_bts_buffer(int cpu) if (!x86_pmu.bts) return 0; - buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL, node); - if (unlikely(!buffer)) + buffer = kzalloc_node(BTS_BUFFER_SIZE, GFP_KERNEL | __GFP_NOWARN, node); + if (unlikely(!buffer)) { + WARN_ONCE(1, "%s: BTS buffer allocation failure\n", __func__); return -ENOMEM; + } max = BTS_BUFFER_SIZE / BTS_RECORD_SIZE; thresh = max / 16; ^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2014-07-16 19:22 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-06-30 23:04 [patch] x86, perf: avoid spamming kernel log for bts buffer failure David Rientjes 2014-07-01 9:34 ` Peter Zijlstra 2014-07-02 13:16 ` Stephane Eranian 2014-07-02 13:31 ` Peter Zijlstra 2014-07-02 13:38 ` Stephane Eranian 2014-07-02 14:54 ` Peter Zijlstra 2014-07-02 23:30 ` David Rientjes 2014-07-14 23:33 ` David Rientjes 2014-07-15 8:53 ` Peter Zijlstra 2014-07-16 19:21 ` [tip:perf/urgent] perf/x86/intel: Avoid spamming kernel log for BTS " tip-bot for David Rientjes
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).