All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Pankaj Gupta <pagupta@redhat.com>,
	Tariq Toukan <ttoukan.linux@gmail.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	netdev@vger.kernel.org, akpm@linux-foundation.org,
	linux-mm <linux-mm@kvack.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	linux-kernel@vger.kernel.org, brouer@redhat.com
Subject: Re: in_irq_or_nmi() and RFC patch
Date: Thu, 30 Mar 2017 17:07:08 +0200	[thread overview]
Message-ID: <20170330170708.084bd16c@redhat.com> (raw)
In-Reply-To: <20170330130436.l37yazbxlrkvcbf3@techsingularity.net>

On Thu, 30 Mar 2017 14:04:36 +0100
Mel Gorman <mgorman@techsingularity.net> wrote:

> On Wed, Mar 29, 2017 at 09:44:41PM +0200, Jesper Dangaard Brouer wrote:
> > > Regardless or using in_irq() (or in combi with in_nmi()) I get the
> > > following warning below:
> > > 
> > > [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.11.0-rc3-net-next-page-alloc-softirq+ root=UUID=2e8451ff-6797-49b5-8d3a-eed5a42d7dc9 ro rhgb quiet LANG=en_DK.UTF
> > > -8
> > > [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
> > > [    0.000000] ------------[ cut here ]------------
> > > [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x70/0x90
> > > [    0.000000] Modules linked in:
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc3-net-next-page-alloc-softirq+ #235
> > > [    0.000000] Hardware name: MSI MS-7984/Z170A GAMING PRO (MS-7984), BIOS 1.60 12/16/2015
> > > [    0.000000] Call Trace:
> > > [    0.000000]  dump_stack+0x4f/0x73
> > > [    0.000000]  __warn+0xcb/0xf0
> > > [    0.000000]  warn_slowpath_null+0x1d/0x20
> > > [    0.000000]  __local_bh_enable_ip+0x70/0x90
> > > [    0.000000]  free_hot_cold_page+0x1a4/0x2f0
> > > [    0.000000]  __free_pages+0x1f/0x30
> > > [    0.000000]  __free_pages_bootmem+0xab/0xb8
> > > [    0.000000]  __free_memory_core+0x79/0x91
> > > [    0.000000]  free_all_bootmem+0xaa/0x122
> > > [    0.000000]  mem_init+0x71/0xa4
> > > [    0.000000]  start_kernel+0x1e5/0x3f1
> > > [    0.000000]  x86_64_start_reservations+0x2a/0x2c
> > > [    0.000000]  x86_64_start_kernel+0x178/0x18b
> > > [    0.000000]  start_cpu+0x14/0x14
> > > [    0.000000]  ? start_cpu+0x14/0x14
> > > [    0.000000] ---[ end trace a57944bec8fc985c ]---
> > > [    0.000000] Memory: 32739472K/33439416K available (7624K kernel code, 1528K rwdata, 3168K rodata, 1860K init, 2260K bss, 699944K reserved, 0K cma-reserved)
> > > 
> > > And kernel/softirq.c:161 contains:
> > > 
> > >  WARN_ON_ONCE(in_irq() || irqs_disabled());
> > > 
> > > Thus, I don't think the change in my RFC-patch[1] is safe.
> > > Of changing[2] to support softirq allocations by replacing
> > > preempt_disable() with local_bh_disable().
> > > 
> > > [1] http://lkml.kernel.org/r/20170327143947.4c237e54@redhat.com
> > > 
> > > [2] commit 374ad05ab64d ("mm, page_alloc: only use per-cpu allocator for irq-safe requests")
> > >  https://git.kernel.org/torvalds/c/374ad05ab64d  
> > 
> > A patch that avoids the above warning is inlined below, but I'm not
> > sure if this is best direction.  Or we should rather consider reverting
> > part of commit 374ad05ab64d to avoid the softirq performance regression?
> >    
> 
> At the moment, I'm not seeing a better alternative. If this works, I
> think it would still be far superior in terms of performance than a
> revert. 

Started performance benchmarking:
 163 cycles = current state
 183 cycles = with BH disable + in_irq
 218 cycles = with BH disable + in_irq + irqs_disabled

Thus, the performance numbers unfortunately looks bad, once we add the
test for irqs_disabled().  The slowdown by replacing preempt_disable
with BH-disable is still a win (we saved 29 cycles before, and loose
20, I was expecting regression to be only 10 cycles).

Bad things happen when adding the test for irqs_disabled().  This
likely happens because it uses the "pushfq + pop" to read CPU flags.  I
wonder if X86-experts know if e.g. using "lahf" would be faster (and if
it also loads the interrupt flag X86_EFLAGS_IF)?

We basically lost more (163-218=-55) than we gained (29) :-(


> As before, if there are bad consequences to adding a BH
> rescheduling point then we'll have to revert. However, I don't like a
> revert being the first option as it'll keep encouraging drivers to build
> sub-allocators to avoid the page allocator.

I'm also motivated by speeding up the page allocator to avoid this
happening in all the drivers.

> > [PATCH] mm, page_alloc: re-enable softirq use of per-cpu page allocator
> > 
> > From: Jesper Dangaard Brouer <brouer@redhat.com>
> >   
> 
> Other than the slightly misleading comments about NMI which could
> explain "this potentially misses an NMI but an NMI allocating pages is
> brain damaged", I don't see a problem. The irqs_disabled() check is a
> subtle but it's not earth shattering and it still helps the 100GiB cases
> with the limited cycle budget to process packets.


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

WARNING: multiple messages have this Message-ID (diff)
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Pankaj Gupta <pagupta@redhat.com>,
	Tariq Toukan <ttoukan.linux@gmail.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	netdev@vger.kernel.org, akpm@linux-foundation.org,
	linux-mm <linux-mm@kvack.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	linux-kernel@vger.kernel.org, brouer@redhat.com
Subject: Re: in_irq_or_nmi() and RFC patch
Date: Thu, 30 Mar 2017 17:07:08 +0200	[thread overview]
Message-ID: <20170330170708.084bd16c@redhat.com> (raw)
In-Reply-To: <20170330130436.l37yazbxlrkvcbf3@techsingularity.net>

On Thu, 30 Mar 2017 14:04:36 +0100
Mel Gorman <mgorman@techsingularity.net> wrote:

> On Wed, Mar 29, 2017 at 09:44:41PM +0200, Jesper Dangaard Brouer wrote:
> > > Regardless or using in_irq() (or in combi with in_nmi()) I get the
> > > following warning below:
> > > 
> > > [    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.11.0-rc3-net-next-page-alloc-softirq+ root=UUID=2e8451ff-6797-49b5-8d3a-eed5a42d7dc9 ro rhgb quiet LANG=en_DK.UTF
> > > -8
> > > [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
> > > [    0.000000] ------------[ cut here ]------------
> > > [    0.000000] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:161 __local_bh_enable_ip+0x70/0x90
> > > [    0.000000] Modules linked in:
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc3-net-next-page-alloc-softirq+ #235
> > > [    0.000000] Hardware name: MSI MS-7984/Z170A GAMING PRO (MS-7984), BIOS 1.60 12/16/2015
> > > [    0.000000] Call Trace:
> > > [    0.000000]  dump_stack+0x4f/0x73
> > > [    0.000000]  __warn+0xcb/0xf0
> > > [    0.000000]  warn_slowpath_null+0x1d/0x20
> > > [    0.000000]  __local_bh_enable_ip+0x70/0x90
> > > [    0.000000]  free_hot_cold_page+0x1a4/0x2f0
> > > [    0.000000]  __free_pages+0x1f/0x30
> > > [    0.000000]  __free_pages_bootmem+0xab/0xb8
> > > [    0.000000]  __free_memory_core+0x79/0x91
> > > [    0.000000]  free_all_bootmem+0xaa/0x122
> > > [    0.000000]  mem_init+0x71/0xa4
> > > [    0.000000]  start_kernel+0x1e5/0x3f1
> > > [    0.000000]  x86_64_start_reservations+0x2a/0x2c
> > > [    0.000000]  x86_64_start_kernel+0x178/0x18b
> > > [    0.000000]  start_cpu+0x14/0x14
> > > [    0.000000]  ? start_cpu+0x14/0x14
> > > [    0.000000] ---[ end trace a57944bec8fc985c ]---
> > > [    0.000000] Memory: 32739472K/33439416K available (7624K kernel code, 1528K rwdata, 3168K rodata, 1860K init, 2260K bss, 699944K reserved, 0K cma-reserved)
> > > 
> > > And kernel/softirq.c:161 contains:
> > > 
> > >  WARN_ON_ONCE(in_irq() || irqs_disabled());
> > > 
> > > Thus, I don't think the change in my RFC-patch[1] is safe.
> > > Of changing[2] to support softirq allocations by replacing
> > > preempt_disable() with local_bh_disable().
> > > 
> > > [1] http://lkml.kernel.org/r/20170327143947.4c237e54@redhat.com
> > > 
> > > [2] commit 374ad05ab64d ("mm, page_alloc: only use per-cpu allocator for irq-safe requests")
> > >  https://git.kernel.org/torvalds/c/374ad05ab64d  
> > 
> > A patch that avoids the above warning is inlined below, but I'm not
> > sure if this is best direction.  Or we should rather consider reverting
> > part of commit 374ad05ab64d to avoid the softirq performance regression?
> >    
> 
> At the moment, I'm not seeing a better alternative. If this works, I
> think it would still be far superior in terms of performance than a
> revert. 

Started performance benchmarking:
 163 cycles = current state
 183 cycles = with BH disable + in_irq
 218 cycles = with BH disable + in_irq + irqs_disabled

Thus, the performance numbers unfortunately looks bad, once we add the
test for irqs_disabled().  The slowdown by replacing preempt_disable
with BH-disable is still a win (we saved 29 cycles before, and loose
20, I was expecting regression to be only 10 cycles).

Bad things happen when adding the test for irqs_disabled().  This
likely happens because it uses the "pushfq + pop" to read CPU flags.  I
wonder if X86-experts know if e.g. using "lahf" would be faster (and if
it also loads the interrupt flag X86_EFLAGS_IF)?

We basically lost more (163-218=-55) than we gained (29) :-(


> As before, if there are bad consequences to adding a BH
> rescheduling point then we'll have to revert. However, I don't like a
> revert being the first option as it'll keep encouraging drivers to build
> sub-allocators to avoid the page allocator.

I'm also motivated by speeding up the page allocator to avoid this
happening in all the drivers.

> > [PATCH] mm, page_alloc: re-enable softirq use of per-cpu page allocator
> > 
> > From: Jesper Dangaard Brouer <brouer@redhat.com>
> >   
> 
> Other than the slightly misleading comments about NMI which could
> explain "this potentially misses an NMI but an NMI allocating pages is
> brain damaged", I don't see a problem. The irqs_disabled() check is a
> subtle but it's not earth shattering and it still helps the 100GiB cases
> with the limited cycle budget to process packets.


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-03-30 15:07 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-27 20:25 [merged] mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch removed from -mm tree akpm
2017-03-01 13:48 ` Page allocator order-0 optimizations merged Jesper Dangaard Brouer
2017-03-01 17:36   ` Tariq Toukan
2017-03-01 17:36     ` Tariq Toukan
2017-03-22 17:39     ` Tariq Toukan
2017-03-22 17:39       ` Tariq Toukan
2017-03-22 23:40       ` Mel Gorman
2017-03-23 13:43         ` Jesper Dangaard Brouer
2017-03-23 14:51           ` Mel Gorman
2017-03-26  8:21             ` Tariq Toukan
2017-03-26 10:17               ` Tariq Toukan
2017-03-27  7:32                 ` Pankaj Gupta
2017-03-27  8:55                   ` Jesper Dangaard Brouer
2017-03-27 12:28                     ` Mel Gorman
2017-03-27 12:39                     ` Jesper Dangaard Brouer
2017-03-27 13:32                       ` Mel Gorman
2017-03-28  7:32                         ` Tariq Toukan
2017-03-28  8:29                           ` Jesper Dangaard Brouer
2017-03-28 16:05                           ` Tariq Toukan
2017-03-28 18:24                             ` Jesper Dangaard Brouer
2017-03-29  7:13                               ` Tariq Toukan
2017-03-28  8:28                         ` Pankaj Gupta
2017-03-27 14:15                       ` Matthew Wilcox
2017-03-27 14:15                         ` Matthew Wilcox
2017-03-27 15:15                         ` Jesper Dangaard Brouer
2017-03-27 16:58                           ` in_irq_or_nmi() Matthew Wilcox
2017-03-27 16:58                             ` in_irq_or_nmi() Matthew Wilcox
2017-03-27 16:58                             ` in_irq_or_nmi() Matthew Wilcox
2017-03-29  8:12                             ` in_irq_or_nmi() Peter Zijlstra
2017-03-29  8:12                               ` in_irq_or_nmi() Peter Zijlstra
2017-03-29  8:12                               ` in_irq_or_nmi() Peter Zijlstra
2017-03-29  8:59                               ` in_irq_or_nmi() Jesper Dangaard Brouer
2017-03-29  8:59                                 ` in_irq_or_nmi() Jesper Dangaard Brouer
2017-03-29  9:19                                 ` in_irq_or_nmi() Peter Zijlstra
2017-03-29  9:19                                   ` in_irq_or_nmi() Peter Zijlstra
2017-03-29  9:19                                   ` in_irq_or_nmi() Peter Zijlstra
2017-03-29 18:12                                   ` in_irq_or_nmi() Matthew Wilcox
2017-03-29 18:12                                     ` in_irq_or_nmi() Matthew Wilcox
2017-03-29 19:11                                     ` in_irq_or_nmi() Jesper Dangaard Brouer
2017-03-29 19:11                                       ` in_irq_or_nmi() Jesper Dangaard Brouer
2017-03-29 19:44                                       ` in_irq_or_nmi() and RFC patch Jesper Dangaard Brouer
2017-03-29 19:44                                         ` Jesper Dangaard Brouer
2017-03-30  6:49                                         ` Peter Zijlstra
2017-03-30  6:49                                           ` Peter Zijlstra
2017-03-30  7:12                                           ` Jesper Dangaard Brouer
2017-03-30  7:12                                             ` Jesper Dangaard Brouer
2017-03-30  7:35                                             ` Peter Zijlstra
2017-03-30  7:35                                               ` Peter Zijlstra
2017-03-30  9:46                                               ` Jesper Dangaard Brouer
2017-03-30  9:46                                                 ` Jesper Dangaard Brouer
2017-03-30 13:04                                         ` Mel Gorman
2017-03-30 13:04                                           ` Mel Gorman
2017-03-30 15:07                                           ` Jesper Dangaard Brouer [this message]
2017-03-30 15:07                                             ` Jesper Dangaard Brouer
2017-04-03 12:05                                             ` Mel Gorman
2017-04-03 12:05                                               ` Mel Gorman
2017-04-05  8:53                                               ` Mel Gorman
2017-04-05  8:53                                                 ` Mel Gorman
2017-04-10 14:31   ` Page allocator order-0 optimizations merged zhong jiang
2017-04-10 14:31     ` zhong jiang
2017-04-10 15:10     ` Mel Gorman
2017-04-11  1:54       ` zhong jiang
2017-04-11  1:54         ` zhong jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170330170708.084bd16c@redhat.com \
    --to=brouer@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=netdev@vger.kernel.org \
    --cc=pagupta@redhat.com \
    --cc=peterz@infradead.org \
    --cc=saeedm@mellanox.com \
    --cc=tariqt@mellanox.com \
    --cc=ttoukan.linux@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.