From: Chris Metcalf <cmetcalf@tilera.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, Thomas Gleixner <tglx@linutronix.de>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Cody P Schafer <cody@linux.vnet.ibm.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [PATCH v4 2/2] mm: make lru_add_drain_all() selective
Date: Tue, 13 Aug 2013 19:04:47 -0400
Message-ID: <520ABB8F.6000709@tilera.com>
In-Reply-To: <20130813152622.f15dcaaa672ba182308ce29f@linux-foundation.org>

On 8/13/2013 6:26 PM, Andrew Morton wrote:
> On Tue, 13 Aug 2013 18:13:48 -0400 Chris Metcalf <cmetcalf@tilera.com> wrote:
>
>> On 8/13/2013 5:13 PM, Andrew Morton wrote:
>>> On Tue, 13 Aug 2013 16:59:54 -0400 Chris Metcalf <cmetcalf@tilera.com> wrote:
>>>
>>>>> Then again, why does this patchset exist?  It's a performance
>>>>> optimisation so presumably someone cares.  But not enough to perform
>>>>> actual measurements :(
>>>> The patchset exists because of the difference between zero overhead on
>>>> cpus that don't have drainable lrus, and non-zero overhead.  This turns
>>>> out to be important on workloads where nohz cores are handling 10 Gb
>>>> traffic in userspace and really, really don't want to be interrupted,
>>>> or they drop packets on the floor.
>>> But what is the effect of the patchset?  Has it been tested against the
>>> problematic workload(s)?
>> Yes.  The result is that syscalls such as mlockall(), which otherwise interrupt
>> every core, don't interrupt the cores that are running purely in userspace.
>> Since they are purely in userspace they don't have any drainable pagevecs,
>> so the patchset means they don't get interrupted and don't drop packets.
>>
>> I implemented this against Linux 2.6.38 and our home-grown version of nohz
>> cpusets back in July 2012, and we have been shipping it to customers since then.
> argh.
>
> Those per-cpu LRU pagevecs were a nasty but very effective locking
> amortization hack back in, umm, 2002.  They have caused quite a lot of
> weird corner-case behaviour, resulting in all the lru_add_drain_all()
> calls sprinkled around the place.  I'd like to nuke the whole thing,
> but that would require a fundamental rethink/rework of all the LRU list
> locking.
>
> According to the 8891d6da17db0f changelog, the lru_add_drain_all() in
> sys_mlock() isn't really required: "it isn't must.  but it reduce the
> failure of moving to unevictable list.  its failure can rescue in
> vmscan later.  but reducing is better."
>
> I suspect we could just kill it.

That's probably true, but I suspect this change is still worthwhile for
nohz environments.  There are other calls of lru_add_drain_all(), and
you just don't want anything in the kernel that interrupts every core
when only a subset could be interrupted.  If the kernel can avoid
generating unnecessary interrupts to uninvolved cores, you can make
guarantees about jitter on cores that are running dedicated userspace code.
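
To make the distinction concrete, here is a minimal userspace sketch in C of
"drain everything" versus "drain only the cores that have something to
drain".  It is not the kernel patch and uses no kernel APIs: struct cpu_state,
drain_all() and drain_selective() are hypothetical names invented for this
illustration, and an "interrupt" is modelled as nothing more than a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one core's per-cpu LRU pagevec state. */
struct cpu_state {
	unsigned int pending_pages;	/* pages waiting to be drained */
	unsigned int interrupts;	/* times this core was disturbed */
};

/* Unconditional drain: every core is disturbed, even with nothing pending. */
static unsigned int drain_all(struct cpu_state cpus[], size_t n)
{
	unsigned int disturbed = 0;

	for (size_t i = 0; i < n; i++) {
		cpus[i].interrupts++;
		cpus[i].pending_pages = 0;
		disturbed++;
	}
	return disturbed;
}

/* Selective drain: build a mask of cores with pending pages, touch only those. */
static unsigned int drain_selective(struct cpu_state cpus[], size_t n)
{
	uint64_t mask = 0;
	unsigned int disturbed = 0;

	assert(n <= 64);	/* the sketch uses a single 64-bit mask */

	for (size_t i = 0; i < n; i++)
		if (cpus[i].pending_pages)
			mask |= UINT64_C(1) << i;

	for (size_t i = 0; i < n; i++) {
		if (!(mask & (UINT64_C(1) << i)))
			continue;	/* nothing to drain: leave this core alone */
		cpus[i].interrupts++;
		cpus[i].pending_pages = 0;
		disturbed++;
	}
	return disturbed;
}
```

In this model a core that stays purely in userspace never accumulates pending
pages, so drain_selective() never increments its interrupt counter, while
drain_all() disturbs it on every call.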

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



