From: Christoph Lameter <cl@linux.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Vince Weaver <vincent.weaver@maine.edu>,
	linux-kernel@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
	trinity@vger.kernel.org, akpm@linux-foundation.org,
	torvalds@linux-foundation.org, roland@kernel.org,
	infinipath@qlogic.com, linux-mm@kvack.org,
	linux-rdma@vger.kernel.org, Or Gerlitz <or.gerlitz@gmail.com>,
	Hugh Dickins <hughd@google.com>
Subject: Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK
Date: Tue, 28 May 2013 16:37:06 +0000
Message-ID: <0000013eec0006ee-0f8caf7b-cc94-4f54-ae38-0ca6623b7841-000000@email.amazonses.com>
In-Reply-To: <20130527064834.GA2781@laptop>

On Mon, 27 May 2013, Peter Zijlstra wrote:

> Before your patch pinned was included in locked and thus RLIMIT_MEMLOCK
> had a single resource counter. After your patch RLIMIT_MEMLOCK is
> applied separately to both -- more or less.

Before the patch the count was doubled since a single page was counted
twice: once because it was mlocked (marked with PG_mlocked) and then again
because it was also pinned (the refcount was increased). Two different things.
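
To make the double counting concrete, here is a toy model in Python (not
kernel code). The class and helper names are invented for illustration; the
only thing taken from this thread is the idea of one shared counter versus
separate counters for locked and pinned pages.

```python
from dataclasses import dataclass


@dataclass
class SingleCounter:
    """Old scheme in this sketch: mlock and pinning charge one counter."""
    locked_vm: int = 0

    def mlock(self, pages: int) -> None:
        self.locked_vm += pages

    def pin(self, pages: int) -> None:
        # Pinning is charged to the same counter as mlock.
        self.locked_vm += pages


@dataclass
class SplitCounters:
    """New scheme in this sketch: mlocked and pinned are tracked apart."""
    locked_vm: int = 0
    pinned_vm: int = 0

    def mlock(self, pages: int) -> None:
        self.locked_vm += pages

    def pin(self, pages: int) -> None:
        self.pinned_vm += pages


def charge_both(acct, pages: int):
    """Charge the same pages once for mlock and once for pinning."""
    acct.mlock(pages)
    acct.pin(pages)
    return acct


old = charge_both(SingleCounter(), 1)
new = charge_both(SplitCounters(), 1)
print(old.locked_vm)                 # one page, charged twice to one counter
print(new.locked_vm, new.pinned_vm)  # one page, one charge per counter
```

With split counters, each one is compared against RLIMIT_MEMLOCK on its own,
which is the "applied separately to both" effect Peter describes above.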

We have agreed for a long time that mlocked pages are movable. That is not
true for pinned pages, so pinned pages do not fall into that category
(Hugh? AFAICR you came up with that rule?)
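
The rule can be sketched as a toy model, again in Python rather than kernel
code. The Page class and can_migrate() are hypothetical names; the sketch
only encodes what is claimed in this thread, namely that being mlocked alone
does not prevent migration while a long-term pin does.

```python
from dataclasses import dataclass


@dataclass
class Page:
    mlocked: bool = False  # stands in for the mlock page flag
    pins: int = 0          # extra references held for long-term I/O

    def can_migrate(self) -> bool:
        # Assumption of this sketch: only pins block migration;
        # the mlocked flag is deliberately ignored here.
        return self.pins == 0


mlocked_only = Page(mlocked=True)
pinned = Page(mlocked=True, pins=1)

print(mlocked_only.can_migrate())
print(pinned.can_migrate())
```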

> NO, mlocked pages are pages that do not leave core memory; IOW do not
> cause major faults. Pinning pages is a perfectly spec compliant mlock()
> implementation.

That is not the definition that we have used so far.

> Now in an earlier discussion on the issue 'we' (I can't remember if you
> participated there, I remember Mel and Kosaki-San) agreed that for
> 'normal' (read not whacky real-time people) mlock can still be useful
> and we should introduce a pinned user API for the RT people.

Right. I remember that.

> > Pinned pages are pages that have an elevated refcount because the hardware
> > needs to use these pages for I/O. The elevated refcount may be temporary
> > (then we don't care about this) or for a longer time (such as the memory
> > registration of the IB subsystem). That is when we account the memory as
> > pinned. The elevated refcount stops page migration and other things from
> > trying to move that memory.
>
> Again I _know_ that!!!

But then you refuse to acknowledge the difference and want to conflate
the two.

> > Pages can be both pinned and mlocked.
>
> Right, but apart from mlockall() this is a highly unlikely situation to
> actually occur. And if you're using mlockall() you've effectively
> disabled RLIMIT_MEMLOCK and thus nobody cares if the resource counter
> goes funny.

mlockall() would never be used on all processes. You still need
RLIMIT_MEMLOCK to ensure that the box does not lock up.

> > I think we need to be first clear on what we want to accomplish and what
> > these counters actually should count before changing things.
>
> Backward isn't it... _you_ changed it without consideration.

I applied the categorization that we had agreed on before, during the
development of page migration. Pinning is not compatible with it.

> The IB code does a big get_user_pages(), which last time I checked
> pins a sequential range of pages. Therefore the VMA approach.

The IB code (and other code) can require the pinning of pages in various
ways.

