linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] recognize MAP_LOCKED in mmap() call
@ 2002-09-18 19:18 Mark_H_Johnson
  2002-09-18 19:39 ` Rik van Riel
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Mark_H_Johnson @ 2002-09-18 19:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, owner-linux-mm


Andrew Morton wrote:
>(SuS really only anticipates that mmap needs to look at prior mlocks
>in force against the address range.  It also says
>
>     Process memory locking does apply to shared memory regions,
>
>and we don't do that either.  I think we should; can't see why SuS
>requires this.)

Let me make sure I read what you said correctly. Does this mean that Linux
2.4 (or 2.5) kernels do not lock shared memory regions if a process uses
mlockall?

If not, that is *really bad* for our real time applications. We don't want
to take a page fault while running some 80 Hz task, just because some
non-real time application tried to use what little physical memory we allow
for the kernel and all other applications.

I asked a related question about a week ago on linux-mm and didn't get a
response. Basically, I was concerned that top did not show RSS == Size when
mlockall(MCL_CURRENT|MCL_FUTURE) was called. Could this explain the
difference or is there something else that I'm missing here?

Thanks.
--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>
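
A rough way to check the RSS versus Size question from inside the process is to read its own entries in /proc/self/status. The sketch below assumes a Linux /proc that exposes VmSize, VmRSS and VmLck in kB; read_status_kb is a hypothetical helper name and does not come from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the value (in kB) of one field of
 * /proc/self/status, e.g. "VmSize", "VmRSS" or "VmLck".
 * Returns -1 if the file or the field is not available.
 */
long read_status_kb(const char *field)
{
    assert(field != NULL);

    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    long value = -1;
    size_t n = strlen(field);

    while (fgets(line, sizeof line, f) != NULL) {
        /* Lines look like "VmRSS:     1234 kB". */
        if (strncmp(line, field, n) == 0 && line[n] == ':') {
            if (sscanf(line + n + 1, "%ld", &value) != 1)
                value = -1;
            break;
        }
    }
    fclose(f);
    return value;
}
```

Calling read_status_kb("VmSize"), read_status_kb("VmRSS") and read_status_kb("VmLck") before and after mlockall() would show whether the resident and locked sizes actually reach the total size.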





* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-18 19:18 [PATCH] recognize MAP_LOCKED in mmap() call Mark_H_Johnson
@ 2002-09-18 19:39 ` Rik van Riel
  2002-09-18 19:54 ` Andrew Morton
  2002-09-25 15:36 ` Hubertus Franke
  2 siblings, 0 replies; 9+ messages in thread
From: Rik van Riel @ 2002-09-18 19:39 UTC (permalink / raw)
  To: Mark_H_Johnson; +Cc: Andrew Morton, linux-kernel, linux-mm, owner-linux-mm

On Wed, 18 Sep 2002 Mark_H_Johnson@raytheon.com wrote:
> Andrew Morton wrote:
> >(SuS really only anticipates that mmap needs to look at prior mlocks
> >in force against the address range.  It also says
> >
> >     Process memory locking does apply to shared memory regions,
> >
> >and we don't do that either.  I think we should; can't see why SuS
> >requires this.)
>
> Let me make sure I read what you said correctly. Does this mean that
> Linux 2.4 (or 2.5) kernels do not lock shared memory regions if a
> process uses mlockall?

But it does.  Linux won't evict memory that's MLOCKed...

cheers,

Rik
-- 
Spamtrap of the month: september@surriel.com

http://www.surriel.com/		http://distro.conectiva.com/



* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-18 19:18 [PATCH] recognize MAP_LOCKED in mmap() call Mark_H_Johnson
  2002-09-18 19:39 ` Rik van Riel
@ 2002-09-18 19:54 ` Andrew Morton
  2002-09-25 15:42   ` Hubertus Franke
  2002-09-25 15:36 ` Hubertus Franke
  2 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2002-09-18 19:54 UTC (permalink / raw)
  To: Mark_H_Johnson; +Cc: linux-kernel, linux-mm, owner-linux-mm

Mark_H_Johnson@raytheon.com wrote:
> 
> Andrew Morton wrote:
> >(SuS really only anticipates that mmap needs to look at prior mlocks
> >in force against the address range.  It also says
> >
> >     Process memory locking does apply to shared memory regions,
> >
> >and we don't do that either.  I think we should; can't see why SuS
> >requires this.)
> 
> Let me make sure I read what you said correctly. Does this mean that Linux
> 2.4 (or 2.5) kernels do not lock shared memory regions if a process uses
> mlockall?

Linux does lock these regions.  SuS seems to imply that we shouldn't.
But we should.

> If not, that is *really bad* for our real time applications. We don't want
> to take a page fault while running some 80 Hz task, just because some
> non-real time application tried to use what little physical memory we allow
> for the kernel and all other applications.
> 
> I asked a related question about a week ago on linux-mm and didn't get a
> response. Basically, I was concerned that top did not show RSS == Size when
> mlockall(MCL_CURRENT|MCL_FUTURE) was called. Could this explain the
> difference or is there something else that I'm missing here?
> 

That mlockall should have faulted everything in.  It could be an
accounting bug, or it could be a bug.  That's not an aspect which
gets tested a lot.  I'll take a look.


* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-18 19:18 [PATCH] recognize MAP_LOCKED in mmap() call Mark_H_Johnson
  2002-09-18 19:39 ` Rik van Riel
  2002-09-18 19:54 ` Andrew Morton
@ 2002-09-25 15:36 ` Hubertus Franke
  2 siblings, 0 replies; 9+ messages in thread
From: Hubertus Franke @ 2002-09-25 15:36 UTC (permalink / raw)
  To: Mark_H_Johnson, Andrew Morton; +Cc: linux-kernel, linux-mm, owner-linux-mm

On Wednesday 18 September 2002 03:18 pm, Mark_H_Johnson@raytheon.com wrote:
> Andrew Morton wrote:
> >(SuS really only anticipates that mmap needs to look at prior mlocks
> >in force against the address range.  It also says
> >
> >     Process memory locking does apply to shared memory regions,
> >
> >and we don't do that either.  I think we should; can't see why SuS
> >requires this.)
>
> Let me make sure I read what you said correctly. Does this mean that Linux
> 2.4 (or 2.5) kernels do not lock shared memory regions if a process uses
> mlockall?
>
> If not, that is *really bad* for our real time applications. We don't want
> to take a page fault while running some 80 Hz task, just because some
> non-real time application tried to use what little physical memory we allow
> for the kernel and all other applications.
>
> I asked a related question about a week ago on linux-mm and didn't get a
> response. Basically, I was concerned that top did not show RSS == Size when
> mlockall(MCL_CURRENT|MCL_FUTURE) was called. Could this explain the
> difference or is there something else that I'm missing here?
>
> Thanks.
> --Mark H Johnson
>   <mailto:Mark_H_Johnson@raytheon.com>


Sorry for the lengthy delay.
mlock() and mlockall() do the right thing.
However, according to the manpages, mmap(MAP_LOCKED) should behave like
an mmap() followed by an mlock(). This was not implemented: the
translation from the mmap flags to vm_flags never checked for MAP_LOCKED,
only for mm->def_flags, which covers only a previous mlockall() call.

Hope this clarifies it.
-- 
-- Hubertus Franke  (frankeh@watson.ibm.com)
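
From user space, the equivalence described above can be sketched as follows. This is a minimal illustration, not code from the patch: map_locked_region is a hypothetical helper that first asks mmap() to lock the mapping via MAP_LOCKED and, if that call fails, falls back to a plain mmap() followed by mlock(). On a kernel that ignores MAP_LOCKED (the situation the patch addresses), the first call would succeed without actually locking anything, so the `locked` result only reflects what the calls reported.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory and try to
 * lock it. Returns the mapping (or MAP_FAILED); *locked is set to 1 if
 * a locking call reported success, 0 otherwise.
 */
void *map_locked_region(size_t len, int *locked)
{
    assert(len > 0 && locked != NULL);
    *locked = 0;

    /* One-step form: ask mmap() itself to lock the new mapping. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
    if (p != MAP_FAILED) {
        *locked = 1;
        return p;
    }

    /* Two-step form: plain mmap() followed by mlock(). */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return MAP_FAILED;
    if (mlock(p, len) == 0)
        *locked = 1;
    return p;
}
```

Both forms are subject to the same privilege and limit checks, so a caller should be prepared for the lock to be refused even when the mapping itself succeeds.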


* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-18 19:54 ` Andrew Morton
@ 2002-09-25 15:42   ` Hubertus Franke
  2002-09-25 16:35     ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Hubertus Franke @ 2002-09-25 15:42 UTC (permalink / raw)
  To: Andrew Morton, Mark_H_Johnson; +Cc: linux-kernel, linux-mm, owner-linux-mm

On Wednesday 18 September 2002 03:54 pm, Andrew Morton wrote:
> Mark_H_Johnson@raytheon.com wrote:
> > Andrew Morton wrote:
> > >(SuS really only anticipates that mmap needs to look at prior mlocks
> > >in force against the address range.  It also says
> > >
> > >     Process memory locking does apply to shared memory regions,
> > >
> > >and we don't do that either.  I think we should; can't see why SuS
> > >requires this.)
> >
> > Let me make sure I read what you said correctly. Does this mean that
> > Linux 2.4 (or 2.5) kernels do not lock shared memory regions if a process
> > uses mlockall?
>
> Linux does lock these regions.  SuS seems to imply that we shouldn't.
> But we should.
>
> > If not, that is *really bad* for our real time applications. We don't
> > want to take a page fault while running some 80 Hz task, just because some
> > non-real time application tried to use what little physical memory we
> > allow for the kernel and all other applications.
> >
> > I asked a related question about a week ago on linux-mm and didn't get a
> > response. Basically, I was concerned that top did not show RSS == Size
> > when mlockall(MCL_CURRENT|MCL_FUTURE) was called. Could this explain the
> > difference or is there something else that I'm missing here?
>
> That mlockall should have faulted everything in.  It could be an
> accounting bug, or it could be a bug.  That's not an aspect which
> gets tested a lot.  I'll take a look.


This is what the manpage says...

       mlockall  disables  paging  for  all pages mapped into the
       address space of the calling process.  This  includes  the
       pages  of  the  code,  data  and stack segment, as well as
       shared libraries, user space kernel  data,  shared  memory
       and  memory  mapped files. All mapped pages are guaranteed
       to be resident  in  RAM  when  the  mlockall  system  call
       returns  successfully  and  they are guaranteed to stay in
       RAM until the pages  are  unlocked  again  by  munlock  or
       munlockall  or  until  the  process  terminates  or starts
       another program with exec.  Child processes do not inherit
       page locks across a fork.

Do you read that all pages must be faulted in a priori?
Or is it sufficient to make sure none of the currently mapped
pages are swapped out and future swapout is prohibited?

This still allows for page faults on pages that have not been
mapped in the specified range or process. If required the 
app could touch these and they wouldn't be swapped later.


-- 
-- Hubertus Franke  (frankeh@watson.ibm.com)


* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-25 15:42   ` Hubertus Franke
@ 2002-09-25 16:35     ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2002-09-25 16:35 UTC (permalink / raw)
  To: frankeh; +Cc: Mark_H_Johnson, linux-kernel, linux-mm

Hubertus Franke wrote:
> 
> ...
> This is what the manpage says...
> 
>        mlockall  disables  paging  for  all pages mapped into the
>        address space of the calling process.  This  includes  the
>        pages  of  the  code,  data  and stack segment, as well as
>        shared libraries, user space kernel  data,  shared  memory
>        and  memory  mapped files. All mapped pages are guaranteed
>        to be resident  in  RAM  when  the  mlockall  system  call
>        returns  successfully  and  they are guaranteed to stay in
>        RAM until the pages  are  unlocked  again  by  munlock  or
>        munlockall  or  until  the  process  terminates  or starts
>        another program with exec.  Child processes do not inherit
>        page locks across a fork.
> 
> Do you read that all pages must be faulted in a priori?

For MCL_FUTURE.

> Or is it sufficient to make sure none of the currently mapped
> pages are swapped out and future swapout is prohibited?

I'd say that we should try to make all the pages present.  But
if it's a problem for (say) a hugepage implementation then it's
unlikely that the world would end if these things were still
demand paged in.


* Re: [PATCH] recognize MAP_LOCKED in mmap() call
@ 2002-09-25 16:57 Mark_H_Johnson
  0 siblings, 0 replies; 9+ messages in thread
From: Mark_H_Johnson @ 2002-09-25 16:57 UTC (permalink / raw)
  To: frankeh; +Cc: Andrew Morton, linux-kernel, linux-mm


>This is what the manpage says...
>
>       mlockall  disables  paging  for  all pages mapped into the
>       address space of the calling process.  This  includes  the
>       pages  of  the  code,  data  and stack segment, as well as
>       shared libraries, user space kernel  data,  shared  memory
>       and  memory  mapped files. All mapped pages are guaranteed
>       to be resident  in  RAM  when  the  mlockall  system  call
>       returns  successfully  and  they are guaranteed to stay in
>       RAM until the pages  are  unlocked  again  by  munlock  or
>       munlockall  or  until  the  process  terminates  or starts
>       another program with exec.  Child processes do not inherit
>       page locks across a fork.
>
>Do you read that all pages must be faulted in a priori?
>Or is it sufficient to make sure none of the currently mapped
>pages are swapped out and future swapout is prohibited?
>
The key phrase is that "...all mapped pages are guaranteed to be resident
in RAM when the mlockall system call returns successfully..." (third
sentence) In that way I would expect the segments containing the code,
heap, and current stack allocations to be resident. I do not expect
the full stack allocation (e.g., 2M for each thread if that is the
stack size) to be mapped (nor resident) unless I take special action
to grow the stack that large.

We happen to have special code to grow each stack and allocate heap
variables to account for what we expect to use prior to mlockall.

That does raise a question though - are there other segments (e.g.,
debug information) that may be in the total size calculations that
are mapped only when some special action is taken (e.g., I run the
debugger)? That would explain the difference - the measures I reported
on were with executables built with debug symbols.

That might also explain a possible problem we have had when trying
to debug such an application after an hour of run time or so. If
running gdb triggers a growth in locked memory (and we don't have
enough) - we would likely get an error condition that isn't normally
expected by gdb.

>This still allows for page faults on pages that have not been
>mapped in the specified range or process. If required the
>app could touch these and they wouldn't be swapped later.
>

I don't think touching the pages is enough - they have to be allocated
and the maps generated (e.g., calls to mmap, malloc). That is a possibly
expensive operation when real time is active and something we try to
avoid whenever possible.

--
--Mark H Johnson
  <mailto:Mark_H_Johnson@raytheon.com>
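
The approach described above (grow the stack and allocate the heap first, then lock) might look roughly like the sketch below. The names prepare_realtime_memory, prefault_stack and touch_pages, and the 256 KiB STACK_RESERVE figure, are assumptions made for illustration; they are not taken from the application discussed in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define STACK_RESERVE (256 * 1024)  /* assumed worst-case stack use, in bytes */

/* Write one byte per page so every page in [buf, buf + len) gets mapped. */
static void touch_pages(volatile char *buf, size_t len, size_t page)
{
    for (size_t off = 0; off < len; off += page)
        buf[off] = 0;
}

/* Grow the calling thread's stack by touching a large local buffer. */
static void prefault_stack(size_t page)
{
    volatile char buf[STACK_RESERVE];
    touch_pages(buf, sizeof buf, page);
}

/*
 * Hypothetical setup routine: pre-grow the stack, allocate and touch the
 * heap the task expects to use, then lock current and future mappings.
 * Returns the mlockall() result (0 on success, -1 with errno set).
 */
int prepare_realtime_memory(size_t heap_bytes, void **heap_out)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && heap_out != NULL);

    prefault_stack((size_t)page);

    *heap_out = malloc(heap_bytes);
    if (*heap_out == NULL)
        return -1;
    touch_pages(*heap_out, heap_bytes, (size_t)page);

    return mlockall(MCL_CURRENT | MCL_FUTURE);
}
```

The heap block is handed back to the caller rather than freed, since releasing it could unmap the pages that were just faulted in. All of this runs once at start-up, so the cost of mapping and faulting is paid before the real-time task begins.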




* Re: [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-18 16:07     ` [PATCH] recognize MAP_LOCKED in mmap() call Hubertus Franke
@ 2002-09-18 16:29       ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2002-09-18 16:29 UTC (permalink / raw)
  To: frankeh; +Cc: linux-kernel, linux-mm

Hubertus Franke wrote:
> 
> Andrew, at the current time an mmap() ignores a MAP_LOCKED passed to it.
> The only way we can get VM_LOCKED associated with the newly created VMA
> is to have previously called mlockall() on the process, which sets
> mm->def_flags |= VM_LOCKED, or to subsequently call mlock() on the
> newly created VMA.
> 
> The attached patch checks for MAP_LOCKED being passed and if so checks
> the capabilities of the process. Limit checks were already in place.

Looks sane, thanks.

It appears that MAP_LOCKED is a Linux-special, so presumably it
_used_ to work.  I wonder when it broke?

Your patch applies to 2.4 as well; it would be useful to give that
a sanity test and send a copy to Marcelo.

(SuS really only anticipates that mmap needs to look at prior mlocks
in force against the address range.  It also says

     Process memory locking does apply to shared memory regions,

and we don't do that either.  I think we should; can't see why SuS
requires this.)


* [PATCH] recognize MAP_LOCKED in mmap() call
  2002-09-13 21:30   ` William Lee Irwin III
@ 2002-09-18 16:07     ` Hubertus Franke
  2002-09-18 16:29       ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Hubertus Franke @ 2002-09-18 16:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm

[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]


Andrew, at the current time an mmap() ignores a MAP_LOCKED passed to it.
The only way we can get VM_LOCKED associated with the newly created VMA
is to have previously called mlockall() on the process, which sets
mm->def_flags |= VM_LOCKED, or to subsequently call mlock() on the
newly created VMA.

The attached patch checks for MAP_LOCKED being passed and if so checks
the capabilities of the process. Limit checks were already in place.
-- 
-- Hubertus Franke  (frankeh@watson.ibm.com)

--------------------------------< PATCH >------------------------------
--- linux-2.5.35/mm/mmap.c	Wed Sep 18 11:12:13 2002
+++ linux-2.5.35-fix/mm/mmap.c	Wed Sep 18 11:44:32 2002
@@ -461,6 +461,11 @@
 	 */
 	vm_flags = calc_vm_flags(prot,flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
 
+	if (flags & MAP_LOCKED) {
+		if (!capable(CAP_IPC_LOCK))
+			return -EPERM;
+		vm_flags |= VM_LOCKED;
+	}
 	/* mlock MCL_FUTURE? */
 	if (vm_flags & VM_LOCKED) {
 		unsigned long locked = mm->locked_vm << PAGE_SHIFT;
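
The decision the hunk above adds can be modelled outside the kernel to make the three outcomes explicit. This is only a sketch: MODEL_MAP_LOCKED and MODEL_VM_LOCKED are stand-in values, has_ipc_lock stands in for capable(CAP_IPC_LOCK), and model_vm_flags is a hypothetical name; none of these exist in the kernel source.

```c
#include <assert.h>
#include <errno.h>

enum { MODEL_MAP_LOCKED = 0x1 };  /* stand-in for MAP_LOCKED */
enum { MODEL_VM_LOCKED  = 0x2 };  /* stand-in for VM_LOCKED  */

/*
 * Model of the patched flag computation.
 *   map_flags    - flags passed to mmap()
 *   def_flags    - mm->def_flags, which carries the locked bit after a
 *                  prior mlockall(MCL_FUTURE)
 *   has_ipc_lock - 1 if the caller holds the locking capability, else 0
 * Returns the resulting vm_flags, or -EPERM if locking is not permitted.
 */
long model_vm_flags(unsigned long map_flags, unsigned long def_flags,
                    int has_ipc_lock)
{
    assert(has_ipc_lock == 0 || has_ipc_lock == 1);

    unsigned long vm_flags = def_flags;

    if (map_flags & MODEL_MAP_LOCKED) {
        if (!has_ipc_lock)
            return -EPERM;
        vm_flags |= MODEL_VM_LOCKED;
    }
    return (long)vm_flags;
}
```

In this model an unprivileged caller that passes the locked flag gets -EPERM, a privileged caller gets the locked bit set, and a mapping made after mlockall(MCL_FUTURE) stays locked through def_flags whether or not the flag is passed.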






