All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	mtk.manpages@gmail.com, linux-man@vger.kernel.org,
	Rik van Riel <riel@redhat.com>
Subject: Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
Date: Wed, 4 Feb 2015 09:09:21 +0900	[thread overview]
Message-ID: <20150204000921.GC3583@blaptop> (raw)
In-Reply-To: <54D0B43D.8000209@suse.cz>

On Tue, Feb 03, 2015 at 12:42:53PM +0100, Vlastimil Babka wrote:
> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> >> [CC linux-api, man pages]
> >> 
> >> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> >> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> >> > 
> >> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> >> > called:
> >> > 
> >> >>      MADV_DONTNEED
> >> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> >> > 
> >> > So if we have anything depending on the behavior that it's _always_
> >> > zero-filled after an MADV_DONTNEED, this will break it.
> >> 
> >> OK, so that's a third person (including me) who understood it as a zero-fill
> >> guarantee. I think the man page should be clarified (if it's indeed not
> >> guaranteed), or we have a bug.
> >> 
> >> The implementation actually skips MADV_DONTNEED for
> >> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> > 
> > It doesn't skip. It fails with -EINVAL. Or I miss something.
> 
> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
> this case:
> 
> *  The application is attempting to release locked or shared pages (with
> MADV_DONTNEED).
> 
> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
> though. I dont see any check for other kinds of shared pages in the code.
> 
> >> - The word "will result" did sound as a guarantee at least to me. So here it
> >> could be changed to "may result (unless the advice is ignored)"?
> > 
> > It's too late to fix documentation. Applications already depends on the
> > beheviour.
> 
> Right, so as long as they check for EINVAL, it should be safe. It appears that
> jemalloc does.
> 
> I still wouldnt be sure just by reading the man page that the clearing is
> guaranteed whenever I dont get an error return value, though,
> 

IMHO,

Man page said
"MADV_DONTNEED: Subsequent accesses of pages in this range will succeed,
 but will result either in reloading of  the memory contents from the
 underlying mapped file (see mmap(2)) or  zero-fill-on-demand pages
 for mappings without an underlying file."

Heap by allocated by malloc(3) is anonymous page so it's a mapping
withtout an underlying file so userspace can expect zero-fill.

Man page said
"EINVAL: The application is attempting to release locked or
shared pages (with MADV_DONTNEED)"

So, user can expect the call on area by allocated by malloc(3)
if he doesn't call mlock will always be successful.

Man page said
"madivse: This call does not influence the semantics of the application
(except in the case of MADV_DONTNEED)"

So, we shouldn't break MADV_DONTNEED's semantic which free pages
instantly. It's a long time semantic and it was one of arguable issues
on MADV_FREE Rik had tried long time ago to replace MADV_DONTNEED
with MADV_FREE.

-- 
Kind regards,
Minchan Kim

WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
Cc: "Kirill A. Shutemov"
	<kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>,
	Dave Hansen <dave.hansen-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	linux-man-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
Date: Wed, 4 Feb 2015 09:09:21 +0900	[thread overview]
Message-ID: <20150204000921.GC3583@blaptop> (raw)
In-Reply-To: <54D0B43D.8000209-AlSwsSmVLrQ@public.gmane.org>

On Tue, Feb 03, 2015 at 12:42:53PM +0100, Vlastimil Babka wrote:
> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> >> [CC linux-api, man pages]
> >> 
> >> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> >> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> >> > 
> >> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> >> > called:
> >> > 
> >> >>      MADV_DONTNEED
> >> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> >> > 
> >> > So if we have anything depending on the behavior that it's _always_
> >> > zero-filled after an MADV_DONTNEED, this will break it.
> >> 
> >> OK, so that's a third person (including me) who understood it as a zero-fill
> >> guarantee. I think the man page should be clarified (if it's indeed not
> >> guaranteed), or we have a bug.
> >> 
> >> The implementation actually skips MADV_DONTNEED for
> >> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> > 
> > It doesn't skip. It fails with -EINVAL. Or I miss something.
> 
> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
> this case:
> 
> *  The application is attempting to release locked or shared pages (with
> MADV_DONTNEED).
> 
> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
> though. I dont see any check for other kinds of shared pages in the code.
> 
> >> - The word "will result" did sound as a guarantee at least to me. So here it
> >> could be changed to "may result (unless the advice is ignored)"?
> > 
> > It's too late to fix documentation. Applications already depends on the
> > beheviour.
> 
> Right, so as long as they check for EINVAL, it should be safe. It appears that
> jemalloc does.
> 
> I still wouldnt be sure just by reading the man page that the clearing is
> guaranteed whenever I dont get an error return value, though,
> 

IMHO,

Man page said
"MADV_DONTNEED: Subsequent accesses of pages in this range will succeed,
 but will result either in reloading of  the memory contents from the
 underlying mapped file (see mmap(2)) or  zero-fill-on-demand pages
 for mappings without an underlying file."

Heap by allocated by malloc(3) is anonymous page so it's a mapping
withtout an underlying file so userspace can expect zero-fill.

Man page said
"EINVAL: The application is attempting to release locked or
shared pages (with MADV_DONTNEED)"

So, user can expect the call on area by allocated by malloc(3)
if he doesn't call mlock will always be successful.

Man page said
"madivse: This call does not influence the semantics of the application
(except in the case of MADV_DONTNEED)"

So, we shouldn't break MADV_DONTNEED's semantic which free pages
instantly. It's a long time semantic and it was one of arguable issues
on MADV_FREE Rik had tried long time ago to replace MADV_DONTNEED
with MADV_FREE.

-- 
Kind regards,
Minchan Kim

WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Dave Hansen <dave.hansen@intel.com>, Mel Gorman <mgorman@suse.de>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
	mtk.manpages@gmail.com, linux-man@vger.kernel.org,
	Rik van Riel <riel@redhat.com>
Subject: Re: MADV_DONTNEED semantics? Was: [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints
Date: Wed, 4 Feb 2015 09:09:21 +0900	[thread overview]
Message-ID: <20150204000921.GC3583@blaptop> (raw)
In-Reply-To: <54D0B43D.8000209@suse.cz>

On Tue, Feb 03, 2015 at 12:42:53PM +0100, Vlastimil Babka wrote:
> On 02/03/2015 11:53 AM, Kirill A. Shutemov wrote:
> > On Tue, Feb 03, 2015 at 09:19:15AM +0100, Vlastimil Babka wrote:
> >> [CC linux-api, man pages]
> >> 
> >> On 02/02/2015 11:22 PM, Dave Hansen wrote:
> >> > On 02/02/2015 08:55 AM, Mel Gorman wrote:
> >> >> This patch identifies when a thread is frequently calling MADV_DONTNEED
> >> >> on the same region of memory and starts ignoring the hint. On an 8-core
> >> >> single-socket machine this was the impact on ebizzy using glibc 2.19.
> >> > 
> >> > The manpage, at least, claims that we zero-fill after MADV_DONTNEED is
> >> > called:
> >> > 
> >> >>      MADV_DONTNEED
> >> >>               Do  not  expect  access in the near future.  (For the time being, the application is finished with the given range, so the kernel can free resources
> >> >>               associated with it.)  Subsequent accesses of pages in this range will succeed, but will result either in reloading of the memory contents  from  the
> >> >>               underlying mapped file (see mmap(2)) or zero-fill-on-demand pages for mappings without an underlying file.
> >> > 
> >> > So if we have anything depending on the behavior that it's _always_
> >> > zero-filled after an MADV_DONTNEED, this will break it.
> >> 
> >> OK, so that's a third person (including me) who understood it as a zero-fill
> >> guarantee. I think the man page should be clarified (if it's indeed not
> >> guaranteed), or we have a bug.
> >> 
> >> The implementation actually skips MADV_DONTNEED for
> >> VM_LOCKED|VM_HUGETLB|VM_PFNMAP vma's.
> > 
> > It doesn't skip. It fails with -EINVAL. Or I miss something.
> 
> No, I missed that. Thanks for pointing out. The manpage also explains EINVAL in
> this case:
> 
> *  The application is attempting to release locked or shared pages (with
> MADV_DONTNEED).
> 
> - that covers mlocking ok, not sure if the rest fits the "shared pages" case
> though. I dont see any check for other kinds of shared pages in the code.
> 
> >> - The word "will result" did sound as a guarantee at least to me. So here it
> >> could be changed to "may result (unless the advice is ignored)"?
> > 
> > It's too late to fix documentation. Applications already depends on the
> > beheviour.
> 
> Right, so as long as they check for EINVAL, it should be safe. It appears that
> jemalloc does.
> 
> I still wouldnt be sure just by reading the man page that the clearing is
> guaranteed whenever I dont get an error return value, though,
> 

IMHO,

Man page said
"MADV_DONTNEED: Subsequent accesses of pages in this range will succeed,
 but will result either in reloading of  the memory contents from the
 underlying mapped file (see mmap(2)) or  zero-fill-on-demand pages
 for mappings without an underlying file."

Heap by allocated by malloc(3) is anonymous page so it's a mapping
withtout an underlying file so userspace can expect zero-fill.

Man page said
"EINVAL: The application is attempting to release locked or
shared pages (with MADV_DONTNEED)"

So, user can expect the call on area by allocated by malloc(3)
if he doesn't call mlock will always be successful.

Man page said
"madivse: This call does not influence the semantics of the application
(except in the case of MADV_DONTNEED)"

So, we shouldn't break MADV_DONTNEED's semantic which free pages
instantly. It's a long time semantic and it was one of arguable issues
on MADV_FREE Rik had tried long time ago to replace MADV_DONTNEED
with MADV_FREE.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-02-04  0:09 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-02 16:55 [RFC PATCH] mm: madvise: Ignore repeated MADV_DONTNEED hints Mel Gorman
2015-02-02 16:55 ` Mel Gorman
2015-02-02 22:05 ` Andrew Morton
2015-02-02 22:05   ` Andrew Morton
2015-02-02 22:18   ` Mel Gorman
2015-02-02 22:18     ` Mel Gorman
2015-02-02 22:35     ` Andrew Morton
2015-02-02 22:35       ` Andrew Morton
2015-02-03  0:26       ` Davidlohr Bueso
2015-02-03  0:26         ` Davidlohr Bueso
2015-02-03 10:50       ` Mel Gorman
2015-02-03 10:50         ` Mel Gorman
2015-02-05 21:44     ` Rik van Riel
2015-02-05 21:44       ` Rik van Riel
2015-02-02 22:22 ` Dave Hansen
2015-02-02 22:22   ` Dave Hansen
2015-02-03  8:19   ` MADV_DONTNEED semantics? Was: " Vlastimil Babka
2015-02-03  8:19     ` Vlastimil Babka
2015-02-03 10:53     ` Kirill A. Shutemov
2015-02-03 10:53       ` Kirill A. Shutemov
2015-02-03 10:53       ` Kirill A. Shutemov
2015-02-03 11:42       ` Vlastimil Babka
2015-02-03 11:42         ` Vlastimil Babka
2015-02-03 16:20         ` Michael Kerrisk (man-pages)
2015-02-03 16:20           ` Michael Kerrisk (man-pages)
2015-02-04 13:46           ` Vlastimil Babka
2015-02-04 13:46             ` Vlastimil Babka
2015-02-04 13:46             ` Vlastimil Babka
2015-02-04 14:00             ` Michael Kerrisk (man-pages)
2015-02-04 14:00               ` Michael Kerrisk (man-pages)
2015-02-04 14:00               ` Michael Kerrisk (man-pages)
2015-02-04 17:02               ` Vlastimil Babka
2015-02-04 17:02                 ` Vlastimil Babka
2015-02-04 19:24                 ` Michael Kerrisk (man-pages)
2015-02-04 19:24                   ` Michael Kerrisk (man-pages)
2015-02-04 19:24                   ` Michael Kerrisk (man-pages)
2015-02-05  1:07                   ` Minchan Kim
2015-02-05  1:07                     ` Minchan Kim
2015-02-05  1:07                     ` Minchan Kim
2015-02-06 15:41                     ` Michael Kerrisk (man-pages)
2015-02-06 15:41                       ` Michael Kerrisk (man-pages)
2015-02-06 15:41                       ` Michael Kerrisk (man-pages)
2015-02-09  6:46                       ` Minchan Kim
2015-02-09  6:46                         ` Minchan Kim
2015-02-09  6:46                         ` Minchan Kim
2015-02-09  9:13                         ` Michael Kerrisk (man-pages)
2015-02-09  9:13                           ` Michael Kerrisk (man-pages)
2015-02-05 15:41                   ` Michal Hocko
2015-02-05 15:41                     ` Michal Hocko
2015-02-05 15:41                     ` Michal Hocko
2015-02-06 15:57                     ` Michael Kerrisk (man-pages)
2015-02-06 15:57                       ` Michael Kerrisk (man-pages)
2015-02-06 15:57                       ` Michael Kerrisk (man-pages)
2015-02-06 20:45                       ` Michal Hocko
2015-02-06 20:45                         ` Michal Hocko
2015-02-06 20:45                         ` Michal Hocko
2015-02-09  6:50                       ` Minchan Kim
2015-02-09  6:50                         ` Minchan Kim
2015-02-09  6:50                         ` Minchan Kim
2015-02-04  0:09         ` Minchan Kim [this message]
2015-02-04  0:09           ` Minchan Kim
2015-02-04  0:09           ` Minchan Kim
2015-02-03 11:16     ` Mel Gorman
2015-02-03 11:16       ` Mel Gorman
2015-02-03 15:21       ` Michal Hocko
2015-02-03 15:21         ` Michal Hocko
2015-02-03 15:21         ` Michal Hocko
2015-02-03 16:25         ` Michael Kerrisk (man-pages)
2015-02-03 16:25           ` Michael Kerrisk (man-pages)
2015-02-03 16:25           ` Michael Kerrisk (man-pages)
2015-02-03  9:47   ` Mel Gorman
2015-02-03  9:47     ` Mel Gorman
2015-02-03 10:47     ` Kirill A. Shutemov
2015-02-03 10:47       ` Kirill A. Shutemov
2015-02-03 11:21       ` Mel Gorman
2015-02-03 11:21         ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150204000921.GC3583@blaptop \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-man@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mtk.manpages@gmail.com \
    --cc=riel@redhat.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.