All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Oscar Salvador <osalvador@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Minchan Kim <minchan@kernel.org>, Jann Horn <jannh@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@intel.com>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@surriel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Richard Henderson <rth@twiddle.net>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Matt Turner <mattst88@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
	Helge Deller <deller@gmx.de>, Chris Zankel <chris@zankel.net>,
	Max Filippov <jcmvbkbc@gmail.com>,
	linux-alpha@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-parisc@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	linux-arch@vger.kernel.org
Subject: Re: [PATCH RFC] mm/madvise: introduce MADV_POPULATE to prefault/prealloc memory
Date: Fri, 19 Feb 2021 11:35:21 +0100	[thread overview]
Message-ID: <YC+UaTVUn0o4Zynz@dhcp22.suse.cz> (raw)
In-Reply-To: <20210217154844.12392-1-david@redhat.com>

On Wed 17-02-21 16:48:44, David Hildenbrand wrote:
[...]

I only got  to the implementation now.

> +static long madvise_populate(struct vm_area_struct *vma,
> +			     struct vm_area_struct **prev,
> +			     unsigned long start, unsigned long end)
> +{
> +	struct mm_struct *mm = vma->vm_mm;
> +	unsigned long tmp_end;
> +	int locked = 1;
> +	long pages;
> +
> +	*prev = vma;
> +
> +	while (start < end) {
> +		/*
> +		 * We might have temporarily dropped the lock. For example,
> +		 * our VMA might have been split.
> +		 */
> +		if (!vma || start >= vma->vm_end) {
> +			vma = find_vma(mm, start);
> +			if (!vma)
> +				return -ENOMEM;
> +		}

Why do you need to find a vma when you already have one. do_madvise will
give you your vma already. I do understand that you want to finish the
vma for some errors but that shouldn't require handling vmas. You should
be in the shope of one here unless I miss anything.

> +
> +		/* Bail out on incompatible VMA types. */
> +		if (vma->vm_flags & (VM_IO | VM_PFNMAP) ||
> +		    !vma_is_accessible(vma)) {
> +			return -EINVAL;
> +		}
> +
> +		/*
> +		 * Populate pages and take care of VM_LOCKED: simulate user
> +		 * space access.
> +		 *
> +		 * For private, writable mappings, trigger a write fault to
> +		 * break COW (i.e., shared zeropage). For other mappings (i.e.,
> +		 * read-only, shared), trigger a read fault.
> +		 */
> +		tmp_end = min_t(unsigned long, end, vma->vm_end);
> +		pages = populate_vma_page_range(vma, start, tmp_end, &locked);
> +		if (!locked) {
> +			mmap_read_lock(mm);
> +			*prev = NULL;
> +			vma = NULL;
> +		}
> +		if (pages < 0) {
> +			switch (pages) {
> +			case -EINTR:
> +			case -ENOMEM:
> +				return pages;
> +			case -EHWPOISON:
> +				/* Skip over any poisoned pages. */
> +				start += PAGE_SIZE;
> +				continue;
> +			case -EBUSY:
> +			case -EAGAIN:
> +				continue;
> +			default:
> +				pr_warn_once("%s: unhandled return value: %ld\n",
> +					     __func__, pages);
> +				return -ENOMEM;
> +			}
> +		}
> +		start += pages * PAGE_SIZE;
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Application wants to free up the pages and associated backing store.
>   * This is effectively punching a hole into the middle of a file.
> @@ -934,6 +1001,8 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
>  		return madvise_dontneed_free(vma, prev, start, end, behavior);
> +	case MADV_POPULATE:
> +		return madvise_populate(vma, prev, start, end);
>  	default:
>  		return madvise_behavior(vma, prev, start, end, behavior);
>  	}
> @@ -954,6 +1023,7 @@ madvise_behavior_valid(int behavior)
>  	case MADV_FREE:
>  	case MADV_COLD:
>  	case MADV_PAGEOUT:
> +	case MADV_POPULATE:
>  #ifdef CONFIG_KSM
>  	case MADV_MERGEABLE:
>  	case MADV_UNMERGEABLE:
> -- 
> 2.29.2
> 

-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@suse.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>, Oscar Salvador <osalvador@suse.de>,
	Matthew Wilcox <willy@infradead.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Minchan Kim <minchan@kernel.org>, Jann Horn <jannh@google.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Dave Hansen <dave.hansen@intel.com>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@surriel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Richard Henderson <rth@twiddle.net>,
	Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
	Matt Turner <mattst88@gmail.com>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	James E.J. Bottomle
Subject: Re: [PATCH RFC] mm/madvise: introduce MADV_POPULATE to prefault/prealloc memory
Date: Fri, 19 Feb 2021 11:35:21 +0100	[thread overview]
Message-ID: <YC+UaTVUn0o4Zynz@dhcp22.suse.cz> (raw)
In-Reply-To: <20210217154844.12392-1-david@redhat.com>

On Wed 17-02-21 16:48:44, David Hildenbrand wrote:
[...]

I only got  to the implementation now.

> +static long madvise_populate(struct vm_area_struct *vma,
> +			     struct vm_area_struct **prev,
> +			     unsigned long start, unsigned long end)
> +{
> +	struct mm_struct *mm = vma->vm_mm;
> +	unsigned long tmp_end;
> +	int locked = 1;
> +	long pages;
> +
> +	*prev = vma;
> +
> +	while (start < end) {
> +		/*
> +		 * We might have temporarily dropped the lock. For example,
> +		 * our VMA might have been split.
> +		 */
> +		if (!vma || start >= vma->vm_end) {
> +			vma = find_vma(mm, start);
> +			if (!vma)
> +				return -ENOMEM;
> +		}

Why do you need to find a vma when you already have one. do_madvise will
give you your vma already. I do understand that you want to finish the
vma for some errors but that shouldn't require handling vmas. You should
be in the shope of one here unless I miss anything.

> +
> +		/* Bail out on incompatible VMA types. */
> +		if (vma->vm_flags & (VM_IO | VM_PFNMAP) ||
> +		    !vma_is_accessible(vma)) {
> +			return -EINVAL;
> +		}
> +
> +		/*
> +		 * Populate pages and take care of VM_LOCKED: simulate user
> +		 * space access.
> +		 *
> +		 * For private, writable mappings, trigger a write fault to
> +		 * break COW (i.e., shared zeropage). For other mappings (i.e.,
> +		 * read-only, shared), trigger a read fault.
> +		 */
> +		tmp_end = min_t(unsigned long, end, vma->vm_end);
> +		pages = populate_vma_page_range(vma, start, tmp_end, &locked);
> +		if (!locked) {
> +			mmap_read_lock(mm);
> +			*prev = NULL;
> +			vma = NULL;
> +		}
> +		if (pages < 0) {
> +			switch (pages) {
> +			case -EINTR:
> +			case -ENOMEM:
> +				return pages;
> +			case -EHWPOISON:
> +				/* Skip over any poisoned pages. */
> +				start += PAGE_SIZE;
> +				continue;
> +			case -EBUSY:
> +			case -EAGAIN:
> +				continue;
> +			default:
> +				pr_warn_once("%s: unhandled return value: %ld\n",
> +					     __func__, pages);
> +				return -ENOMEM;
> +			}
> +		}
> +		start += pages * PAGE_SIZE;
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Application wants to free up the pages and associated backing store.
>   * This is effectively punching a hole into the middle of a file.
> @@ -934,6 +1001,8 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
>  		return madvise_dontneed_free(vma, prev, start, end, behavior);
> +	case MADV_POPULATE:
> +		return madvise_populate(vma, prev, start, end);
>  	default:
>  		return madvise_behavior(vma, prev, start, end, behavior);
>  	}
> @@ -954,6 +1023,7 @@ madvise_behavior_valid(int behavior)
>  	case MADV_FREE:
>  	case MADV_COLD:
>  	case MADV_PAGEOUT:
> +	case MADV_POPULATE:
>  #ifdef CONFIG_KSM
>  	case MADV_MERGEABLE:
>  	case MADV_UNMERGEABLE:
> -- 
> 2.29.2
> 

-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2021-02-19 10:36 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-17 15:48 [PATCH RFC] mm/madvise: introduce MADV_POPULATE to prefault/prealloc memory David Hildenbrand
2021-02-17 15:48 ` David Hildenbrand
2021-02-17 16:46 ` Dave Hansen
2021-02-17 16:46   ` Dave Hansen
2021-02-17 17:06   ` David Hildenbrand
2021-02-17 17:06     ` David Hildenbrand
2021-02-17 17:21 ` Vlastimil Babka
2021-02-17 17:21   ` Vlastimil Babka
2021-02-18 11:07   ` Rolf Eike Beer
2021-02-18 11:07     ` Rolf Eike Beer
2021-02-18 11:27     ` David Hildenbrand
2021-02-18 11:27       ` David Hildenbrand
2021-02-18 10:25 ` Michal Hocko
2021-02-18 10:25   ` Michal Hocko
2021-02-18 10:44   ` David Hildenbrand
2021-02-18 10:44     ` David Hildenbrand
2021-02-18 10:54     ` David Hildenbrand
2021-02-18 10:54       ` David Hildenbrand
2021-02-18 11:28       ` Michal Hocko
2021-02-18 11:28         ` Michal Hocko
2021-02-18 11:27     ` Michal Hocko
2021-02-18 11:27       ` Michal Hocko
2021-02-18 11:38       ` David Hildenbrand
2021-02-18 11:38         ` David Hildenbrand
2021-02-18 12:22 ` [PATCH RFC] madvise.2: Document MADV_POPULATE David Hildenbrand
2021-02-18 12:22   ` David Hildenbrand
2021-02-18 22:59 ` [PATCH RFC] mm/madvise: introduce MADV_POPULATE to prefault/prealloc memory Peter Xu
2021-02-18 22:59   ` Peter Xu
2021-02-19  8:20   ` David Hildenbrand
2021-02-19  8:20     ` David Hildenbrand
2021-02-19 16:31     ` Peter Xu
2021-02-19 16:31       ` Peter Xu
2021-02-19 17:13       ` David Hildenbrand
2021-02-19 17:13         ` David Hildenbrand
2021-02-19 19:14         ` David Hildenbrand
2021-02-19 19:14           ` David Hildenbrand
2021-02-19 19:25           ` Mike Kravetz
2021-02-19 19:25             ` Mike Kravetz
2021-02-20  9:01             ` David Hildenbrand
2021-02-20  9:01               ` David Hildenbrand
2021-02-19 19:23         ` Peter Xu
2021-02-19 19:23           ` Peter Xu
2021-02-19 20:04           ` David Hildenbrand
2021-02-19 20:04             ` David Hildenbrand
2021-02-22 12:46     ` Michal Hocko
2021-02-22 12:46       ` Michal Hocko
2021-02-22 12:52       ` David Hildenbrand
2021-02-22 12:52         ` David Hildenbrand
2021-02-19 10:35 ` Michal Hocko [this message]
2021-02-19 10:35   ` Michal Hocko
2021-02-19 10:43   ` David Hildenbrand
2021-02-19 10:43     ` David Hildenbrand
2021-02-19 11:04     ` Michal Hocko
2021-02-19 11:04       ` Michal Hocko
2021-02-19 11:10       ` David Hildenbrand
2021-02-19 11:10         ` David Hildenbrand
2021-02-20  9:12 ` David Hildenbrand
2021-02-20  9:12   ` David Hildenbrand
2021-02-22 12:56   ` Michal Hocko
2021-02-22 12:56     ` Michal Hocko
2021-02-22 12:59     ` David Hildenbrand
2021-02-22 12:59       ` David Hildenbrand
2021-02-22 13:19       ` Michal Hocko
2021-02-22 13:19         ` Michal Hocko
2021-02-22 13:22         ` David Hildenbrand
2021-02-22 13:22           ` David Hildenbrand
2021-02-22 14:02           ` Michal Hocko
2021-02-22 14:02             ` Michal Hocko
2021-02-22 15:30             ` David Hildenbrand
2021-02-22 15:30               ` David Hildenbrand
2021-02-24 14:25 ` David Hildenbrand
2021-02-24 14:25   ` David Hildenbrand
2021-02-24 14:38   ` David Hildenbrand
2021-02-24 14:38     ` David Hildenbrand
2021-02-25  8:41   ` David Hildenbrand
2021-02-25  8:41     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YC+UaTVUn0o4Zynz@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=chris@zankel.net \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=deller@gmx.de \
    --cc=hughd@google.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jannh@google.com \
    --cc=jcmvbkbc@gmail.com \
    --cc=jgg@ziepe.ca \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-xtensa@linux-xtensa.org \
    --cc=mattst88@gmail.com \
    --cc=minchan@kernel.org \
    --cc=mst@redhat.com \
    --cc=osalvador@suse.de \
    --cc=riel@surriel.com \
    --cc=rth@twiddle.net \
    --cc=tsbogend@alpha.franken.de \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.