Linux-mm Archive on lore.kernel.org
From: Hillf Danton <hdanton@sina.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tim Murray <timmurray@google.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Daniel Colascione <dancol@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>
Subject: Re: [RFC 1/7] mm: introduce MADV_COOL
Date: Tue, 28 May 2019 20:15:23 +0800
Message-ID: <20190528121523.8764-1-hdanton@sina.com> (raw)


On Tue, 28 May 2019 18:58:15 +0800 Minchan Kim wrote:
> On Tue, May 28, 2019 at 04:53:01PM +0800, Hillf Danton wrote:
> >
> > On Mon, 20 May 2019 12:52:48 +0900 Minchan Kim wrote:
> > > +static int madvise_cool_pte_range(pmd_t *pmd, unsigned long addr,
> > > +				unsigned long end, struct mm_walk *walk)
> > > +{
> > > +	pte_t *orig_pte, *pte, ptent;
> > > +	spinlock_t *ptl;
> > > +	struct page *page;
> > > +	struct vm_area_struct *vma = walk->vma;
> > > +	unsigned long next;
> > > +
> > > +	next = pmd_addr_end(addr, end);
> > > +	if (pmd_trans_huge(*pmd)) {
> > > +		spinlock_t *ptl;
> >
> > Seems not needed with another ptl declared above.
>
> Will remove it.
>
> > > +
> > > +		ptl = pmd_trans_huge_lock(pmd, vma);
> > > +		if (!ptl)
> > > +			return 0;
> > > +
> > > +		if (is_huge_zero_pmd(*pmd))
> > > +			goto huge_unlock;
> > > +
> > > +		page = pmd_page(*pmd);
> > > +		if (page_mapcount(page) > 1)
> > > +			goto huge_unlock;
> > > +
> > > +		if (next - addr != HPAGE_PMD_SIZE) {
> > > +			int err;
> >
> > Alternatively, we could deactivate the THP only if the address range from
> > userspace is sane enough, to avoid the complex work we have to do here.
>
> Not sure it's a good idea. That's the way we have done it in MADV_FREE,
> so I want to be consistent.
>
Fair.

> > > +
> > > +			get_page(page);
> > > +			spin_unlock(ptl);
> > > +			lock_page(page);
> > > +			err = split_huge_page(page);
> > > +			unlock_page(page);
> > > +			put_page(page);
> > > +			if (!err)
> > > +				goto regular_page;
> > > +			return 0;
> > > +		}
> > > +
> > > +		pmdp_test_and_clear_young(vma, addr, pmd);
> > > +		deactivate_page(page);
> > > +huge_unlock:
> > > +		spin_unlock(ptl);
> > > +		return 0;
> > > +	}
> > > +
> > > +	if (pmd_trans_unstable(pmd))
> > > +		return 0;
> > > +
> > > +regular_page:
> >
> > Take a look at pending signal?
>
> Do you have any reason to check for a pending signal here? I want to know
> your requirement so we can find the better place to handle it.
>
IMO we could bail out without doing any work if there is a fatal signal
pending. And we can do that, if it makes sense to you, before the hard work.
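
To illustrate the early bail-out, here is a minimal userspace sketch. It is
not kernel code: struct task_model and fatal_signal_pending_model() are
hypothetical stand-ins for current and fatal_signal_pending(), and
returning -EINTR is an assumption about what the callback would report.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task state the kernel keeps in "current". */
struct task_model {
	bool fatal_signal;
};

/* Hypothetical stand-in for the kernel's fatal_signal_pending(current). */
static bool fatal_signal_pending_model(const struct task_model *t)
{
	return t->fatal_signal;
}

/*
 * Model of a per-PMD callback: check for a fatal signal before the
 * per-page loop, so an exiting task does no work at all.
 * Returns 0 on success or -EINTR; *visited counts the pages handled.
 */
static int cool_pte_range_model(const struct task_model *t,
				unsigned long addr, unsigned long end,
				unsigned long page_size, size_t *visited)
{
	assert(page_size != 0);	/* a zero step would never terminate */

	if (fatal_signal_pending_model(t))
		return -EINTR;

	for (; addr < end; addr += page_size)
		(*visited)++;

	return 0;
}
```

In the real walker, a non-zero return from the callback would also stop the
walk over the remaining ranges, which is the point of checking up front.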

> >
> > > +	orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > > +	for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
> >
> > s/end/next/ ?
>
> Why do you think it should be next?
>
Simply based on the following line; I am afraid that next != end
	> > > +	next = pmd_addr_end(addr, end);
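
For reference, a userspace sketch of the clamping that pmd_addr_end() does.
The *_MODEL names are made up for this example, and the 2 MiB PMD size
assumes x86-64 with 4 KiB pages. If the page walker already hands the
callback an end clamped to the PMD boundary (walk_pmd_range() appears to do
so), then next == end and the two loop bounds are interchangeable.

```c
#include <assert.h>

/* Assumed values for x86-64 with 4 KiB pages; other configs differ. */
#define PMD_SHIFT_MODEL	21
#define PMD_SIZE_MODEL	(1UL << PMD_SHIFT_MODEL)
#define PMD_MASK_MODEL	(~(PMD_SIZE_MODEL - 1))

/*
 * Userspace copy of the generic pmd_addr_end() logic: return the next
 * PMD boundary above addr, or end if end comes first. The "- 1" on both
 * sides keeps the comparison correct if the boundary wraps around to 0.
 */
static unsigned long pmd_addr_end_model(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE_MODEL) & PMD_MASK_MODEL;

	assert(addr < end);
	return (boundary - 1 < end - 1) ? boundary : end;
}
```

With a range that spans two PMDs, such as [0x200000, 0x600000), the result
is the boundary 0x400000 rather than end, which is the case where looping
up to end would walk past the PTE page mapped by this pmd.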

> > > +		ptent = *pte;
> > > +
> > > +		if (pte_none(ptent))
> > > +			continue;
> > > +
> > > +		if (!pte_present(ptent))
> > > +			continue;
> > > +
> > > +		page = vm_normal_page(vma, addr, ptent);
> > > +		if (!page)
> > > +			continue;
> > > +
> > > +		if (page_mapcount(page) > 1)
> > > +			continue;
> > > +
> > > +		ptep_test_and_clear_young(vma, addr, pte);
> > > +		deactivate_page(page);
> > > +	}
> > > +
> > > +	pte_unmap_unlock(orig_pte, ptl);
> > > +	cond_resched();
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static long madvise_cool(struct vm_area_struct *vma,
> > > +			unsigned long start_addr, unsigned long end_addr)
> > > +{
> > > +	struct mm_struct *mm = vma->vm_mm;
> > > +	struct mmu_gather tlb;
> > > +
> > > +	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > > +		return -EINVAL;
> >
> > No service in case of VM_IO?
>
> I don't know whether VM_IO would have regular LRU pages; I just followed
> the normal convention for DONTNEED and FREE.
> Do you have anything in mind?
>
I want to skip a mapping set up for DMA.
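
A minimal userspace sketch of what the extended check could look like. The
*_MODEL flag values are assumed to mirror include/linux/mm.h of this era
and are illustrative only; madvise_cool_check_model() is a hypothetical
helper, not part of the patch.

```c
#include <assert.h>
#include <errno.h>

/* Flag values assumed from include/linux/mm.h; illustrative only. */
#define VM_PFNMAP_MODEL		0x00000400UL
#define VM_LOCKED_MODEL		0x00002000UL
#define VM_IO_MODEL		0x00004000UL
#define VM_HUGETLB_MODEL	0x00400000UL

/* VM_IO must be its own bit, otherwise adding it to the mask changes nothing. */
static_assert((VM_IO_MODEL & (VM_PFNMAP_MODEL | VM_LOCKED_MODEL |
			      VM_HUGETLB_MODEL)) == 0,
	      "VM_IO overlaps another flag");

/*
 * Same test as in madvise_cool(), with VM_IO added to the mask so that
 * mappings set up for device I/O are rejected as well.
 */
static long madvise_cool_check_model(unsigned long vm_flags)
{
	const unsigned long unsupported = VM_LOCKED_MODEL | VM_HUGETLB_MODEL |
					  VM_PFNMAP_MODEL | VM_IO_MODEL;

	if (vm_flags & unsupported)
		return -EINVAL;

	return 0;
}
```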

BR
Hillf

