From: Hillf Danton <hdanton@sina.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tim Murray <timmurray@google.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Daniel Colascione <dancol@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>
Subject: Re: [RFC 1/7] mm: introduce MADV_COOL
Date: Tue, 28 May 2019 20:15:23 +0800
Message-ID: <20190528121523.8764-1-hdanton@sina.com>


On Tue, 28 May 2019 18:58:15 +0800 Minchan Kim wrote:
> On Tue, May 28, 2019 at 04:53:01PM +0800, Hillf Danton wrote:
> >
> > On Mon, 20 May 2019 12:52:48 +0900 Minchan Kim wrote:
> > > +static int madvise_cool_pte_range(pmd_t *pmd, unsigned long addr,
> > > +				unsigned long end, struct mm_walk *walk)
> > > +{
> > > +	pte_t *orig_pte, *pte, ptent;
> > > +	spinlock_t *ptl;
> > > +	struct page *page;
> > > +	struct vm_area_struct *vma = walk->vma;
> > > +	unsigned long next;
> > > +
> > > +	next = pmd_addr_end(addr, end);
> > > +	if (pmd_trans_huge(*pmd)) {
> > > +		spinlock_t *ptl;
> >
> > Seems not needed with another ptl declared above.
>
> Will remove it.
>
> > > +
> > > +		ptl = pmd_trans_huge_lock(pmd, vma);
> > > +		if (!ptl)
> > > +			return 0;
> > > +
> > > +		if (is_huge_zero_pmd(*pmd))
> > > +			goto huge_unlock;
> > > +
> > > +		page = pmd_page(*pmd);
> > > +		if (page_mapcount(page) > 1)
> > > +			goto huge_unlock;
> > > +
> > > +		if (next - addr != HPAGE_PMD_SIZE) {
> > > +			int err;
> >
> > Alternately, we deactivate thp only if the address range from userspace
> > is sane enough, in order to avoid complex works we have to do here.
>
> Not sure it's a good idea. That's the way we have done in MADV_FREE
> so want to be consistent.
>
Fair.

> > > +
> > > +			get_page(page);
> > > +			spin_unlock(ptl);
> > > +			lock_page(page);
> > > +			err = split_huge_page(page);
> > > +			unlock_page(page);
> > > +			put_page(page);
> > > +			if (!err)
> > > +				goto regular_page;
> > > +			return 0;
> > > +		}
> > > +
> > > +		pmdp_test_and_clear_young(vma, addr, pmd);
> > > +		deactivate_page(page);
> > > +huge_unlock:
> > > +		spin_unlock(ptl);
> > > +		return 0;
> > > +	}
> > > +
> > > +	if (pmd_trans_unstable(pmd))
> > > +		return 0;
> > > +
> > > +regular_page:
> >
> > Take a look at pending signal?
>
> Do you have any reason to see pending signal here? I want to know what's
> your requirement so that what's the better place to handle it.
>
We could bail out without work done IMO if there is a fatal signal pending.
And we can do that, if it makes sense to you, before the hard work.

> >
> > > +	orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > > +	for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
> >
> > s/end/next/ ?
>
> Why do you think it should be next?
>
Simply based on the following line, and afraid that next != end
	> > > +	next = pmd_addr_end(addr, end);

> > > +		ptent = *pte;
> > > +
> > > +		if (pte_none(ptent))
> > > +			continue;
> > > +
> > > +		if (!pte_present(ptent))
> > > +			continue;
> > > +
> > > +		page = vm_normal_page(vma, addr, ptent);
> > > +		if (!page)
> > > +			continue;
> > > +
> > > +		if (page_mapcount(page) > 1)
> > > +			continue;
> > > +
> > > +		ptep_test_and_clear_young(vma, addr, pte);
> > > +		deactivate_page(page);
> > > +	}
> > > +
> > > +	pte_unmap_unlock(orig_pte, ptl);
> > > +	cond_resched();
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static long madvise_cool(struct vm_area_struct *vma,
> > > +			unsigned long start_addr, unsigned long end_addr)
> > > +{
> > > +	struct mm_struct *mm = vma->vm_mm;
> > > +	struct mmu_gather tlb;
> > > +
> > > +	if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > > +		return -EINVAL;
> >
> > No service in case of VM_IO?
>
> I don't know VM_IO would have regular LRU pages but just follow normal
> convention for DONTNEED and FREE.
> Do you have anything in your mind?
>
I want to skip a mapping set up for DMA.

BR
Hillf
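
The s/end/next/ question above comes down to whether `next` and `end` can differ inside the PTE loop. A minimal userspace sketch of the arithmetic, not kernel code: it assumes x86-64 with 4 KiB pages (so one PMD entry covers 2 MiB) and copies the generic `pmd_addr_end()` computation. `ptes_visited()` is a hypothetical helper added only to count loop iterations. Whether the callback is ever handed a range that crosses a PMD boundary depends on the page-table walker, which may already clamp the range per PMD; the sketch shows only what happens if it does not.

```c
#include <assert.h>

/* Assumption: x86-64 with 4 KiB pages, where one PMD entry maps 2 MiB. */
#define PMD_SHIFT 21
#define PMD_SIZE  (1UL << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))
#define PAGE_SIZE 4096UL

/* Userspace copy of the generic pmd_addr_end() arithmetic. */
static unsigned long pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + PMD_SIZE) & PMD_MASK;

	assert(addr < end);
	/* The "- 1" keeps the comparison correct if boundary wraps to 0. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

/* Hypothetical helper: PTE slots visited by a loop bounded by 'limit'. */
static unsigned long ptes_visited(unsigned long addr, unsigned long limit)
{
	unsigned long n = 0;

	for (; addr < limit; addr += PAGE_SIZE)
		n++;
	return n;
}
```

Under these assumptions, a range from 2 MiB to 6 MiB gives `next` = 4 MiB, so a loop bounded by `next` visits 512 PTE slots (one page table), while a loop bounded by `end` would step through 1024 and run past the mapped page table. For a range that lies within a single PMD, `next == end` and the two bounds are equivalent.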