From: Vlastimil Babka
Date: Wed, 10 Jun 2015 17:44:30 +0200
Subject: Re: [PATCHv6 32/36] thp: reintroduce split_huge_page()
To: "Kirill A. Shutemov", Andrew Morton, Andrea Arcangeli, Hugh Dickins
Cc: Dave Hansen, Mel Gorman, Rik van Riel, Christoph Lameter,
 Naoya Horiguchi, Steve Capper, "Aneesh Kumar K.V", Johannes Weiner,
 Michal Hocko, Jerome Marchand, Sasha Levin, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Message-ID: <55785B5E.3000306@suse.cz>
In-Reply-To: <1433351167-125878-33-git-send-email-kirill.shutemov@linux.intel.com>

On 06/03/2015 07:06 PM, Kirill A. Shutemov wrote:
> This patch adds an implementation of split_huge_page() for the new
> refcounting.
>
> Unlike the previous implementation, the new split_huge_page() can fail
> if somebody holds a GUP pin on the page. It also means that a pin on a
> page would prevent it from being split under you. It makes the
> situation in many places much cleaner.
>
> The basic scheme of split_huge_page():
>
>  - Check that the sum of mapcounts of all subpages equals page_count()
>    plus one (the caller's pin). Fall off with -EBUSY. This way we can
>    avoid useless PMD-splits.
>
>  - Freeze the page counters by splitting all PMDs and setting up
>    migration PTEs.
>
>  - Re-check the sum of mapcounts against page_count(). The page's
>    counts are stable now. -EBUSY if the page is pinned.
>
>  - Split the compound page.
>
>  - Unfreeze the page by removing migration entries.
>
> Signed-off-by: Kirill A. Shutemov
> Tested-by: Sasha Levin

[...]

> +
> +static int __split_huge_page_tail(struct page *head, int tail,
> +		struct lruvec *lruvec, struct list_head *list)
> +{
> +	int mapcount;
> +	struct page *page_tail = head + tail;
> +
> +	mapcount = page_mapcount(page_tail);

Isn't page_mapcount() unnecessarily heavyweight here? When you are
splitting a page, it should already have a compound_mapcount() of zero
and shouldn't be PageDoubleMap(), no? So you should only care about
page->_mapcount? Sure, splitting a THP is not a hotpath, but when done
512 times per split, it could make some difference in the split's
latency.
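Concretely, the cheaper read I have in mind is just a sketch like the
one below (assuming the usual -1 bias on _mapcount, and that nothing
else can contribute to the tail's mapcount at this point):

	/*
	 * Sketch: with compound_mapcount() == 0 and !PageDoubleMap(),
	 * the tail's own _mapcount (biased by -1) is all that counts.
	 */
	mapcount = atomic_read(&page_tail->_mapcount) + 1;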
> +	VM_BUG_ON_PAGE(atomic_read(&page_tail->_count) != 0, page_tail);
> +
> +	/*
> +	 * tail_page->_count is zero and not changing from under us. But
> +	 * get_page_unless_zero() may be running from under us on the
> +	 * tail_page. If we used atomic_set() below instead of atomic_add(),
> +	 * we would then run atomic_set() concurrently with
> +	 * get_page_unless_zero(), and atomic_set() is implemented in C, not
> +	 * using locked ops. spin_unlock on x86 sometimes uses locked ops
> +	 * because of PPro errata 66, 92, so unless somebody can guarantee
> +	 * atomic_set() here would be safe on all archs (and not only on
> +	 * x86), it's safer to use atomic_add().

I would be surprised if this were the first place to use atomic_set()
with a potentially concurrent atomic_add(). Shouldn't the atomic_*()
API guarantee that this works?

> +	 */
> +	atomic_add(page_mapcount(page_tail) + 1, &page_tail->_count);

You already have the value in the mapcount variable, so why read it
again?
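I.e., something like this sketch (same semantics: the page is frozen
here, so the mapcount read at the top cannot have changed in between):

	/* Reuse the mapcount read above instead of a second page_mapcount(). */
	atomic_add(mapcount + 1, &page_tail->_count);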