From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753869AbcLYVvU (ORCPT <rfc822;w@1wt.eu>);
        Sun, 25 Dec 2016 16:51:20 -0500
Received: from mail-it0-f67.google.com ([209.85.214.67]:33698 "EHLO
        mail-it0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751674AbcLYVvS (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 25 Dec 2016 16:51:18 -0500
MIME-Version: 1.0
In-Reply-To: <20161225030030.23219-3-npiggin@gmail.com>
References: <20161225030030.23219-1-npiggin@gmail.com> <20161225030030.23219-3-npiggin@gmail.com>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun, 25 Dec 2016 13:51:17 -0800
X-Google-Sender-Auth: f_plt1VRmWD3p4WMfJv5tiA7KSA
Message-ID: <CA+55aFzqgtz-782MmLOjQ2A2nB5YVyLAvveo6G_c85jqqGDA0Q@mail.gmail.com>
Subject: Re: [PATCH 2/2] mm: add PageWaiters indicating tasks are waiting for
 a page bit
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
        Bob Peterson <rpeterso@redhat.com>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Steven Whitehouse <swhiteho@redhat.com>,
        Andrew Lutomirski <luto@kernel.org>,
        Andreas Gruenbacher <agruenba@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>, linux-mm <linux-mm@kvack.org>,
        Mel Gorman <mgorman@techsingularity.net>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by mail.home.local id uBPLpTGK011991

On Sat, Dec 24, 2016 at 7:00 PM, Nicholas Piggin <npiggin@gmail.com> wrote:
> Add a new page flag, PageWaiters, to indicate the page waitqueue has
> tasks waiting. This can be tested rather than testing waitqueue_active
> which requires another cacheline load.

Ok, I applied this one too. I think there's room for improvement, but
I don't think it's going to help to just wait another release cycle
and hope something happens.

Example room for improvement from a profile of unlock_page():

   46.44 │      lock   andb $0xfe,(%rdi)
   34.22 │      mov    (%rdi),%rax

this has the old "do atomic op on a byte, then load the whole word"
issue that we used to have with the nasty zone lookup code too. And it
causes a horrible pipeline hickup because the load will not forward
the data from the (partial) store.

 Its' really a misfeature of our asm optimizations of the atomic bit
ops. Using "andb" is slightly smaller, but in this case in particular,
an "andq" would be a ton faster, and the mask still fits in an imm8,
so it's not even hugely larger.

But it might also be a good idea to simply use a "cmpxchg" loop here.
That also gives atomicity guarantees that we don't have with the
"clear bit and then load the value".

Regardless, I think this is worth more people looking at and testing.
And merging it is probably the best way for that to happen.

                Linus