Date: Thu, 12 Sep 2019 11:19:25 +0200
From: Michal Hocko
To: Alexander Duyck
Cc: Alexander Duyck, virtio-dev@lists.oasis-open.org, kvm list,
    "Michael S. Tsirkin", Catalin Marinas, David Hildenbrand, Dave Hansen,
    LKML, Matthew Wilcox, linux-mm, Andrew Morton, will@kernel.org,
    linux-arm-kernel@lists.infradead.org, Oscar Salvador, Yang Zhang,
    Pankaj Gupta, Konrad Rzeszutek Wilk, Nitesh Narayan Lal, Rik van Riel,
    lcapitulino@redhat.com, "Wang, Wei W", Andrea Arcangeli,
    ying.huang@intel.com, Paolo Bonzini, Dan Williams, Fengguang Wu,
    "Kirill A. Shutemov", Mel Gorman, Vlastimil Babka
Subject: Re: [PATCH v9 0/8] stg mail -e --version=v9 \
Message-ID: <20190912091925.GM4023@dhcp22.suse.cz>
References: <20190907172225.10910.34302.stgit@localhost.localdomain>
    <20190910124209.GY2063@dhcp22.suse.cz>
    <20190910144713.GF2063@dhcp22.suse.cz>
    <20190910175213.GD4023@dhcp22.suse.cz>
    <1d7de9f9f4074f67c567dbb4cc1497503d739e30.camel@linux.intel.com>
    <20190911113619.GP4023@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 11-09-19 08:12:03, Alexander Duyck wrote:
> On Wed, Sep 11, 2019 at 4:36 AM Michal Hocko wrote:
> >
> > On Tue 10-09-19 14:23:40, Alexander Duyck wrote:
> > [...]
> > > We don't put any limitations on the allocator other than that it needs to
> > > clean up the metadata on allocation, and that it cannot allocate a page
> > > that is in the process of being reported since we pulled it from the
> > > free_list. If the page is a "Reported" page then it decrements the
> > > reported_pages count for the free_area and makes sure the page doesn't
> > > exist in the "Boundary" array pointer value, if it does it moves the
> > > "Boundary" since it is pulling the page.
> >
> > This is still a non-trivial limitation on the page allocation from an
> > external code IMHO.
> > I cannot give any explicit reason why an ordering on
> > the free list might matter (well except for page shuffling which uses it
> > to make physical memory pattern allocation more random) but the
> > architecture seems hacky and dubious to be honest. It sounds like the
> > whole interface has been developed around a very particular and single
> > purpose optimization.
>
> How is this any different than the code that moves a page that will
> likely be merged to the tail though?

I guess you are referring to the page shuffling. If that is the case
then this is an integral part of the allocator for a reason, and it is
obvious in the code, including the consequences. I do not really like
the idea of hiding similar constraints behind a generic looking feature
which is completely detached from the allocator, so that any future
change of the allocator might subtly break it.

> In our case the "Reported" page is likely going to be much more
> expensive to allocate and use than a standard page because it will be
> faulted back in. In such a case wouldn't it make sense for us to want
> to keep the pages that don't require faults ahead of those pages in
> the free_list so that they are more likely to be allocated?

OK, I was suspecting this would pop out. And this is exactly why I
didn't like the idea of external code imposing non-obvious constraints
on the allocator. You simply cannot count on any ordering in the page
allocator. We used to distinguish cache hot/cold pages in the past and
pushed pages to a specific end of the free list, but that has been
removed. Other changes like that are possible. Shuffling is a good
recent example.

Anyway I am not a maintainer of this code. I would really like to hear
opinions from Mel and Vlastimil here (now CCed - the thread starts at
http://lkml.kernel.org/r/20190907172225.10910.34302.stgit@localhost.localdomain).
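To make the ordering concern concrete, here is a minimal userspace sketch in C. It is not kernel code: `struct page`, `struct free_list` and every function name below are hypothetical, simplified stand-ins chosen for illustration. It assumes allocation always takes the head of the list, and it contrasts a free path that sorts "Reported" pages to the tail with an equally valid free path that ignores the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct page or free_area. */
struct page {
    int id;
    bool reported;              /* already reported to the hypervisor */
    struct page *prev, *next;
};

struct free_list {
    struct page *head, *tail;
};

void add_to_head(struct free_list *fl, struct page *p)
{
    assert(fl && p);
    p->prev = NULL;
    p->next = fl->head;
    if (fl->head)
        fl->head->prev = p;
    else
        fl->tail = p;           /* list was empty */
    fl->head = p;
}

void add_to_tail(struct free_list *fl, struct page *p)
{
    assert(fl && p);
    p->next = NULL;
    p->prev = fl->tail;
    if (fl->tail)
        fl->tail->next = p;
    else
        fl->head = p;           /* list was empty */
    fl->tail = p;
}

/* Placement the reporting feature would like: reported pages go last. */
void free_page_ordered(struct free_list *fl, struct page *p)
{
    if (p->reported)
        add_to_tail(fl, p);
    else
        add_to_head(fl, p);
}

/* Another legitimate allocator policy: the flag plays no role. */
void free_page_unordered(struct free_list *fl, struct page *p)
{
    add_to_head(fl, p);
}

/* Allocation takes whatever sits at the head, or NULL if empty. */
struct page *alloc_page_from(struct free_list *fl)
{
    struct page *p = fl->head;

    if (!p)
        return NULL;
    fl->head = p->next;
    if (fl->head)
        fl->head->prev = NULL;
    else
        fl->tail = NULL;
    p->prev = p->next = NULL;
    return p;
}
```

With `free_page_ordered()` the reported page is handed out last; switching the free path to `free_page_unordered()` hands it out first without any caller noticing, which is the kind of silent breakage that an ordering assumption held outside the allocator invites.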
-- 
Michal Hocko
SUSE Labs