Subject: Re: [PATCH RFC 0/4] mm: place pages to the freelist tail when onling and undoing isolation
From: David Hildenbrand
Organization: Red Hat GmbH
To: Wei Yang
Cc: osalvador@suse.de, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org,
 linux-acpi@vger.kernel.org, Andrew Morton, Alexander Duyck, Dave Hansen,
 Haiyang Zhang, "K. Y. Srinivasan", Mel Gorman, Michael Ellerman,
 Michal Hocko, Mike Rapoport, Scott Cheloha, Stephen Hemminger,
 Vlastimil Babka, Wei Liu
Date: Fri, 18 Sep 2020 09:32:38 +0200
In-Reply-To: <20200918023051.GE54754@L-31X9LVDL-1304.local>
References: <5c0910c2cd0d9d351e509392a45552fb@suse.de> <20200918023051.GE54754@L-31X9LVDL-1304.local>

On 18.09.20 04:30, Wei Yang wrote:
> On Wed, Sep 16, 2020 at 09:31:21PM +0200, David Hildenbrand wrote:
>>
>>
>>> On 16.09.2020 at 20:50, osalvador@suse.de wrote:
>>>
>>> On 2020-09-16 20:34, David Hildenbrand wrote:
>>>> When adding separate memory blocks via add_memory*() and onlining them
>>>> immediately, the metadata (especially the memmap) of the next block will be
>>>> placed onto one of the just added+onlined blocks. This creates a chain
>>>> of unmovable allocations: If the last memory block cannot get
>>>> offlined+removed, neither can any of the dependent ones. We directly have
>>>> unmovable allocations all over the place.
>>>> This can be observed quite easily using virtio-mem, however, it can also
>>>> be observed when using DIMMs. The freshly onlined pages will usually be
>>>> placed to the head of the freelists, meaning they will be allocated next,
>>>> turning the just-added memory usually immediately un-removable. The
>>>> fresh pages are cold; preferring to allocate others (that might be hot)
>>>> also feels like the natural thing to do.
>>>> It also applies to the hyper-v balloon, xen-balloon, and ppc64 dlpar: when
>>>> adding separate, successive memory blocks, each memory block will have
>>>> unmovable allocations on it - for example, gigantic pages will fail to
>>>> allocate.
>>>> While ZONE_NORMAL doesn't provide any guarantees that memory can get
>>>> offlined+removed again (any kind of fragmentation with unmovable
>>>> allocations is possible), there are many scenarios (hotplugging a lot of
>>>> memory, running workload, hotunplug some memory/as much as possible) where
>>>> we can offline+remove quite a lot with this patchset.
>>>
>>> Hi David,
>>>
>>
>> Hi Oscar.
>>
>>> I did not read through the patchset yet, so sorry if the question is
>>> nonsense, but is this not trying to fix the same issue the vmemmap
>>> patches did? [1]
>>
>> Not nonsense at all. It only helps to some degree, though. It solves the
>> dependencies due to the memmap. However, it's not completely ideal,
>> especially for single memory blocks.
>>
>> With single memory blocks (virtio-mem, xen-balloon, hv balloon, ppc dlpar)
>> you still have unmovable allocations (vmemmap chunks) all over the physical
>> address space. Consider the gigantic page example after hotplug. You
>> directly fragmented all hotplugged memory.
>>
>> Of course, there might be (less extreme) dependencies due to page tables
>> for the identity mapping, extended struct pages and similar.
>>
>> That said, there are other benefits when preferring other memory over just
>> hotplugged memory. Think about adding+onlining memory during boot (DIMMs
>> under QEMU, virtio-mem): once the system is up you will have most (all) of
>> that memory completely untouched.
>>
>> So while vmemmap on hotplugged memory would tackle some part of the issue,
>> there are cases where this approach is better, and there are even benefits
>> when combining both.
>
> While everything changes with shuffle.
>

Right. Shuffling would naturally try to break the dependencies. Shuffling
is quite rare, though: it has to be enabled explicitly on the cmdline and
might not be of too much help in virtualized environments.

-- 
Thanks,

David / dhildenb
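
To make the head-vs-tail argument from the thread concrete, here is a toy model in Python. It is only a sketch and not kernel code: the free list is modelled as a single deque instead of the buddy allocator's per-order lists, and the names `online_block`, `allocate`, and `simulate` are made up for this illustration. The optional shuffle is a crude stand-in for freelist randomization, not the kernel's actual algorithm.

```python
from collections import deque
import random


def online_block(freelist, block, to_tail):
    """Add the pages of a freshly onlined block to the free list."""
    if to_tail:
        # Fresh pages go last, so older pages are handed out first.
        freelist.extend(block)
    else:
        # extendleft() reverses its input, so reverse first to keep page order.
        freelist.extendleft(reversed(block))


def allocate(freelist, n):
    """Take n pages from the head of the free list."""
    return [freelist.popleft() for _ in range(n)]


def simulate(to_tail, shuffle=False, seed=0):
    """Return the set of blocks touched by the first four allocations."""
    # Eight pages that were already free before the hotplug.
    freelist = deque(("boot", i) for i in range(8))
    # Eight pages of the freshly added and onlined block.
    online_block(freelist, [("hotplug", i) for i in range(8)], to_tail)
    if shuffle:
        # Randomize the list order; placement then matters much less.
        pages = list(freelist)
        random.Random(seed).shuffle(pages)
        freelist = deque(pages)
    return {block for block, _ in allocate(freelist, 4)}


if __name__ == "__main__":
    print("head placement:", simulate(to_tail=False))
    print("tail placement:", simulate(to_tail=True))
    print("with shuffle:  ", simulate(to_tail=True, shuffle=True))
```

With head placement the first allocations all land in the hotplugged block, which is what makes it un-removable right away in this model; with tail placement they stay in the older memory until that runs out.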