From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [PATCH 1/4] mm: introduce cma_alloc_bulk API
To: Minchan Kim <minchan@kernel.org>,
 Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
 hyesoo.yu@samsung.com, willy@infradead.org, iamjoonsoo.kim@lge.com,
 vbabka@suse.cz, surenb@google.com, pullip.cho@samsung.com,
 joaodias@google.com, hridya@google.com, sumit.semwal@linaro.org,
 john.stultz@linaro.org, Brian.Starkey@arm.com, linux-media@vger.kernel.org,
 devicetree@vger.kernel.org, robh@kernel.org, christian.koenig@amd.com,
 linaro-mm-sig@lists.linaro.org
References: <20201117181935.3613581-1-minchan@kernel.org>
 <20201117181935.3613581-2-minchan@kernel.org>
From: David Hildenbrand <david@redhat.com>
Organization: Red Hat GmbH
Message-ID: <a2c33b8f-e4fb-1f1c-7ed0-496a1256ea09@redhat.com>
Date: Mon, 23 Nov 2020 15:15:37 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.4.0
MIME-Version: 1.0
In-Reply-To: <20201117181935.3613581-2-minchan@kernel.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On 17.11.20 19:19, Minchan Kim wrote:
> Some special HW requires bulk allocation of high-order pages,
> for example 4800 * order-4 pages at minimum, and sometimes
> more.
> 
> To meet the requirement, one option is to reserve a 300M CMA area
> and request the whole 300M as contiguous memory. However, that
> doesn't work if even one of the pages in the range is long-term
> pinned, directly or indirectly. The other option is to repeatedly
> request a higher order (e.g., 2M) than the required order (64K)
> until the driver has gathered the necessary amount of memory.
> This approach makes the allocation very slow because cma_alloc
> itself is slow, and it can get stuck on a pageblock if it
> encounters an unmigratable page.
> 
> To solve the issue, this patch introduces cma_alloc_bulk.
> 
> 	int cma_alloc_bulk(struct cma *cma, unsigned int align,
> 		gfp_t gfp_mask, unsigned int order, size_t nr_requests,
> 		struct page **page_array, size_t *nr_allocated);
> 
> Most parameters are the same as for cma_alloc, but it additionally
> takes an array to store the allocated pages. Unlike cma_alloc, it
> skips a pageblock without waiting/stopping if the pageblock has an
> unmovable page, so the API continues to scan other pageblocks to find
> pages of the requested order.
> 
> cma_alloc_bulk is a best-effort approach in that, unlike cma_alloc, it
> skips pageblocks that have unmovable pages. It doesn't need to be
> perfect from the beginning at the cost of performance. Thus, the API
> takes gfp_t to support __GFP_NORETRY, which is propagated into
> alloc_contig_page to avoid the high-overhead operations that increase
> the CMA allocation success ratio (e.g., migration retries, PCP and LRU
> draining per pageblock), at the cost of a lower allocation success ratio.
> If the caller couldn't allocate enough pages with __GFP_NORETRY, it
> can call again without __GFP_NORETRY to increase the success ratio,
> if it is willing to pay the overhead.

I'm not a fan of connecting __GFP_NORETRY to PCP and LRU draining.
Also, gfp flags mostly apply to compaction (e.g., how to allocate free
pages for migration), so this seems a little wrong.

Can we instead introduce something like this?

enum alloc_contig_mode {
	/*
	 * Normal mode:
	 *
	 * Retry page migration 5 times, ... TBD
	 */
	ALLOC_CONTIG_NORMAL = 0,
	/*
	 * Fast mode: e.g., used for bulk allocations.
	 *
	 * Don't retry page migration if it fails, don't drain PCP
	 * lists, don't drain LRU.
	 */
	ALLOC_CONTIG_FAST,
};

This could be extended by ALLOC_CONTIG_HARD in the future, to be used
e.g. by virtio-mem (disable PCP, retry a few more times) ...
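
To illustrate how such a mode could gate the expensive steps, here is a
rough userspace sketch. It is not kernel code: struct contig_policy,
contig_policy_for(), struct fake_range, alloc_one() and alloc_bulk() are
names made up for this example, and "5 attempts in total" is only one
possible reading of "retry 5 times". The drain flags are recorded in the
policy but not acted upon, since there is nothing to drain in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode names follow the proposal; ALLOC_CONTIG_HARD is left out. */
enum alloc_contig_mode {
	ALLOC_CONTIG_NORMAL = 0,
	ALLOC_CONTIG_FAST,
};

/* Hypothetical per-mode policy; not an existing kernel structure. */
struct contig_policy {
	unsigned int migrate_tries;	/* total migration attempts per range */
	bool drain_pcp;			/* would drain per-cpu page lists */
	bool drain_lru;			/* would drain the LRU */
};

/* Map a mode to its policy (assumed values, see above). */
static struct contig_policy contig_policy_for(enum alloc_contig_mode mode)
{
	struct contig_policy p = {
		.migrate_tries = 5, .drain_pcp = true, .drain_lru = true,
	};

	if (mode == ALLOC_CONTIG_FAST) {
		p.migrate_tries = 1;
		p.drain_pcp = false;
		p.drain_lru = false;
	}
	return p;
}

/*
 * Stand-in for one candidate range: migration succeeds on attempt
 * number succeeds_on_try (1-based); 0 means it never succeeds, e.g.
 * because of a long-term pinned page.
 */
struct fake_range {
	unsigned int succeeds_on_try;
};

/* Try one range, giving up after the number of attempts the mode allows. */
static bool alloc_one(const struct fake_range *r, enum alloc_contig_mode mode)
{
	struct contig_policy p = contig_policy_for(mode);
	unsigned int try;

	for (try = 1; try <= p.migrate_tries; try++) {
		if (r->succeeds_on_try != 0 && try >= r->succeeds_on_try)
			return true;
	}
	return false;
}

/* Best effort: skip ranges that fail, return how many were allocated. */
static size_t alloc_bulk(const struct fake_range *ranges, size_t nr,
			 enum alloc_contig_mode mode)
{
	size_t i, nr_allocated = 0;

	assert(ranges != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		if (alloc_one(&ranges[i], mode))
			nr_allocated++;
	}
	return nr_allocated;
}
```

A caller that does not get enough ranges in ALLOC_CONTIG_FAST could then
retry the remainder in ALLOC_CONTIG_NORMAL, without overloading a gfp flag.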

-- 
Thanks,

David / dhildenb