Subject: Re: [Intel-gfx] [PATCH 3/6] drm/i915: use vmap in shmem_pin_map
To: Christoph Hellwig
Cc: Matthew Wilcox, Juergen Gross, Stefano Stabellini, linux-mm@kvack.org, Peter Zijlstra, Boris Ostrovsky, x86@kernel.org, linux-kernel@vger.kernel.org, Minchan Kim, dri-devel@lists.freedesktop.org, xen-devel@lists.xenproject.org, Andrew Morton, intel-gfx@lists.freedesktop.org, Nitin Gupta, Chris Wilson, Matthew Auld
References: <20200918163724.2511-1-hch@lst.de> <20200918163724.2511-4-hch@lst.de> <20200921191157.GX32101@casper.infradead.org> <20200922062249.GA30831@lst.de> <43d10588-2033-038b-14e4-9f41cd622d7b@linux.intel.com> <20200922143141.GA26637@lst.de> <20200922163346.GA1701@lst.de>
From: Tvrtko Ursulin
Organization: Intel Corporation UK Plc
Message-ID: <1b05b9d6-a14c-85cd-0728-d0d40c9ff84b@linux.intel.com>
Date: Tue, 22 Sep 2020 18:04:37 +0100
In-Reply-To: <20200922163346.GA1701@lst.de>

On 22/09/2020 17:33, Christoph Hellwig wrote:
> On Tue, Sep 22, 2020 at 05:13:45PM +0100, Tvrtko Ursulin wrote:
>>> void *shmem_pin_map(struct file *file)
>>> {
>>> - const size_t n_pte = shmem_npte(file);
>>> - pte_t *stack[32], **ptes, **mem;
>>
>> Chris can comment how much he'd miss the 32 page stack shortcut.
>
> I'd like to see a profile that claim that kmalloc matters in a
> path that does a vmap and reads pages through the page cache.
> Especially when the kmalloc saves doing another page cache lookup
> on the free side.

The only reason I can come up with right now is if the mapping side is on a
latency-sensitive path while unmapping is lazy/delayed and so can afford to
be more costly. Then a fast map with the extra cost on unmap may make sense.

This applies more to the other i915 patch, which implements a much more
widely used API, but whether we can demonstrate any difference in the perf
profiles I couldn't tell you without trying to collect some.

>> Is there something in vmap() preventing us from freeing the pages array
>> here? I can't spot anything that is holding on to the pointer. Or it was
>> just a sketch before you realized we could walk the vm_area?
>>
>> Also, I may be totally misunderstanding something, but I think you need to
>> assign area->pages manually so shmem_unpin_map can access it below.
>
> We need area->pages to hold the pages for the free side. That being
> said the patch I posted is broken because it never assigned to that.
> As said it was a sketch.
> This is the patch I just rebooted into on
> my Laptop:
>
> http://git.infradead.org/users/hch/misc.git/commitdiff/048522dfa26b6667adfb0371ff530dc263abe829
>
> it needs extra prep patches from the series:
>
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/alloc_vm_area
>
>>> mapping_clear_unevictable(file->f_mapping);
>>> - __shmem_unpin_map(file, ptr, shmem_npte(file));
>>> + for (i = 0; i < shmem_npages(file); i++)
>>> + put_page(area->pages[i]);
>>> + kvfree(area->pages);
>>> + vunmap(ptr);
>>
>> Is the verdict from mm experts that we can't use vfree due __free_pages vs
>> put_page differences?
>
> Switched to vfree now.
>
>> Could we get from ptes to pages, so that we don't have to keep the
>> area->pages array allocated for the duration of the pin?
>
> We could do vmalloc_to_page, but that is fairly expensive (not as bad
> as reading from the page cache..). Are you really worried about the
> allocation?

Not so much given how we don't even use shmem_pin_map outside selftests. If
we start using it I expect it will be for tiny objects anyway. Only if they
end up being pinned for the lifetime of the driver, it may be a pointless
waste of memory compared to the downsides of vmalloc_to_page. But we can
revisit this particular edge case optimization if the need arises.

I'll look at your other i915 patch tomorrow.

Regards,

Tvrtko
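
For readers following along, the "32 page stack shortcut" mentioned above is the usual pattern of using a small on-stack array for short requests and falling back to an allocation for larger ones. The following is only a rough userspace sketch of that pattern, not the i915 code: gather_slots(), STACK_SLOTS and the used_heap flag are hypothetical names made up for illustration, and calloc()/free() stand in for whatever allocator the kernel path would use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: mirrors the size of the stack[32] array in the removed code. */
#define STACK_SLOTS 32

/*
 * Hypothetical userspace model of the stack shortcut, not a kernel API.
 * Gathers n pointers into a scratch array, using on-stack storage when
 * n fits and a heap allocation otherwise.
 *
 * Returns 0 on success, -1 if the heap allocation fails.
 * *used_heap is set to 1 when the heap fallback was taken, else 0.
 */
static int gather_slots(size_t n, int *used_heap)
{
	void *stack[STACK_SLOTS];
	void **slots = stack;
	size_t i;

	*used_heap = 0;
	if (n > STACK_SLOTS) {
		/* calloc() checks n * sizeof(*slots) for overflow. */
		slots = calloc(n, sizeof(*slots));
		if (!slots)
			return -1;
		*used_heap = 1;
	}

	/* Placeholder for collecting one pointer per page. */
	for (i = 0; i < n; i++)
		slots[i] = &slots[i];

	/* A real caller would consume slots[] here, e.g. to build a mapping. */

	if (slots != stack)
		free(slots);
	return 0;
}
```

The shortcut only saves the allocation on the map side for small objects. Whether that is measurable next to a vmap and the page cache reads is exactly the open question in the thread.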