From: Thomas Hellström (VMware)
To: linux-mm@kvack.org, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: pv-drivers@vmware.com, linux-graphics-maintainer@vmware.com, Thomas Hellström, Andrew Morton, Michal Hocko, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Ralph Campbell, Jérôme Glisse, Christian König, Dan Williams
Subject: [PATCH v4 0/9] Huge page-table entries for TTM
Date: Thu, 20 Feb 2020 13:27:10 +0100
Message-Id: <20200220122719.4302-1-thomas_os@shipmail.org>

In order to reduce TLB misses and CPU usage this patchset enables huge-
and giant page-table entries for TTM and TTM-enabled graphics drivers.

Patches 1 and 2 introduce a vma_is_special_huge() function to make the mm code
take the same path as DAX when splitting huge- and giant page-table entries
(which currently means zapping the page-table entry and relying on re-faulting).

Patch 3 makes the mm code split existing huge page-table entries
on huge_fault fallbacks, typically on COW or for buffer objects that want
write-notify. COW and write-notification are always done on the lowest
page-table level. See the patch log message for additional considerations.

Patch 4 introduces functions to allow the graphics drivers to manipulate
the caching- and encryption flags of huge page-table entries without ugly
hacks.

Patch 5 implements the huge_fault handler in TTM.
This enables huge page-table entries, provided that the kernel is configured
to support transhuge pages, either by default or using madvise().
However, they are unlikely to be inserted unless the kernel buffer object
pfns and user-space addresses align perfectly.
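
To make the alignment condition concrete, here is a minimal user-space sketch.
It is not code from the patchset. It assumes x86-64-style sizes (4 KiB base
pages, 2 MiB PMD entries, 1 GiB PUD entries), and the names
sketch_can_map_huge() and the SKETCH_* constants are hypothetical. The check
it models is only the one described above: a huge entry is possible when the
entry-sized virtual range fits inside the mapping and the backing physical
address is aligned to the same size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed x86-64-style sizes; other architectures differ. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SIZE   (1ULL << 21)  /* 2 MiB "huge" entry  */
#define SKETCH_PUD_SIZE   (1ULL << 30)  /* 1 GiB "giant" entry */

/*
 * Hypothetical helper, for illustration only.
 *
 * fault_addr: faulting user-space address
 * vma_start:  first user-space address of the mapping
 * vma_end:    one past the last user-space address of the mapping
 * start_pfn:  page frame number backing vma_start
 * entry_size: SKETCH_PMD_SIZE or SKETCH_PUD_SIZE
 *
 * Returns true if one entry of entry_size could cover fault_addr.
 */
static bool sketch_can_map_huge(uint64_t fault_addr, uint64_t vma_start,
                                uint64_t vma_end, uint64_t start_pfn,
                                uint64_t entry_size)
{
    uint64_t mask, huge_start, phys;

    /* entry_size must be a non-zero power of two. */
    assert(entry_size && (entry_size & (entry_size - 1)) == 0);
    mask = entry_size - 1;

    /* Round the faulting address down to the entry boundary. */
    huge_start = fault_addr & ~mask;

    /* The whole entry must lie inside the mapping. */
    if (huge_start < vma_start || huge_start + entry_size > vma_end)
        return false;

    /* The physical address backing huge_start must be aligned as well. */
    phys = (start_pfn << SKETCH_PAGE_SHIFT) + (huge_start - vma_start);
    return (phys & mask) == 0;
}
```

For example, with a 4 MiB mapping at a 2 MiB-aligned user address, a first pfn
of 0x100200 passes the check while 0x100201 does not.
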
There are various options here, but since buffer objects that reside in
system pages typically start at huge page boundaries if they are backed by
huge pages, we try to enforce buffer object starting pfns and user-space
addresses to be huge page-size aligned if their size exceeds a huge
page-size. If pud-size transhuge ("giant") pages are enabled by the arch,
the same holds for those.

Patch 6 implements a specialized huge_fault handler for vmwgfx.
The vmwgfx driver may perform dirty-tracking and needs some special code
to handle that correctly.

Patch 7 implements a drm helper to align user-space addresses according
to the above scheme, if possible.

Patch 8 implements a TTM range manager for vmwgfx that does the same for
graphics IO memory. This may later be reused by other graphics drivers
if necessary.

Patch 9 finally hooks up the helpers of patches 7 and 8 to the vmwgfx driver.
A similar change is needed for graphics drivers that want a reasonable
likelihood of actually using huge page-table entries.

If a buffer object size is not huge-page or giant-page aligned,
its size will NOT be inflated by this patchset. This means that the buffer
object tail will use smaller size page-table entries and thus no memory
overhead occurs. Drivers that want to pay the memory overhead price need to
implement their own scheme to inflate buffer-object sizes.

PMD size huge page-table-entries have been tested with vmwgfx and found to
work well both with system memory backed and IO memory backed buffer objects.

PUD size giant page-table-entries have seen limited (fault and COW) testing
using a modified kernel (to support 1GB page allocations) and a fake vmwgfx
TTM memory type. The vmwgfx driver does not otherwise support 1GB-size IO
memory resources.

Comments and suggestions welcome.
Thomas

Changes since RFC:
* Check for buffer objects present in contiguous IO Memory (Christian König)
* Rebased on the vmwgfx emulated coherent memory functionality.
  That rebase adds patch 5.

Changes since v1:
* Make the new TTM range manager vmwgfx-specific. (Christian König)
* Minor fixes for configs that don't support or only partially support
  transhuge pages.

Changes since v2:
* Minor coding style and doc fixes in patch 5/9 (Christian König)
* Patch 5/9 doesn't touch mm. Remove from the patch title.

Changes since v3:
* Added reviews and acks
* Implemented ugly but generic ttm_pgprot_is_wrprotecting() instead of arch
  specific code.

Cc: Andrew Morton
Cc: Michal Hocko
Cc: "Matthew Wilcox (Oracle)"
Cc: "Kirill A. Shutemov"
Cc: Ralph Campbell
Cc: "Jérôme Glisse"
Cc: "Christian König"
Cc: Dan Williams
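
As an illustration of the size rule stated in the cover letter (buffer object
sizes are not inflated, so the tail falls back to smaller entries), the
following user-space sketch does the arithmetic for the PMD case. It is not
code from the patchset. It assumes 4 KiB base pages and 2 MiB PMD entries, a
buffer object that starts on a huge-page boundary, and a best case in which
every fully covered 2 MiB range actually gets a huge entry. All names
(sketch_cover(), sketch_inflation_cost(), struct sketch_coverage) are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64-style sizes; other architectures differ. */
#define SKETCH_PAGE_SIZE (1ULL << 12)  /* 4 KiB base page  */
#define SKETCH_PMD_SIZE  (1ULL << 21)  /* 2 MiB huge entry */

/* Best-case number of entries of each size for one buffer object. */
struct sketch_coverage {
    uint64_t huge_entries;  /* PMD-size entries for the aligned body */
    uint64_t base_entries;  /* base-page entries for the tail        */
};

/*
 * Hypothetical helper: split a buffer object of `size` bytes, assumed to
 * start on a huge-page boundary, into huge entries plus a small-page tail.
 */
static struct sketch_coverage sketch_cover(uint64_t size)
{
    struct sketch_coverage c;

    /* Buffer object sizes are assumed to be whole pages. */
    assert(size % SKETCH_PAGE_SIZE == 0);

    c.huge_entries = size / SKETCH_PMD_SIZE;
    c.base_entries = (size % SKETCH_PMD_SIZE) / SKETCH_PAGE_SIZE;
    return c;
}

/*
 * Memory a driver would add if it chose to round the size up to a
 * huge-page multiple instead (the overhead the patchset avoids).
 */
static uint64_t sketch_inflation_cost(uint64_t size)
{
    uint64_t rem = size % SKETCH_PMD_SIZE;

    return rem ? SKETCH_PMD_SIZE - rem : 0;
}
```

Under these assumptions a 5 MiB buffer object is covered by two huge entries
plus 256 base-page entries, whereas inflating it to 6 MiB would cost 1 MiB of
extra memory.
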