Date: Wed, 12 Apr 2023 14:22:36 -0700
From: Andrew Morton
To: "buddy.zhang"
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Venkatesh Pallipadi, Suresh Siddha, Juergen Gross, Dan Williams, Konstantin Khlebnikov
Subject: Re: [PATCH] mm: Keep memory type same on DEVMEM Page-Fault
Message-Id: <20230412142236.407d6d0e6d90232da004980e@linux-foundation.org>
In-Reply-To: <20230319033750.475200-1-buddy.zhang@biscuitos.cn>
References: <20230319033750.475200-1-buddy.zhang@biscuitos.cn>

On Sun, 19 Mar 2023 11:37:50 +0800 "buddy.zhang" wrote:

> The x86 architecture supports memory types in the page table: the PTE
> bits PAT/PCD/PWT select a memory type such as WC/WB/WT/UC.
> Virtual addresses from userspace and kernel space can map to the
> same physical page. If the page tables use different memory types,
> the same physical page ends up with conflicting memory types.

Thanks.  Nobody has worked on this code for a long time.  I'll cc a few
folks who may be able to comment.

> For DEVMEM, remap_pfn_range() keeps the memory type the same across
> different mappings. But that does not happen on the page-fault path,
> as in this code:
>
> 19 static vm_fault_t vm_fault(struct vm_fault *vmf)
> 20 {
> 21         struct vm_area_struct *vma = vmf->vma;
> 22         unsigned long address = vmf->address;
> 23         struct page *fault_page;
> 24         unsigned long pfn;
> 25         int r;
> 26
> 27         /* Allocate Page as DEVMEM */
> 28         fault_page = alloc_page(GFP_KERNEL);
> 29         if (!fault_page) {
> 30                 printk("ERROR: NO Free Memory from DEVMEM.\n");
> 31                 r = -ENOMEM;
> 32                 goto err_alloc;
> 33         }
> 34         pfn = page_to_pfn(fault_page);
> 35
> 36         /* Clear PAT Attribute */
> 37         pgprot_val(vma->vm_page_prot) &= ~(_PAGE_PCD | _PAGE_PWT | _PAGE_PAT);
> 38
> 39         /* Change Memory Type for Direct-Mapping Area */
> 40         arch_io_reserve_memtype_wc(PFN_PHYS(pfn), PAGE_SIZE);
> 41         pgprot_val(vma->vm_page_prot) |= cachemode2protval(_PAGE_CACHE_MODE_WT);
> 42
> 43         /* Establish pte and INC _mapcount for page */
> 44         vm_flags_set(vma, VM_MIXEDMAP);
> 45         if (vm_insert_page(vma, address, fault_page))
> 46                 return -EAGAIN;
> 47
> 48         /* Add refcount for page */
> 49         atomic_inc(&fault_page->_refcount);
> 50         /* bind fault page */
> 51         vmf->page = fault_page;
> 52
> 53         return 0;
> 54
> 55 err_alloc:
> 56         return r;
> 57 }
> 58
> 59 static const struct vm_operations_struct BiscuitOS_vm_ops = {
> 60         .fault = vm_fault,
> 61 };
> 62
> 63 static int BiscuitOS_mmap(struct file *filp, struct vm_area_struct *vma)
> 64 {
> 65         /* setup vm_ops */
> 66         vma->vm_ops = &BiscuitOS_vm_ops;
> 67
> 68         return 0;
> 69 }
>
> If arch_io_reserve_memtype_wc() is invoked on line 40 to change the
> memory type of the direct-mapping area to WC, the memory type is then
> set to WT on line 41, and vm_insert_page() is invoked to create the
> mapping, the result is:
>
>     | <----- Userspace ----> | <- Kernel space -> |
> ----+------+---+-------------+---+---+------------+--
>     |      |   |             |   |   |            |
> ----+------+---+-------------+---+---+------------+--
>            WT|                     |WC
>              o-------o    o--------o
>                    WT|    |WC
>                      V    V
> -------------------+--------+------------------------
>                    | DEVMEM |
> -------------------+--------+------------------------
>                Physical Address Space
>
> In this case the OS should check the memory type in vm_insert_page()
> before mapping and keep the memory type the same, so add a check to
> the function:
>
> 07 int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
> 08                    struct page *page)
> 09 {
> 10         if (addr < vma->vm_start || addr >= vma->vm_end)
> 11                 return -EFAULT;
> 12         if (!page_count(page))
> 13                 return -EINVAL;
> 14         if (!(vma->vm_flags & VM_MIXEDMAP)) {
> 15                 BUG_ON(mmap_read_trylock(vma->vm_mm));
> 16                 BUG_ON(vma->vm_flags & VM_PFNMAP);
> 17                 vm_flags_set(vma, VM_MIXEDMAP);
> 18         }
> 19         if (track_pfn_remap(vma, &vma->vm_page_prot,
> 20                             page_to_pfn(page), addr, PAGE_SIZE))
> 21                 return -EINVAL;
> 22         return insert_page(vma, addr, page, vma->vm_page_prot);
> 23 }
>
> With lines 19 to 21, when a different memory type is mapped on this
> path, track_pfn_remap() reports the conflict and changes the request
> to the current memory type, e.g.
>
> x86/PAT: APP:88 map pfn RAM range req write-through for [mem 0x025c1000-0x025c1fff], got write-combining
>
> Then the memory type stays the same on the page-fault path for DEVMEM,
> and the end result is:
>
>     | <----- Userspace ----> | <- Kernel space -> |
> ----+------+---+-------------+---+---+------------+--
>     |      |   |             |   |   |            |
> ----+------+---+-------------+---+---+------------+--
>            WT|                     |WC
>              o---(X)----o----------o
>                         |WC
>                         V
> -------------------+--------+------------------------
>                    | DEVMEM |
> -------------------+--------+------------------------
>
> Signed-off-by: buddy.zhang@biscuitos.cn
> ---
>  mm/memory.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index f456f3b5049c..ed3d09f513f1 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1989,6 +1989,9 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
>                  BUG_ON(vma->vm_flags & VM_PFNMAP);
>                  vm_flags_set(vma, VM_MIXEDMAP);
>          }
> +        if (track_pfn_remap(vma, &vma->vm_page_prot,
> +                            page_to_pfn(page), addr, PAGE_SIZE))
> +                return -EINVAL;
>          return insert_page(vma, addr, page, vma->vm_page_prot);
>  }
>  EXPORT_SYMBOL(vm_insert_page);
> --
> 2.25.1
>
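
For anyone following along, the behaviour described in the changelog can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not kernel code: the table, `reserve_memtype()`, `track_mapping()` and `demo_fault_path()` are hypothetical names made up for illustration, and the real PAT tracking is more involved. It only shows the idea that a page frame already reserved with one memory type causes a later request for a different type to be reported and changed to the reserved type, as in the x86/PAT log line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only; none of these names are kernel APIs. */
enum memtype { MT_WB, MT_WC, MT_WT, MT_UC };

struct memtype_entry {
    unsigned long pfn;   /* page frame number being tracked */
    enum memtype type;   /* memory type already reserved for it */
};

#define MAX_ENTRIES 16
static struct memtype_entry table[MAX_ENTRIES];
static size_t nr_entries;

const char *memtype_name(enum memtype t)
{
    switch (t) {
    case MT_WB: return "write-back";
    case MT_WC: return "write-combining";
    case MT_WT: return "write-through";
    case MT_UC: return "uncached";
    }
    return "unknown";
}

/*
 * Reserve a memory type for pfn.
 * Returns 0 on success, -1 if the pfn is already reserved or the table is full.
 */
int reserve_memtype(unsigned long pfn, enum memtype type)
{
    size_t i;

    for (i = 0; i < nr_entries; i++)
        if (table[i].pfn == pfn)
            return -1;
    if (nr_entries == MAX_ENTRIES)
        return -1;
    table[nr_entries].pfn = pfn;
    table[nr_entries].type = type;
    nr_entries++;
    return 0;
}

/*
 * Check a new mapping request against the reservation for pfn.
 * Returns 1 and rewrites *req to the reserved type if the request
 * conflicted, 0 if there was no reservation or the types already match.
 */
int track_mapping(unsigned long pfn, enum memtype *req)
{
    size_t i;

    assert(req != NULL);
    for (i = 0; i < nr_entries; i++) {
        if (table[i].pfn != pfn || table[i].type == *req)
            continue;
        printf("pfn 0x%lx: req %s, got %s\n", pfn,
               memtype_name(*req), memtype_name(table[i].type));
        *req = table[i].type;
        return 1;
    }
    return 0;
}

/* Replay the scenario: direct map reserved as WC, fault path asks for WT. */
enum memtype demo_fault_path(void)
{
    enum memtype req = MT_WT;

    reserve_memtype(0x25c1UL, MT_WC);
    track_mapping(0x25c1UL, &req);
    return req;
}
```

Calling `demo_fault_path()` reserves the frame as write-combining, asks for write-through, and gets write-combining back, so both mappings in the model end up with the same type. Whether a conflict should be silently corrected or turned into an error (as the `-EINVAL` return in the patch does when the tracking call fails) is a separate policy question that this model does not try to settle.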