Date: Mon, 5 Jun 2023 23:42:56 +0300
From: Mike Rapoport
To: "Edgecombe, Rick P"
Cc: rostedt@goodmis.org, tglx@linutronix.de, deller@gmx.de,
    mcgrof@kernel.org, netdev@vger.kernel.org,
    nadav.amit@gmail.com, linux@armlinux.org.uk, davem@davemloft.net,
    linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    hca@linux.ibm.com, catalin.marinas@arm.com,
    linux-kernel@vger.kernel.org, kent.overstreet@linux.dev,
    linux-s390@vger.kernel.org, palmer@dabbelt.com, chenhuacai@kernel.org,
    tsbogend@alpha.franken.de, linux-trace-kernel@vger.kernel.org,
    mpe@ellerman.id.au, linux-parisc@vger.kernel.org, x86@kernel.org,
    christophe.leroy@csgroup.eu, linux-riscv@lists.infradead.org,
    will@kernel.org, dinguyen@kernel.org, naveen.n.rao@linux.ibm.com,
    sparclinux@vger.kernel.org, linux-modules@vger.kernel.org,
    bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    song@kernel.org, linux-mm@kvack.org, loongarch@lists.linux.dev,
    akpm@linux-foundation.org
Subject: Re: [PATCH 12/13] x86/jitalloc: prepare to allocate exectuatble memory as ROX
Message-ID: <20230605204256.GA52412@kernel.org>
References: <20230601101257.530867-13-rppt@kernel.org>
    <0f50ac52a5280d924beeb131e6e4717b6ad9fdf7.camel@intel.com>
    <68b8160454518387c53508717ba5ed5545ff0283.camel@intel.com>
    <50D768D7-15BF-43B8-A5FD-220B25595336@gmail.com>
    <20230604225244.65be9103@rorschach.local.home>
    <20230605081143.GA3460@kernel.org>
    <88a62f834688ed77d08c778e1e427014cf7d3c1b.camel@intel.com>
In-Reply-To: <88a62f834688ed77d08c778e1e427014cf7d3c1b.camel@intel.com>
X-Mailing-List: linux-parisc@vger.kernel.org

On Mon, Jun 05, 2023 at 04:10:21PM +0000, Edgecombe, Rick P wrote:
> On Mon, 2023-06-05 at 11:11 +0300, Mike Rapoport wrote:
> > On Sun, Jun 04, 2023 at 10:52:44PM -0400, Steven Rostedt wrote:
> > > On Thu, 1 Jun 2023 16:54:36 -0700
> > > Nadav Amit wrote:
> > >
> > > > > The way text_poke() is used here, it is creating a new writable alias
> > > > > and flushing it for *each* write to the module (like for each write of
> > > > > an individual relocation, etc). I was just thinking it might warrant
> > > > > some batching or something.
> > > >
> > > > I am not advocating doing so, but if you want to have many efficient
> > > > writes, perhaps you can just disable CR0.WP. Just saying that if you
> > > > are about to write all over the memory, text_poke() does not provide
> > > > too much security for the poking thread.
> >
> > Heh, this is definitely an easier hack to implement :)
>
> I don't know the details, but previously there was some strong dislike
> of CR0.WP toggling. And now there is also the problem of CET. Setting
> CR0.WP=0 will #GP if CR4.CET is 1 (as it currently is for kernel IBT).
> I guess you might get away with toggling them both in some controlled
> situation, but it might be a lot easier to hack up than to be made
> fully acceptable. It does sound much more efficient though.

I don't think we'd really want that, especially looking at

	WARN_ONCE(bits_missing, "CR0 WP bit went missing!?\n");

at native_write_cr0().

> > > Batching does exist, which is what the text_poke_queue() thing
> > > does.
> >
> > For module loading text_poke_queue() will still be much slower than a bunch
> > of memset()s for no good reason because we don't need all the complexity of
> > text_poke_bp_batch() for module initialization because we are sure we are
> > not patching live code.
> >
> > What we'd need here is a new batching mode that will create a writable
> > alias mapping at the beginning of apply_relocate_*() and module_finalize(),
> > then it will use memcpy() to that writable alias and will tear the mapping
> > down at the end.
>
> It's probably only a tiny bit faster than keeping a separate writable
> allocation and text_poking it in at the end.

Right, but it still will be faster than text_poking every relocation.

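To make the batching idea concrete, here is a rough userspace sketch of "create the writable alias once, memcpy() through it many times, tear it down once". It is an analogy only and not kernel code: a temp-file-backed shared mapping stands in for the ROX module text, and every name in it (struct text_region, region_init(), batch_begin(), batch_write(), batch_end(), region_destroy()) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for a module's read-only text. */
struct text_region {
	FILE *backing;            /* temp file shared by both views */
	size_t len;
	const unsigned char *ro;  /* long-lived read-only view */
	unsigned char *rw;        /* writable alias, NULL outside a batch */
};

/* Allocate the zero-filled backing object and map the read-only view. */
static int region_init(struct text_region *r, size_t len)
{
	void *p;

	memset(r, 0, sizeof(*r));
	r->backing = tmpfile();
	if (!r->backing)
		return -1;
	if (ftruncate(fileno(r->backing), (off_t)len) != 0) {
		fclose(r->backing);
		return -1;
	}
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fileno(r->backing), 0);
	if (p == MAP_FAILED) {
		fclose(r->backing);
		return -1;
	}
	r->ro = p;
	r->len = len;
	return 0;
}

/* Create the writable alias once, before a run of writes. */
static int batch_begin(struct text_region *r)
{
	void *p;

	assert(r->rw == NULL);
	p = mmap(NULL, r->len, PROT_READ | PROT_WRITE, MAP_SHARED,
		 fileno(r->backing), 0);
	if (p == MAP_FAILED)
		return -1;
	r->rw = p;
	return 0;
}

/* Plain memcpy() through the alias: no map/unmap per write. */
static void batch_write(struct text_region *r, size_t off,
			const void *src, size_t n)
{
	assert(r->rw != NULL && off <= r->len && n <= r->len - off);
	memcpy(r->rw + off, src, n);
}

/* Tear the alias down once, after the last write. */
static int batch_end(struct text_region *r)
{
	int ret = munmap(r->rw, r->len);

	r->rw = NULL;
	return ret;
}

/* Drop the read-only view and the backing object. */
static void region_destroy(struct text_region *r)
{
	munmap((void *)r->ro, r->len);
	fclose(r->backing);
}
```

A caller would wrap all relocation writes for one module in a single batch_begin()/batch_end() pair, so the mapping cost is paid once per module rather than once per relocation. In the kernel the teardown would also need the appropriate TLB flush, which this sketch does not model.
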
> > Another option is to teach alternatives to update a writable copy rather
> > than do in place changes like Song suggested. My feeling is that it will be
> > more intrusive change though.
>
> You mean keeping a separate RW allocation and then text_poking() the
> whole thing in when you are done? That is what I was trying to say at
> the beginning of this thread. The other benefit is you don't make the
> intermediate loading states of the module executable.
>
> I tried this technique previously [0], and I thought it was not too
> bad. In most of the callers it looks similar to what you have in
> do_text_poke(). Sometimes less, sometimes more. It might need
> enlightening of some of the stuff currently using text_poke() during
> module loading, like jump labels. So that bit is more intrusive, yea.
> But it sounds so much cleaner and well controlled. Did you have a
> particular trouble spot in mind?

Nothing in particular, except the intrusive part. Besides the changes in
modules.c, we'd need to teach alternatives to deal with a writable copy.

> [0]
> https://lore.kernel.org/lkml/20201120202426.18009-5-rick.p.edgecombe@intel.com/

--
Sincerely yours,
Mike.
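
For comparison, the "separate RW copy, then install the whole thing at the end" approach discussed above might look roughly like this in userspace. Again this is only an analogy under stated assumptions: mprotect() stands in for the final text_poke() of the whole image, a heap buffer stands in for the RW allocation, and all names (struct staged_text, staged_init(), staged_reloc64(), staged_install()) are hypothetical. The point it illustrates is that relocations are computed against the final address but written to the staging copy, so the final location never holds a half-patched image.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical: final read-only home of the text plus an RW staging copy. */
struct staged_text {
	unsigned char *final_ro;  /* PROT_READ except during install */
	unsigned char *staging;   /* ordinary heap memory, freely writable */
	size_t len;
};

/* Reserve the zero-filled final location and the staging buffer. */
static int staged_init(struct staged_text *s, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	s->staging = calloc(1, len);
	if (!s->staging) {
		munmap(p, len);
		return -1;
	}
	s->final_ro = p;
	s->len = len;
	return 0;
}

/*
 * Apply one absolute 64-bit "relocation": the value is the FINAL address
 * of the target, but it is written into the staging copy only.
 */
static void staged_reloc64(struct staged_text *s, size_t slot_off,
			   size_t target_off)
{
	uint64_t val = (uint64_t)(uintptr_t)(s->final_ro + target_off);

	assert(target_off <= s->len);
	assert(slot_off <= s->len && sizeof(val) <= s->len - slot_off);
	memcpy(s->staging + slot_off, &val, sizeof(val));
}

/* One bulk copy into the final location, then drop the staging copy. */
static int staged_install(struct staged_text *s)
{
	if (mprotect(s->final_ro, s->len, PROT_READ | PROT_WRITE) != 0)
		return -1;
	memcpy(s->final_ro, s->staging, s->len);
	if (mprotect(s->final_ro, s->len, PROT_READ) != 0)
		return -1;
	free(s->staging);
	s->staging = NULL;
	return 0;
}
```

The cost is that every writer during loading (relocations, alternatives, jump labels) has to be taught to target the staging buffer instead of the final address, which is the intrusive part mentioned above.
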