Date: Thu, 21 Apr 2022 12:06:58 +0100
From: Catalin Marinas
To: Christoph Hellwig
Cc: Arnd Bergmann, Ard Biesheuvel, Herbert Xu, Will Deacon,
	Marc Zyngier, Greg Kroah-Hartman, Andrew Morton, Linus Torvalds,
	Linux Memory Management List, Linux ARM,
	Linux Kernel Mailing List, "David S. Miller"
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of
	ARCH_KMALLOC_MINALIGN

On Thu, Apr 21, 2022 at 12:20:22AM -0700, Christoph Hellwig wrote:
> Btw, there is another option: Most real systems already require having
> swiotlb to bounce buffer in some cases. We could simply force bounce
> buffering in the dma mapping code for too small or not properly aligned
> transfers and just decrease the dma alignment.

We can force bounce if size is small but checking the alignment is
trickier.
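The size check itself would be cheap; something like the below in the
DMA mapping path (a rough sketch only, the helper name is invented):

	/*
	 * Sketch: force a swiotlb bounce for transfers smaller than a
	 * cache line, since the buffer may share its cache line with
	 * unrelated data. DMA_TO_DEVICE is safe without bouncing: a
	 * cache clean does not corrupt the neighbouring bytes. Coherent
	 * devices never need the bounce.
	 */
	static bool dma_bounce_too_small(struct device *dev, size_t size,
					 enum dma_data_direction dir)
	{
		if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE)
			return false;
		return size < cache_line_size();
	}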
Normally the beginning of the buffer is aligned but the end is at some
sizeof() distance. We need to know whether the end is in a kmalloc-128
cache and that requires reaching out to the slab internals. That's
doable and not expensive but it needs to be done for every small size
getting to the DMA API, something like this (for mm/slub.c):

	folio = virt_to_folio(x);
	if (folio_test_slab(folio)) {	/* skip page allocator buffers */
		slab = folio_slab(folio);
		if (slab->slab_cache->align < ARCH_DMA_MINALIGN)
			... bounce ...
	}

(and a bit different for mm/slab.c)

If we scrap ARCH_DMA_MINALIGN altogether from arm64, we can check the
alignment against cache_line_size(), though I'd rather keep it for code
that wants to avoid bouncing and goes for this compile-time alignment.

I think we are down to four options (1 and 2 can be combined):

1. ARCH_DMA_MINALIGN == 128, dynamic arch_kmalloc_minalign() to reduce
   kmalloc() alignment to 64 on most arm64 SoCs - this series.

2. ARCH_DMA_MINALIGN == 128, ARCH_KMALLOC_MINALIGN == 128, add an
   explicit __GFP_PACKED for small allocations. It can be combined with
   (1) so that allocations without __GFP_PACKED can still get 64-byte
   alignment.

3. ARCH_DMA_MINALIGN == 128, ARCH_KMALLOC_MINALIGN == 8, swiotlb
   bounce.

4. undef ARCH_DMA_MINALIGN, ARCH_KMALLOC_MINALIGN == 8, swiotlb bounce.

(3) and (4) don't require histogram analysis. Between them, I have a
preference for (3) as it gives drivers a chance to avoid the bounce.

If (2) is feasible, we don't need to bother with any bouncing or
structure alignments; it's an opt-in by the driver/subsystem. However,
it may be tedious to analyse the hot spots. While there are a few
obvious places (kstrdup), I don't have access to the multitude of
devices that may exercise those drivers and subsystems.

With (3) the risk is someone complaining about performance or even
running out of swiotlb space on some SoCs (I guess the fall-back can be
another kmalloc() with an appropriate size).

I guess we can limit the choice to either (2) or (3). I have (2)
already (needs some more testing). I can attempt (3) and try to run it
on some real hardware to see the perf impact.
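For illustration, the opt-in in (2) would look roughly like this at a
call site (__GFP_PACKED being the flag proposed above, not something
already in mainline):

	/* small, CPU-only allocation: never DMA'd into, packing is safe */
	buf = kmalloc(sizeof(*buf), GFP_KERNEL | __GFP_PACKED);

-- 
Catalin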