From: Geert Uytterhoeven
Date: Tue, 13 Jul 2021 20:10:33 +0200
Subject: Re: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
To: Guenter Roeck, Akira Tsukamoto
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Linux Kernel Mailing List
References: <3e1dbea4-3b0f-de32-5447-2e23c6d4652a@gmail.com> <60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com> <20210710014915.GA149706@roeck-us.net>
In-Reply-To: <20210710014915.GA149706@roeck-us.net>

Hi Günter, Tsukamoto-san,

On Sat, Jul 10, 2021 at 3:50 AM Guenter Roeck wrote:
> On Wed, Jun 23, 2021 at 09:40:39PM +0900, Akira Tsukamoto wrote:
> > This patch will reduce cpu usage dramatically in kernel space especially
> > for application which use sys-call with large buffer size, such as network
> > applications. The main reason behind this is that every unaligned memory
> > access will raise exceptions and switch between s-mode and m-mode causing
> > large overhead.
> >
> > First copy in bytes until reaches the first word aligned boundary in
> > destination memory address. This is the preparation before the bulk
> > aligned word copy.
> >
> > The destination address is aligned now, but oftentimes the source address
> > is not in an aligned boundary. To reduce the unaligned memory access, it
> > reads the data from source in aligned boundaries, which will cause the
> > data to have an offset, and then combines the data in the next iteration
> > by fixing offset with shifting before writing to destination. The majority
> > of the improving copy speed comes from this shift copy.
> >
> > In the lucky situation that the both source and destination address are on
> > the aligned boundary, perform load and store with register size to copy the
> > data. Without the unrolling, it will reduce the speed since the next store
> > instruction for the same register using from the load will stall the
> > pipeline.
> >
> > At last, copying the remainder in one byte at a time.
> >
> > Signed-off-by: Akira Tsukamoto
>
> This patch causes all riscv32 qemu emulations to stall during boot.
> The log suggests that something in kernel/user communication may be wrong.
>
> Bad case:
>
> Starting syslogd: OK
> Starting klogd: OK
> /etc/init.d/S02sysctl: line 68: syntax error: EOF in backquote substitution
> /etc/init.d/S20urandom: line 1: syntax error: unterminated quoted string
> Starting network: /bin/sh: syntax error: unterminated quoted string

> # first bad commit: [ca6eaaa210deec0e41cbfc380bf89cf079203569] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall

Same here on vexriscv. Bisected to the same commit.

The actual scripts look fine when using "cat", but contain some garbage
when executing them using "sh -v".

Tsukamoto-san: glancing at the patch:

+       addi    a0, a0, 8*SZREG
+       addi    a1, a1, 8*SZREG

I think you forgot about rv32, where registers cover only 4
bytes each?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
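
For readers following the thread, the "shift copy" step described in the quoted patch description can be sketched in C. This is only an illustration under stated assumptions, not the assembly from the patch: `word_t`, `SZREG` as defined here, and `shift_copy_words()` are hypothetical names, the byte order is assumed to be little-endian, and the caller is assumed to guarantee that one extra source word is readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one general-purpose register:
 * 4 bytes on a 32-bit target, 8 bytes on a 64-bit one. */
typedef uintptr_t word_t;
#define SZREG sizeof(word_t)

/*
 * Copy nwords aligned words into dst from a byte stream that starts
 * `off` bytes into the aligned word array src_aligned (0 < off < SZREG).
 * Only aligned word loads and stores are issued; each output word is
 * stitched together from two neighbouring source words with shifts.
 * Assumes little-endian byte order and that src_aligned[nwords] is readable.
 */
static void shift_copy_words(word_t *dst, const word_t *src_aligned,
                             size_t off, size_t nwords)
{
    assert(off > 0 && off < SZREG);

    /* Bytes to bits is a factor of 8 whatever the register width is. */
    unsigned lo = 8 * (unsigned)off;          /* bits taken from the previous word */
    unsigned hi = 8 * (unsigned)SZREG - lo;   /* bits taken from the next word */

    word_t prev = src_aligned[0];
    for (size_t i = 0; i < nwords; i++) {
        word_t next = src_aligned[i + 1];
        /* Low part comes from the tail of prev, high part from the head of next. */
        dst[i] = (prev >> lo) | (next << hi);
        prev = next;
    }
}
```

Because the shift amounts are derived from `SZREG` rather than hard-coded, the same source builds for 4-byte and 8-byte registers. A caller would first copy single bytes until the destination is word aligned, then use a routine like this for the bulk, and finish the remainder byte by byte, as the patch description outlines.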