From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4E46C07E95 for ; Wed, 7 Jul 2021 10:07:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8E95D619A5 for ; Wed, 7 Jul 2021 10:07:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231258AbhGGKKB (ORCPT ); Wed, 7 Jul 2021 06:10:01 -0400 Received: from eu-smtp-delivery-151.mimecast.com ([207.82.80.151]:35949 "EHLO eu-smtp-delivery-151.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229949AbhGGKKA (ORCPT ); Wed, 7 Jul 2021 06:10:00 -0400 Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121]) (Using TLS) by relay.mimecast.com with ESMTP id uk-mtapsc-6-g79YA_UqNSq2doqSydWg2A-1; Wed, 07 Jul 2021 11:07:17 +0100 X-MC-Unique: g79YA_UqNSq2doqSydWg2A-1 Received: from AcuMS.Aculab.com (fd9f:af1c:a25b:0:994c:f5c2:35d6:9b65) by AcuMS.aculab.com (fd9f:af1c:a25b:0:994c:f5c2:35d6:9b65) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 7 Jul 2021 11:07:16 +0100 Received: from AcuMS.Aculab.com ([fe80::994c:f5c2:35d6:9b65]) by AcuMS.aculab.com ([fe80::994c:f5c2:35d6:9b65%12]) with mapi id 15.00.1497.018; Wed, 7 Jul 2021 11:07:16 +0100 From: David Laight To: 'Palmer Dabbelt' , "akira.tsukamoto@gmail.com" CC: Paul Walmsley , "aou@eecs.berkeley.edu" , "linux-riscv@lists.infradead.org" , "linux-kernel@vger.kernel.org" Subject: RE: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall Thread-Topic: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: 
Optimize unaligned memory access and pipeline stall Thread-Index: AQHXcr0E9zBxyMGA8EqGz08GSOYzlas3R1qQ Date: Wed, 7 Jul 2021 10:07:16 +0000 Message-ID: <7f6e56390954403fb07a1c606fbc7e6d@AcuMS.aculab.com> References: <60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com> In-Reply-To: Accept-Language: en-GB, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.202.205.107] MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=C51A453 smtp.mailfrom=david.laight@aculab.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: aculab.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: base64 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Li4uDQo+ID4gKwlmaXh1cCBSRUdfTCAgIGE0LCAgICAgICAgMChhMSksIDEwZg0KPiA+ICsJZml4 dXAgUkVHX0wgICBhNSwgICAgU1pSRUcoYTEpLCAxMGYNCj4gPiArCWZpeHVwIFJFR19MICAgYTYs ICAyKlNaUkVHKGExKSwgMTBmDQo+ID4gKwlmaXh1cCBSRUdfTCAgIGE3LCAgMypTWlJFRyhhMSks IDEwZg0KPiA+ICsJZml4dXAgUkVHX0wgICB0MSwgIDQqU1pSRUcoYTEpLCAxMGYNCj4gPiArCWZp eHVwIFJFR19MICAgdDIsICA1KlNaUkVHKGExKSwgMTBmDQo+ID4gKwlmaXh1cCBSRUdfTCAgIHQz LCAgNipTWlJFRyhhMSksIDEwZg0KPiA+ICsJZml4dXAgUkVHX0wgICB0NCwgIDcqU1pSRUcoYTEp LCAxMGYNCj4gPiArCWZpeHVwIFJFR19TICAgYTQsICAgICAgICAwKGEwKSwgMTBmDQo+ID4gKwlm aXh1cCBSRUdfUyAgIGE1LCAgICBTWlJFRyhhMCksIDEwZg0KPiA+ICsJZml4dXAgUkVHX1MgICBh NiwgIDIqU1pSRUcoYTApLCAxMGYNCj4gPiArCWZpeHVwIFJFR19TICAgYTcsICAzKlNaUkVHKGEw KSwgMTBmDQo+ID4gKwlmaXh1cCBSRUdfUyAgIHQxLCAgNCpTWlJFRyhhMCksIDEwZg0KPiA+ICsJ Zml4dXAgUkVHX1MgICB0MiwgIDUqU1pSRUcoYTApLCAxMGYNCj4gPiArCWZpeHVwIFJFR19TICAg dDMsICA2KlNaUkVHKGEwKSwgMTBmDQo+ID4gKwlmaXh1cCBSRUdfUyAgIHQ0LCAgNypTWlJFRyhh MCksIDEwZg0KPiANCj4gVGhpcyBzZWVtcyBsaWtlIGEgc3VzcGljaW91c2x5IGxhcmdlIHVucm9s bGluZyBmYWN0b3IsIGF0IGxlYXN0IHdpdGhvdXQNCj4gYSBmYWxsYmFjay4gIE15IGd1ZXNzIGlz IHRoYXQgc29tZSB3b3JrbG9hZHMgd2lsbCB3YW50IHNvbWUgc21hbGxlcg0KPiB1bnJvbGxpbmcg 
ZmFjdG9ycywgYnV0IGdpdmVuIHRoYXQgd2UgcnVuIG9uIHRoZXNlIHNpbmdsZS1pc3N1ZSBpbi1v cmRlcg0KPiBwcm9jZXNzb3JzIGl0J3MgcHJvYmFibHkgYmVzdCB0byBoYXZlIHNvbWUgYmlnIHVu cm9sbGluZyBmYWN0b3JzIGFzIHdlbGwNCj4gc2luY2UgdGhleSdyZSBwcmV0dHkgbGltaXRlZCBX UlQgaW50ZWdlciBiYW5kd2lkdGguDQoNCkJ1dCBhIHNpbmdsZS1pc3N1ZSBjcHUgaXMgdW5saWtl bHkgdG8gaGF2ZSBhbiA4IGNsb2NrIGRhdGEgZGVsYXkuDQpPVE9IIGEgY3B1IHRoYW4gY2FuIGRv IGNvbmN1cnJlbnQgbWVtb3J5IHJlYWQgYW5kIHdyaXRlIG1pZ2h0DQpub3QgaGF2ZSBlbm91Z2gg J291dCBvZiBvcmRlcicgY2FwYWJpbGl0eSB0byBkbyBzbyB3aXRoIHRoZSBhYm92ZSBsb29wLg0K DQpZb3UgbWF5IHdhbnQgdG8gaW50ZXJsZWF2ZSB0aGUgcmVhZHMgYW5kIHdyaXRlcyAtIHN0YXJ0 aW5nIHdpdGgNCnR3byBvciB0aHJlZSByZWFkcyAocG9zc2libHkgd2l0aCB0aGUgZXh0cmEgb25l cyBvdXRzaWRlIHRoZSBsb29wKS4NCg0KSSBkb24ndCBrbm93IHRoZSBtaWNyb2FyY2hpdGVjdHVy ZXMgd2VsbCBlbm91Z2ggKHdlbGwgYXQgYWxsKQ0KdG8ga25vdyB0aGUgZXhhY3QgcGl0ZmFsbHMu DQoNClRoZSB2ZXJ5IHNpbXBsZSBjcHUgbWlnaHQgaGF2ZSB0aGUgc2FtZSAnaXNzdWUnIHRoZSBO aW9zMiBoYXMNCihhbm90aGVyIE1JUFMgY2xvbmUgZnBnYSBzb2Z0IGNwdSkgd2hlcmUgdGhlcmUg Y2FuIGJlIGEgcGlwZWxpbmUNCnN0YWxsIGJldHdlZW4gYSB3cml0ZSBhbmQgcmVhZC4NCkkgZG91 YnQgdGhlIG5vbi10cml2aWFsIHJpc2N2IGhhdmUgdGhhdCBpc3N1ZSB0aG91Z2guDQoNCj4gPiAr CWFkZGkJYTAsIGEwLCA4KlNaUkVHDQo+ID4gKwlhZGRpCWExLCBhMSwgOCpTWlJFRw0KPiA+ICsJ Ymx0dQlhMCwgdDAsIDJiDQoNCkZvciBhIGR1YWwtaXNzdWUgY3B1IHlvdSB3YW50IHRvIG1vdmUg dGhlIHR3byAnYWRkaScgaGlnaGVyDQp1cCB0aGUgbG9vcCBzbyB0aGF0IHRoZXkgYXJlICdmcmVl Jy4NCg0KCURhdmlkDQoNCi0NClJlZ2lzdGVyZWQgQWRkcmVzcyBMYWtlc2lkZSwgQnJhbWxleSBS b2FkLCBNb3VudCBGYXJtLCBNaWx0b24gS2V5bmVzLCBNSzEgMVBULCBVSw0KUmVnaXN0cmF0aW9u IE5vOiAxMzk3Mzg2IChXYWxlcykNCg== From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13888C07E9B for ; Wed, 7 Jul 2021 10:37:56 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CD42D61A06 for ; Wed, 7 Jul 2021 10:37:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CD42D61A06 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=ACULAB.COM Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:In-Reply-To:References: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=MpIoxB4COOQDOCGqSI7EAFPG/5kQw21gGyXSiafivx0=; b=ZWKlb6Qgshxz5l 0MesTfxf2iRIQORLSVp5jskTsJowxmBijKM57X6LjdtferwS8eJjws1IGOBFhj8IzpxRorWwi9vo9 ukNcZpB/BLDs20QSb4djVzLCSfqgDHhKLtbT7wIS/6AM3vpvzaxOoqOLsWwzmFTUrbnNfU3MDaF5L A5bHAZOxo+CB1pmtzVAIYmc8Q2nBp/s/zRrK6YLLF79ROM3vFIqdLJCbtBZt4XBYyjjlHIpVokecd DIBwqZVf+x+rBC+cbR0vXx8YCq5CUgt5/8jol7K8Ff6yVYFHCnnk2itdCVVqbanV18VT0YW8vTgJL m7Lk2k5IMKFE4aOVP13g==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1m14vr-00EVuG-5k; Wed, 07 Jul 2021 10:37:31 +0000 Received: from eu-smtp-delivery-151.mimecast.com ([207.82.80.151]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1m14Sf-00EPBk-HB for linux-riscv@lists.infradead.org; Wed, 07 Jul 2021 10:07:23 +0000 Received: from AcuMS.aculab.com 
(156.67.243.121 [156.67.243.121]) (Using TLS) by relay.mimecast.com with ESMTP id uk-mtapsc-6-g79YA_UqNSq2doqSydWg2A-1; Wed, 07 Jul 2021 11:07:17 +0100 X-MC-Unique: g79YA_UqNSq2doqSydWg2A-1 Received: from AcuMS.Aculab.com (fd9f:af1c:a25b:0:994c:f5c2:35d6:9b65) by AcuMS.aculab.com (fd9f:af1c:a25b:0:994c:f5c2:35d6:9b65) with Microsoft SMTP Server (TLS) id 15.0.1497.18; Wed, 7 Jul 2021 11:07:16 +0100 Received: from AcuMS.Aculab.com ([fe80::994c:f5c2:35d6:9b65]) by AcuMS.aculab.com ([fe80::994c:f5c2:35d6:9b65%12]) with mapi id 15.00.1497.018; Wed, 7 Jul 2021 11:07:16 +0100 From: David Laight To: 'Palmer Dabbelt' , "akira.tsukamoto@gmail.com" CC: Paul Walmsley , "aou@eecs.berkeley.edu" , "linux-riscv@lists.infradead.org" , "linux-kernel@vger.kernel.org" Subject: RE: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall Thread-Topic: [PATCH v3 1/1] riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall Thread-Index: AQHXcr0E9zBxyMGA8EqGz08GSOYzlas3R1qQ Date: Wed, 7 Jul 2021 10:07:16 +0000 Message-ID: <7f6e56390954403fb07a1c606fbc7e6d@AcuMS.aculab.com> References: <60c1f087-1e8b-8f22-7d25-86f5f3dcee3f@gmail.com> In-Reply-To: Accept-Language: en-GB, en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.202.205.107] MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=C51A453 smtp.mailfrom=david.laight@aculab.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: aculab.com Content-Language: en-US X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210707_030721_905953_C35B9EC6 X-CRM114-Status: GOOD ( 10.85 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: 
"linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org

...
> > +	fixup REG_L   a4,        0(a1), 10f
> > +	fixup REG_L   a5,    SZREG(a1), 10f
> > +	fixup REG_L   a6,  2*SZREG(a1), 10f
> > +	fixup REG_L   a7,  3*SZREG(a1), 10f
> > +	fixup REG_L   t1,  4*SZREG(a1), 10f
> > +	fixup REG_L   t2,  5*SZREG(a1), 10f
> > +	fixup REG_L   t3,  6*SZREG(a1), 10f
> > +	fixup REG_L   t4,  7*SZREG(a1), 10f
> > +	fixup REG_S   a4,        0(a0), 10f
> > +	fixup REG_S   a5,    SZREG(a0), 10f
> > +	fixup REG_S   a6,  2*SZREG(a0), 10f
> > +	fixup REG_S   a7,  3*SZREG(a0), 10f
> > +	fixup REG_S   t1,  4*SZREG(a0), 10f
> > +	fixup REG_S   t2,  5*SZREG(a0), 10f
> > +	fixup REG_S   t3,  6*SZREG(a0), 10f
> > +	fixup REG_S   t4,  7*SZREG(a0), 10f
>
> This seems like a suspiciously large unrolling factor, at least without
> a fallback.  My guess is that some workloads will want some smaller
> unrolling factors, but given that we run on these single-issue in-order
> processors it's probably best to have some big unrolling factors as well
> since they're pretty limited WRT integer bandwidth.

But a single-issue cpu is unlikely to have an 8 clock data delay.
OTOH a cpu that can do concurrent memory read and write might
not have enough 'out of order' capability to do so with the above loop.

You may want to interleave the reads and writes - starting with
two or three reads (possibly with the extra ones outside the loop).

I don't know the microarchitectures well enough (well at all)
to know the exact pitfalls.

The very simple cpu might have the same 'issue' the Nios2 has
(another MIPS clone fpga soft cpu) where there can be a pipeline
stall between a write and read.
I doubt the non-trivial riscv have that issue though.

> > +	addi	a0, a0, 8*SZREG
> > +	addi	a1, a1, 8*SZREG
> > +	bltu	a0, t0, 2b

For a dual-issue cpu you want to move the two 'addi' higher
up the loop so that they are 'free'.
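A rough sketch of the ordering suggested above, written in C rather than
in the patch's asm so it can be compiled and checked on any machine.
Everything here is an assumption for illustration: the helper names are
made up, the unroll factor is 4 instead of 8, and a C compiler is free to
reschedule the accesses, so this only shows the intended order of reads,
writes and pointer bumps. I have not checked how the fixup path at 10f
uses a0, so bumping the pointers early in the real routine would need
looking at separately.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from the patch. Copies nwords words with a
 * 4-way unrolled loop where the loads run ahead of the stores and the
 * pointer bumps sit in the middle of the loop body. The buffers must
 * not overlap.
 */
static void copy_words_interleaved(unsigned long *restrict dst,
				   const unsigned long *restrict src,
				   size_t nwords)
{
	unsigned long r0, r1, r2, r3;

	while (nwords >= 4) {
		r0 = src[0];		/* start with three reads */
		r1 = src[1];
		r2 = src[2];
		src += 4;		/* early bump, later read uses a negative offset */
		dst[0] = r0;		/* first write while a read is still pending */
		r3 = src[-1];
		dst += 4;		/* early bump, later writes use negative offsets */
		dst[-3] = r1;
		dst[-2] = r2;
		dst[-1] = r3;
		nwords -= 4;
	}
	while (nwords--)		/* tail: fewer than 4 words left */
		*dst++ = *src++;
}

/* Self-check: returns 0 if an 11-word copy matches the source. */
static int copy_words_selftest(void)
{
	unsigned long src[11], dst[11] = { 0 };
	size_t i;

	for (i = 0; i < 11; i++)
		src[i] = 0x1000 + i;
	copy_words_interleaved(dst, src, 11);
	for (i = 0; i < 11; i++)
		assert(dst[i] == src[i]);
	return 0;
}
```

The 11-word self-check runs the unrolled body twice and the tail loop
three times, so both paths get exercised.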
	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv