From: David Laight
To: 'Matteo Croce', "linux-riscv@lists.infradead.org"
CC: "linux-kernel@vger.kernel.org", "linux-arch@vger.kernel.org", Paul Walmsley, Palmer Dabbelt, Albert Ou, Atish Patra, "Emil Renner Berthing", Akira Tsukamoto, Drew Fustini, Bin Meng
Subject: RE: [PATCH 1/3] riscv: optimized memcpy
Date: Tue, 15 Jun 2021 08:57:07 +0000
Message-ID: <6cff2a895db94e6fadd4ddffb8906a73@AcuMS.aculab.com>
References: <20210615023812.50885-1-mcroce@linux.microsoft.com> <20210615023812.50885-2-mcroce@linux.microsoft.com>
In-Reply-To: <20210615023812.50885-2-mcroce@linux.microsoft.com>

From: Matteo Croce
> Sent: 15 June 2021 03:38
>
> Write a C version of memcpy() which uses the biggest data size allowed,
> without generating unaligned accesses.

I'm surprised that the C loop:

> +	for (; count >= bytes_long; count -= bytes_long)
> +		*d.ulong++ = *s.ulong++;

ends up being faster than the ASM 'read lots' - 'write lots' loop.

Especially since there was an earlier patch to convert
copy_to/from_user() to use the ASM 'read lots' - 'write lots'
loop instead of a tight single register copy loop.

I'd also guess that the performance needs to be measured on
different classes of riscv cpu.

A simple cpu will behave differently to one that can execute
multiple instructions per clock.
Any form of 'out of order' execution also changes things.
The other big change is whether the cpu can do a memory
read and write in the same clock.

I'd guess that riscv cpus exist with some/all of those features.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
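
For illustration, the two loop shapes being compared above can be sketched in plain C. This is a hypothetical sketch, not code from the patch or from the kernel's assembly routines: the function names, the batch size of four words, and the assumption that both pointers are already word aligned are all made up for the example. Which shape is faster likely depends on the cpu class, as noted above.

```c
#include <stddef.h>

/*
 * Tight loop: one word load and one word store per iteration,
 * the same shape as the loop quoted from the patch.
 * Assumes d and s are suitably aligned for unsigned long.
 */
static void copy_words_tight(unsigned long *d, const unsigned long *s,
			     size_t words)
{
	for (; words; words--)
		*d++ = *s++;
}

/*
 * 'read lots' - 'write lots' shape: load a batch of words into
 * locals first, then store them all. The batch size of 4 is an
 * arbitrary choice for this sketch.
 */
static void copy_words_batched(unsigned long *d, const unsigned long *s,
			       size_t words)
{
	for (; words >= 4; words -= 4) {
		unsigned long w0 = s[0], w1 = s[1], w2 = s[2], w3 = s[3];

		d[0] = w0;
		d[1] = w1;
		d[2] = w2;
		d[3] = w3;
		s += 4;
		d += 4;
	}

	/* Copy the remaining 0..3 words one at a time. */
	for (; words; words--)
		*d++ = *s++;
}
```

Both functions produce the same result, so timing them against each other on a simple in-order core and on a multi-issue or out-of-order core is one way to check the guesses above. Note that an optimizing compiler may unroll or reorder either loop, so the generated assembly is worth inspecting before drawing conclusions.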