From: Matteo Croce
Date: Thu, 5 Aug 2021 12:31:04 +0200
Subject: Re: [PATCH] riscv: use the generic string routines
To: Palmer Dabbelt
Cc: linux-riscv, Linux Kernel Mailing List, linux-arch, Paul Walmsley, Albert Ou, Atish Patra, Emil Renner Berthing, Akira Tsukamoto, Drew Fustini, Bin Meng, David Laight, Guo Ren, Christoph Hellwig
X-Mailing-List: linux-arch@vger.kernel.org

On Wed, Aug 4, 2021 at 10:40 PM Palmer Dabbelt wrote:
>
> On Tue, 03 Aug 2021 09:54:34 PDT (-0700), mcroce@linux.microsoft.com wrote:
> > On Mon, Jul 19, 2021 at 1:44 PM Matteo Croce wrote:
> >>
> >> From: Matteo Croce
> >>
> >> Use the generic routines which handle alignment properly.
> >>
> >> These are the performances measured on a BeagleV machine for a
> >> 32 mbyte buffer:
> >>
> >> memcpy:
> >> original aligned:        75 Mb/s
> >> original unaligned:      75 Mb/s
> >> new aligned:            114 Mb/s
> >> new unaligned:          107 Mb/s
> >>
> >> memset:
> >> original aligned:       140 Mb/s
> >> original unaligned:     140 Mb/s
> >> new aligned:            241 Mb/s
> >> new unaligned:          241 Mb/s
> >>
> >> TCP throughput with iperf3 gives a similar improvement as well.
> >>
> >> This is the binary size increase according to bloat-o-meter:
> >>
> >> add/remove: 0/0 grow/shrink: 4/2 up/down: 432/-36 (396)
> >> Function                 old     new   delta
> >> memcpy                    36     324    +288
> >> memset                    32     148    +116
> >> strlcpy                  116     132     +16
> >> strscpy_pad               84      96     +12
> >> strlcat                  176     164     -12
> >> memmove                   76      52     -24
> >> Total: Before=1225371, After=1225767, chg +0.03%
> >>
> >> Signed-off-by: Matteo Croce
> >> Signed-off-by: Emil Renner Berthing
> >> ---
> >
> > Hi,
> >
> > can someone have a look at this change and share opinions?
>
> This LGTM. How are the generic string routines landing? I'm happy to
> take this into my for-next, but IIUC we need the optimized generic
> versions first so we don't have a performance regression falling back to
> the trivial ones for a bit. Is there a shared tag I can pull in?

Hi,

I see them only in linux-next so far.

-- 
per aspera ad upstream