From: Arnd Bergmann
Date: Thu, 7 Nov 2019 10:04:32 +0100
Subject: Re: [PATCH RFC v2 02/25] stackdepot: prevent Clang from optimizing away stackdepot_memcmp()
To: Sergey Senozhatsky
Cc: Alexander Potapenko, Vegard Nossum, Dmitry Vyukov,
    Linux Memory Management List, Al Viro, Andrew Morton, Andrey Ryabinin,
    Andy Lutomirski, Ard Biesheuvel, Christoph Hellwig, Dmitry Torokhov,
    Eric Dumazet, Eric Van Hensbergen, Greg Kroah-Hartman, Harry Wentland,
    Herbert Xu, Ingo Molnar, Jens Axboe, "Martin K . Petersen",
    Martin Schwidefsky, "Michael S. Tsirkin", Michal Simek, Petr Mladek,
    Sergey Senozhatsky, Steven Rostedt, Takashi Iwai, "Theodore Ts'o",
    Thomas Gleixner, Wolfram Sang, Vasily Gorbik, Ilya Leoshkevich,
    Mark Rutland, Matthew Wilcox, Randy Dunlap, Andrey Konovalov, Marco Elver
References: <20191030142237.249532-1-glider@google.com>
    <20191030142237.249532-3-glider@google.com>
    <20191101055033.GA226263@google.com>
    <20191107060816.GA93084@google.com>
In-Reply-To: <20191107060816.GA93084@google.com>

On Thu, Nov 7, 2019 at 7:08 AM Sergey Senozhatsky wrote:
>
> On (19/11/06 12:43), Alexander Potapenko wrote:
> > > On (19/10/30 15:22), glider@google.com wrote:
> > > > @@ -163,6 +163,11 @@ int stackdepot_memcmp(const unsigned long *u1, const unsigned long *u2,
> > > >  			unsigned int n)
> > > >  {
> > > >  	for ( ; n-- ; u1++, u2++) {
> > > > +		/*
> > > > +		 * Prevent Clang from replacing this function with a bcmp()
> > > > +		 * call.
> > > > +		 */
> > > > +		barrier();
> > > >  		if (*u1 != *u2)
> > > >  			return 1;
> > > >  	}
> > >
> > > Would 'volatile' do the trick?
> > It does. I can replace the barrier with a volatile if you think that's better.
> > However this'll add a checkpatch warning, as volatiles are discouraged
> > for synchronization (although in this case there's no synchronization)
>
> Yeah, 'volatile' in this case will do what it sort of meant to do - prevent
> compiler optimizations. So, like you said, it's not a synchronization issue
> and we don't 'volatile' data structures.

The normal way to do a volatile access would be READ_ONCE()/WRITE_ONCE(),
but that seems stronger than the barrier() here. I'd just stick to adding
a barrier.

> Do you need to do barrier() on every iteration? Does clang behave if
> you do one barrier() instead of 'n' barrier()-s?

If it does things right, it would make that a single-byte copy plus a
call to bcmp(). I certainly wouldn't want to have an implementation
that relies on the compiler making sub-optimal decisions ;-)

      Arnd
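
For anyone who wants to try the barrier-per-iteration variant outside the
kernel tree, here is a minimal userspace sketch. It is not the code from the
patch: sketch_memcmp(), sketch_selftest() and compiler_barrier() are
hypothetical names, and compiler_barrier() is assumed to stand in for the
kernel's barrier() using the empty asm with a "memory" clobber that GCC and
Clang accept. Whether a given compiler version still turns the loop into a
bcmp() call is something to check in the generated assembly; the sketch only
shows where the barrier sits.

```c
#include <assert.h>

/*
 * Assumption: userspace stand-in for the kernel's barrier().
 * Requires GCC or Clang (GNU-style inline asm).
 */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical name. Same loop shape as the quoted stackdepot_memcmp():
 * returns 0 if the first n words match, 1 otherwise.
 */
int sketch_memcmp(const unsigned long *u1, const unsigned long *u2,
		  unsigned int n)
{
	for ( ; n-- ; u1++, u2++) {
		/*
		 * The asm statement is opaque to the optimizer and sits
		 * inside the loop body, which is meant to keep the loop
		 * from being recognized as a plain memory comparison.
		 */
		compiler_barrier();
		if (*u1 != *u2)
			return 1;
	}
	return 0;
}

/* Small sanity check of the comparison semantics only. */
void sketch_selftest(void)
{
	unsigned long a[] = { 1, 2, 3 };
	unsigned long b[] = { 1, 2, 3 };
	unsigned long c[] = { 1, 2, 4 };

	assert(sketch_memcmp(a, b, 3) == 0);	/* identical */
	assert(sketch_memcmp(a, c, 3) == 1);	/* differ in last word */
	assert(sketch_memcmp(a, c, 2) == 0);	/* difference out of range */
	assert(sketch_memcmp(a, c, 0) == 0);	/* empty compare */
}
```

Building this with optimization enabled and looking at the output of
"objdump -d" for sketch_memcmp, with and without the compiler_barrier() line,
is one way to see whether a bcmp() call gets emitted.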