From: Akinobu Mita
To: Brian Norris
Cc: Joe Perches, Scott Branden, linux-kernel@vger.kernel.org,
    linux-mtd@lists.infradead.org, Jiandong Zheng, akpm@linux-foundation.org,
    David Woodhouse, eric.dumazet@gmail.com
Date: Wed, 1 Feb 2012 22:11:11 +0900
Subject: Re: [PATCH] mtd/nand: use string library
References: <1327674295-3700-1-git-send-email-akinobu.mita@gmail.com>
 <1327674295-3700-2-git-send-email-akinobu.mita@gmail.com>
 <1327684589.12089.22.camel@joe2Laptop>

2012/2/1 Brian Norris:
> On Fri, Jan 27, 2012 at 3:52 PM, Akinobu Mita wrote:
>> 2012/1/28 Brian Norris:
>>> On Fri, Jan 27, 2012 at 9:16 AM, Joe Perches wrote:
>>>> On Fri, 2012-01-27 at 23:24 +0900, Akinobu Mita wrote:
>>>>> - Use memchr_inv to check if the data contains all 0xFF bytes.
>>>>>   It is faster than looping for each byte.
>>>>
>>>> Stupid question:
>>>>
>>>> Are there any mtd devices modified that are slower
>>>> at 64 bit accesses than repeated 8 bit accesses?
>>>
>>> I believe this patch deals with kernel buffers, not any kind of direct
>>> access to the MTD, so the question (which is not stupid IMO) should be
>>> about CPU architectures. And my educated guess is that 64-bit
>>> access should not be any slower. I do know that 8-bit access *is*
>>> slower for some relevant architectures.
>>
>> It could be slower when the number of bytes scanned is very small
>> (an unmatched character is found immediately, or the area itself is
>> very small), because memchr_inv() needs to generate a 64-bit pattern
>> to compare against before starting the loop.  I recall that Eric
>> Dumazet pointed out that it could generate the 64-bit pattern more
>> efficiently. (https://lkml.org/lkml/2011/8/8/480)
>>
>> Even if scanning such small areas is slower, this change can be
>> regarded as a cleanup patch that simplifies the code.
>
> Well, I agree that it qualifies as cleanup as well, but we should at
> least make an attempt not to cause a performance regression...
>
> So by my understanding, the use of memchr_inv() is on buffers of
> minimum length 10 in this patch, so we're likely to have decent
> results. And the memcmp() usage looks fine to me.

Sorry, I answered without checking memchr_inv() carefully. If the
buffer is smaller than 16 bytes, memchr_inv() scans byte by byte,
just as the original code did. So it is unlikely to be slower in most
cases.

But as I mentioned in the previous email, there are some problems in
memchr_inv(). I'll send a patch in a few days.

> So unless other concerns arise:
>
> Acked-by: Brian Norris

Thanks.
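
P.S. For anyone skimming the thread, here is roughly what is being
discussed. Both functions below are sketches I made up for
illustration; the names buf_is_all_ff() and expand_byte() are
hypothetical and do not appear in the patch or in lib/string.c:

#include <linux/string.h>
#include <linux/types.h>

/*
 * Hypothetical helper, not a hunk from the mtd/nand patch.
 * memchr_inv(start, c, bytes) returns the address of the first byte
 * that differs from c, or NULL when every byte matches, so an
 * all-0xFF check reduces to a single call.
 */
static bool buf_is_all_ff(const u8 *buf, size_t len)
{
	return memchr_inv(buf, 0xff, len) == NULL;
}

/*
 * The "64-bit pattern" cost mentioned above: before the word-at-a-time
 * loop, the byte value is replicated into every lane of a 64-bit word,
 * roughly like this (a simplified sketch, not the exact lib/string.c
 * code). For buffers under 16 bytes this step is skipped and the scan
 * is byte-wise.
 */
static u64 expand_byte(u8 value)
{
	u64 value64 = value;

	value64 |= value64 << 8;
	value64 |= value64 << 16;
	value64 |= value64 << 32;
	return value64;
}

The cleanup value is that memchr_inv() replaces an open-coded per-byte
loop; the performance question is only about the fixed setup cost of
the pattern expansion, which does not apply to short buffers.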