From: Rasmus Villemoes
To: Alexey Dobriyan
Cc: Andrew Morton, Linux Kernel, Peter Zijlstra, Tejun Heo,
    Denis Vlasenko, KAMEZAWA Hiroyuki
Subject: Re: + lib-vsprintfc-even-faster-decimal-conversion.patch added to -mm tree
Date: Thu, 19 Mar 2015 23:15:05 +0100
Message-ID: <87k2ycprbq.fsf@rasmusvillemoes.dk>
In-Reply-To: (Alexey Dobriyan's message of "Thu, 19 Mar 2015 15:17:10 +0300")

On Thu, Mar 19 2015, Alexey Dobriyan wrote:

> On Wed, Mar 18, 2015 at 1:04 AM, Andrew Morton wrote:
>> On Mon, 16 Mar 2015 18:19:41 +0300 Alexey Dobriyan wrote:
>>
>>> Rasmus, I redid benchmarks:
>>
>> tl;dr ;) Is this an ack or a nack?
>
> New code executes slower for some input on one CPU I've benchmarked,
> both with -O2 and -Os (Core 2 Duo E6550).

Running Alexey's code on my Core 2 Duo, I can confirm that. However, my
own benchmark did show the claimed 25-50% improvement, depending on
distribution. One difference between our benchmarks is that in Alexey's
case all branches are perfectly predictable - whether that matters I
can't tell [is there a "flush branch prediction" instruction?].

Also, I found a somewhat subtle flaw in his benchmark [1] which gave the
old code a small (1-2 cycles) advantage. Fixing that and applying the
small tweak I just sent out [2], Alexey's benchmark no longer shows any
difference between the old and new code on the Core 2 Duo.

Rasmus

[1] put_dec was inlined into num_to_str in the old code - in the actual
kernel code, it is and was not, since it has another caller. I somehow
just cargo-culted the noinline_for_stack annotations all over, so it
also wasn't inlined in the benchmark of the new code.

[2] https://lkml.org/lkml/2015/3/19/802
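
To illustrate the inlining mismatch described in footnote [1], here is a
minimal userspace sketch of how a benchmark can keep the digit-emitting
helper out of line in both the old and the new variant, so the call
overhead is the same on each side. The names put_dec_sketch and
num_to_str_sketch are hypothetical stand-ins, not the kernel functions,
and the noinline_for_stack macro is redefined here on the assumption of
a GCC- or Clang-compatible compiler. This is not the code from either
benchmark in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel annotation (assumes GCC or Clang). */
#define noinline_for_stack __attribute__((noinline))

/*
 * Hypothetical stand-in for put_dec(): writes the decimal digits of q
 * in reverse order (least significant first) and returns a pointer one
 * past the last digit written. Marked noinline so that the caller pays
 * for a real function call, as it would when the helper has a second
 * caller and the compiler declines to inline it.
 */
noinline_for_stack char *put_dec_sketch(char *buf, unsigned long long q)
{
	do {
		*buf++ = (char)('0' + q % 10);
		q /= 10;
	} while (q);
	return buf;
}

/*
 * Hypothetical stand-in for num_to_str(): converts num to decimal text
 * in buf (no NUL terminator). Returns the number of characters written,
 * or 0 if the result does not fit in size bytes.
 */
size_t num_to_str_sketch(char *buf, size_t size, unsigned long long num)
{
	char tmp[21];	/* a 64-bit value has at most 20 decimal digits */
	size_t len = (size_t)(put_dec_sketch(tmp, num) - tmp);
	size_t i;

	if (len > size)
		return 0;
	/* put_dec_sketch produced the digits reversed; flip them here. */
	for (i = 0; i < len; i++)
		buf[i] = tmp[len - 1 - i];
	return len;
}
```

When comparing two implementations, the same annotation would go on the
helper in both of them; otherwise one side is measured with the call
folded away and the other is not, which is the kind of 1-2 cycle skew
mentioned above.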