From: Michael Ellerman
To: Yury Norov, linux-kernel@vger.kernel.org
Cc: Yury Norov, Alexey Dobriyan, Andrew Morton, Clement Courbet, Matthew Wilcox, Rasmus Villemoes
Subject: Re: [PATCH] lib: test module for find_*_bit() functions
In-Reply-To: <20171109140714.13168-1-ynorov@caviumnetworks.com>
References: <20171109140714.13168-1-ynorov@caviumnetworks.com>
Date: Sun, 12 Nov 2017 22:33:55 +1100
Message-ID: <87r2t3wx24.fsf@concordia.ellerman.id.au>
X-Mailing-List: linux-kernel@vger.kernel.org

Yury Norov writes:

> find_bit functions are widely used in the kernel, including hot paths.
> This module tests performance of that functions in 2 typical scenarios:
> randomly filled bitmap with relatively equal distribution of set and
> cleared bits, and sparse bitmap which has 1 set bit for 500 cleared bits.
>
> On ThunderX machine:
>
> Start testing find_bit() with random-filled bitmap
> [1032111.632383] find_next_bit: 240043 cycles, 164062 iterations
> [1032111.647236] find_next_zero_bit: 312848 cycles, 163619 iterations
> [1032111.661585] find_last_bit: 193748 cycles, 164062 iterations
> [1032113.450517] find_first_bit: 177720874 cycles, 164062 iterations
> [1032113.462930]
> Start testing find_bit() with sparse bitmap
> [1032113.477229] find_next_bit: 3633 cycles, 656 iterations
> [1032113.494281] find_next_zero_bit: 620399 cycles, 327025 iterations
> [1032113.506723] find_last_bit: 3038 cycles, 656 iterations
> [1032113.524485] find_first_bit: 691407 cycles, 656 iterations

Have you thought about timing it rather than using get_cycles()?

get_cycles() has the downside that it can't be compared across different
architectures or even platforms within an architecture.

cheers
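To illustrate what timing the loop could look like, here is a rough userspace
sketch, not the patch's code. It assumes POSIX clock_gettime() with
CLOCK_MONOTONIC as the clock; in the module itself the analogous calls would
presumably be ktime_get() and ktime_to_ns(). The helper names next_set_bit(),
now_ns() and time_next_set_bit() are made up for this example, and
next_set_bit() is only a naive stand-in for find_next_bit().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, naive stand-in for the kernel's find_next_bit():
 * returns the index of the first set bit at or after 'start',
 * or 'nbits' if there is none. */
static size_t next_set_bit(const unsigned long *map, size_t nbits, size_t start)
{
	for (size_t i = start; i < nbits; i++)
		if (map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return i;
	return nbits;
}

/* Monotonic time in nanoseconds; unlike a raw cycle count, the unit
 * is the same on every architecture and platform. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc; /* keep builds with NDEBUG free of unused warnings */
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Walk every set bit in the map, store the number of bits found in
 * '*iterations', and return the elapsed time in nanoseconds. */
static uint64_t time_next_set_bit(const unsigned long *map, size_t nbits,
				  size_t *iterations)
{
	size_t count = 0;
	uint64_t start = now_ns();

	for (size_t i = next_set_bit(map, nbits, 0); i < nbits;
	     i = next_set_bit(map, nbits, i + 1))
		count++;

	uint64_t elapsed = now_ns() - start;

	*iterations = count;
	return elapsed;
}
```

The report line would then print nanoseconds next to the iteration count
instead of cycles, which makes numbers from a ThunderX and, say, a POWER box
directly comparable.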