From: "Elliott, Robert (Persistent Memory)"
To: Matt Fleming, Andy Shevchenko
CC: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", "x86@kernel.org",
    "linux-efi@vger.kernel.org", "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH 4/4] x86/efi: print size and base in binary units in efi_print_memmap
Date: Fri, 8 Jan 2016 16:38:17 +0000
Message-ID: <94D0CD8314A33A4D9D801C0FE68B40295BF07E19@G4W3202.americas.hpqcorp.net>
References: <1450402114-3606-1-git-send-email-elliott@hpe.com>
 <1450402114-3606-5-git-send-email-elliott@hpe.com>
 <20151221161629.GG4227@codeblueprint.co.uk>
 <20160108121921.GI2532@codeblueprint.co.uk>
In-Reply-To: <20160108121921.GI2532@codeblueprint.co.uk>

> -----Original Message-----
> From: Matt Fleming [mailto:matt@codeblueprint.co.uk]
> Sent: Friday, January 8, 2016 6:19 AM
> To: Andy Shevchenko
> Cc: Elliott, Robert (Persistent Memory); Thomas Gleixner; Ingo Molnar;
> H. Peter Anvin; x86@kernel.org; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 4/4] x86/efi: print size and base in binary units in
> efi_print_memmap
>
> On Sun, 27 Dec, at 04:35:12PM, Andy Shevchenko wrote:
> > On Mon, Dec 21, 2015 at 6:16 PM, Matt Fleming wrote:
> > >> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > >> index 635a955..030ba91 100644
> > >> --- a/arch/x86/platform/efi/efi.c
> > >> +++ b/arch/x86/platform/efi/efi.c
> > >> @@ -222,6 +222,25 @@ int __init efi_memblock_x86_reserve_range(void)
> > >>  	return 0;
> > >>  }
> > >>
> > >> +char * __init efi_size_format(char *buf, size_t size, u64 bytes)
> > >> +{
> > >> +	if (!bytes || (bytes & 0x3ff))
> > >> +		snprintf(buf, size, "%llu B", bytes);
> > >> +	else if (bytes & 0xfffff)
> > >> +		snprintf(buf, size, "%llu KiB", bytes >> 10);
> > >> +	else if (bytes & 0x3fffffff)
> > >> +		snprintf(buf, size, "%llu MiB", bytes >> 20);
> > >> +	else if (bytes & 0xffffffffff)
> > >> +		snprintf(buf, size, "%llu GiB", bytes >> 30);
> > >> +	else if (bytes & 0x3ffffffffffff)
> > >> +		snprintf(buf, size, "%llu TiB", bytes >> 40);
> > >> +	else if (bytes & 0xfffffffffffffff)
> > >> +		snprintf(buf, size, "%llu PiB", bytes >> 50);
> > >> +	else
> > >> +		snprintf(buf, size, "%llu EiB", bytes >> 60);
> > >> +	return buf;
> >
> > For me it looks like ffs with name in the table can be used.
>
> Could you provide a patch?

I think this is close to functionally equivalent:

#include <string.h>	/* ffsl() */

char *efi_size_format_ffsl(char *buf, size_t size, u64 bytes)
{
	if (!bytes || ffsl(bytes) < 10)
		snprintf(buf, size, "%llu B", bytes);
	else if (ffsl(bytes) < 20)
		snprintf(buf, size, "%llu KiB", bytes >> 10);
	else if (ffsl(bytes) < 30)
		snprintf(buf, size, "%llu MiB", bytes >> 20);
	else if (ffsl(bytes) < 40)
		snprintf(buf, size, "%llu GiB", bytes >> 30);
	else if (ffsl(bytes) < 50)
		snprintf(buf, size, "%llu TiB", bytes >> 40);
	else if (ffsl(bytes) < 60)
		snprintf(buf, size, "%llu PiB", bytes >> 50);
	else
		snprintf(buf, size, "%llu EiB", bytes >> 60);
	return buf;
}

One caveat: ffsl() returns a 1-based bit index, so a value whose lowest
set bit is bit 9, 19, ... (for example 512) takes a different branch here
than in the mask version. Comparing with <= 10, <= 20, ... would match
the masks exactly.

Compiled as a user program with gcc -O2, the original results in
mov and test instructions:

	movq	%rdi, %rbx
	je	.L2
	testl	$1023, %edx
	jne	.L2
	testl	$1048575, %edx
	jne	.L15
	testl	$1073741823, %edx
	jne	.L16
	movabsq	$1099511627775, %rax
	testq	%rax, %rdx
	jne	.L17
	movabsq	$1125899906842623, %rax
	testq	%rax, %rdx
	jne	.L18
	movabsq	$1152921504606846975, %rax
	movq	%rdx, %rcx
	testq	%rax, %rdx
	jne	.L19

while the ffs version uses bit scan forward (bsfq) and only needs
cmpl instructions, since the values are smaller:

	movq	%rdi, %rbx
	je	.L21
	bsfq	%rdx, %rcx
	addq	$1, %rcx
	cmpl	$9, %ecx
	jle	.L21
	cmpl	$19, %ecx
	jle	.L33
	cmpl	$29, %ecx
	.p2align 4,,2
	jle	.L34
	cmpl	$39, %ecx
	.p2align 4,,2
	jle	.L35
	cmpl	$49, %ecx
	.p2align 4,,2
	jle	.L36
	cmpl	$59, %ecx
	.p2align 4,,2
	jle	.L37

The kernel offers ffs(int x) but not ffsl(), and it uses inline
assembly for one of these:
	bsfl
	bsfl, cmovzl
	bsfl, jnz, movl

I don't know which code is the most efficient.

---
Robert Elliott, HPE Persistent Memory
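
For reference, the "ffs with name in the table" idea could look roughly
like the user-space sketch below. It is not a kernel patch: the names
size_format_table, unit_names and self_check are made up for illustration,
and it assumes the GCC/Clang builtin __builtin_ctzll() in place of any
kernel helper. That builtin returns the zero-based index of the lowest set
bit, so dividing it by 10 selects the unit directly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table of binary unit names, indexed by (lowest set bit / 10). */
static const char *const unit_names[] = {
	"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB",
};

/*
 * Format bytes using the largest binary unit that divides it exactly.
 * __builtin_ctzll() is undefined for 0, so 0 is left at index 0 ("B").
 * The index cannot exceed 63 / 10 = 6, which is the last table entry.
 */
static char *size_format_table(char *buf, size_t size, uint64_t bytes)
{
	unsigned int idx = 0;

	if (bytes)
		idx = (unsigned int)__builtin_ctzll(bytes) / 10;

	snprintf(buf, size, "%llu %s",
		 (unsigned long long)(bytes >> (10 * idx)), unit_names[idx]);
	return buf;
}

/* Spot checks, including values whose lowest set bit sits just below a unit boundary. */
static void self_check(void)
{
	char buf[32];

	assert(strcmp(size_format_table(buf, sizeof(buf), 0), "0 B") == 0);
	assert(strcmp(size_format_table(buf, sizeof(buf), 512), "512 B") == 0);
	assert(strcmp(size_format_table(buf, sizeof(buf), 1024), "1 KiB") == 0);
	assert(strcmp(size_format_table(buf, sizeof(buf), 3ULL << 20), "3 MiB") == 0);
	assert(strcmp(size_format_table(buf, sizeof(buf), 1ULL << 63), "8 EiB") == 0);
}
```

This trades the chain of comparisons for one bit scan, one divide by a
constant and a table lookup. Whether that is actually faster than either
listing above would need measuring.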