From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754460AbcFPNMO (ORCPT ); Thu, 16 Jun 2016 09:12:14 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:34161 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754033AbcFPNMJ (ORCPT ); Thu, 16 Jun 2016 09:12:09 -0400 X-IBM-Helo: d23dlp01.au.ibm.com X-IBM-MailFrom: maddy@linux.vnet.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org Subject: Re: [PATCH] tools/perf: fix the word selected in find_*_bit To: Yury Norov References: <1465990973-31483-1-git-send-email-maddy@linux.vnet.ibm.com> <20160615195127.GA6039@yury-N73SV> <20160615211104.GA6978@yury-N73SV> Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, Arnaldo Carvalho de Melo , Adrian Hunter , Borislav Petkov , David Ahern , George Spelvin , Jiri Olsa , Namhyung Kim , Rasmus Villemoes , Wang Nan , Yury Norov , Michael Ellerman From: Madhavan Srinivasan Date: Thu, 16 Jun 2016 18:41:50 +0530 User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1 MIME-Version: 1.0 In-Reply-To: <20160615211104.GA6978@yury-N73SV> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16061613-0012-0000-0000-000001A00A7C X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16061613-0013-0000-0000-00000569E254 Message-Id: <632e6ce5-afc6-5507-ef91-fddf727992ff@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-06-16_06:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000 definitions=main-1606160147 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thursday 16 June 2016 02:41 AM, Yury Norov wrote: > On Wed, Jun 15, 2016 at 10:51:27PM +0300, Yury Norov wrote: >> Hi Madhavan, >> >> On Wed, Jun 15, 2016 at 05:12:53PM +0530, Madhavan Srinivasan wrote: >>> When decoding the perf_regs mask in regs_dump__printf(), >>> we loop through the mask using find_first_bit and find_next_bit functions. >>> And mask is of type "u64". But "u64" is send as a "unsigned long *" to >>> lib functions along with sizeof(). >>> >>> While the exisitng code works fine in most of the case, when using a 32bit perf >>> on a 64bit kernel (Big Endian), we end reading the wrong word. In find_first_bit(), >>> one word at a time (based on BITS_PER_LONG) is loaded and >>> checked for any bit set. In 32bit BE userspace, >>> BITS_PER_LONG turns out to be 32, and for a mask value of >>> "0x00000000000000ff", find_first_bit will return 32, instead of 0. >>> Reason for this is that, value in the word0 is all zeros and value >>> in word1 is 0xff. Ideally, second word in the mask should be loaded >>> and searched. Patch swaps the word to look incase of 32bit BE. >> I think this is not a problem of find_bit() at all. You have wrong >> typecast as the source of problem (tools/perf/util/session.c"): >> >> 940 static void regs_dump__printf(u64 mask, u64 *regs) >> 941 { >> 942 unsigned rid, i = 0; >> 943 >> 944 for_each_set_bit(rid, (unsigned long *) &mask, sizeof(mask) * 8) { >> ^^^^ Here ^^^^ >> 945 u64 val = regs[i++]; >> 946 >> 947 printf(".... %-5s 0x%" PRIx64 "\n", >> 948 perf_reg_name(rid), val); >> 949 } >> 950 } >> >> But for some reason you change correct find_bit()... >> >> Though proper fix is like this for me: >> >> static void regs_dump__printf(u64 mask, u64 *regs) >> { >> unsigned rid, i = 0; >> unsigned long _mask[sizeof(mask)/sizeof(unsigned long)]; >> >> _mask[0] = mask & ULONG_MAX; >> if (sizeof(mask) > sizeof(unsigned long)) >> _mask[1] = mask >> BITS_PER_LONG; >> >> for_each_set_bit(rid, _mask, sizeof(mask) * BITS_PER_BYTE) { >> u64 val = regs[i++]; >> >> printf(".... %-5s 0x%" PRIx64 "\n", >> perf_reg_name(rid), val); >> } >> } >> >> Maybe there already is some macro doing the conversion for you... > yes it is, cpu_to_le64() is what you want no wait, on second look, cpu_to_le64() is not right. Because we will end up swapping within 32bit. But what you suggested looks to be fine. I can repost this with one minor tweak, right shift with 32 instead of BITS_PER_LONG (since I see compiler errors in 64bit). Maddy > >> Yury. >> >>> Cc: Arnaldo Carvalho de Melo >>> Cc: Adrian Hunter >>> Cc: Borislav Petkov >>> Cc: David Ahern >>> Cc: George Spelvin >>> Cc: Jiri Olsa >>> Cc: Namhyung Kim >>> Cc: Rasmus Villemoes >>> Cc: Wang Nan >>> Cc: Yury Norov >>> Cc: Michael Ellerman >>> Signed-off-by: Madhavan Srinivasan >>> --- >>> tools/lib/find_bit.c | 17 +++++++++++++++++ >>> 1 file changed, 17 insertions(+) >>> >>> diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c >>> index 9122a9e80046..996b3e04324f 100644 >>> --- a/tools/lib/find_bit.c >>> +++ b/tools/lib/find_bit.c >>> @@ -37,7 +37,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >>> if (!nbits || start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> >>> /* Handle 1st word. */ >>> tmp &= BITMAP_FIRST_WORD_MASK(start); >>> @@ -48,7 +53,12 @@ static unsigned long _find_next_bit(const unsigned long *addr, >>> if (start >= nbits) >>> return nbits; >>> >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + tmp = addr[(((nbits - 1)/BITS_PER_LONG) - (start / BITS_PER_LONG))] >>> + ^ invert; >>> +#else >>> tmp = addr[start / BITS_PER_LONG] ^ invert; >>> +#endif >>> } >>> >>> return min(start + __ffs(tmp), nbits); >>> @@ -75,8 +85,15 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size) >>> unsigned long idx; >>> >>> for (idx = 0; idx * BITS_PER_LONG < size; idx++) { >>> +#if (__BYTE_ORDER == __BIG_ENDIAN) && (BITS_PER_LONG != 64) >>> + if (addr[(((size-1)/BITS_PER_LONG) - idx)]) >>> + return min(idx * BITS_PER_LONG + >>> + __ffs(addr[(((size-1)/BITS_PER_LONG) - idx)]), >>> + size); >>> +#else >>> if (addr[idx]) >>> return min(idx * BITS_PER_LONG + __ffs(addr[idx]), size); >>> +#endif >>> } >>> >>> return size; >>> -- >>> 1.9.1