From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751466AbdJSCLC (ORCPT ); Wed, 18 Oct 2017 22:11:02 -0400
Received: from LGEAMRELO11.lge.com ([156.147.23.51]:47855 "EHLO
	lgeamrelo11.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751217AbdJSCLA (ORCPT );
	Wed, 18 Oct 2017 22:11:00 -0400
X-Original-SENDERIP: 156.147.1.151
X-Original-MAILFROM: iamjoonsoo.kim@lge.com
X-Original-SENDERIP: 10.177.222.138
X-Original-MAILFROM: iamjoonsoo.kim@lge.com
Date: Thu, 19 Oct 2017 11:14:38 +0900
From: Joonsoo Kim
To: Thomas Gleixner
Cc: Linus Torvalds, Josh Poimboeuf, kernel test robot, Ingo Molnar,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	"H. Peter Anvin", Jiri Slaby, Mike Galbraith, Peter Zijlstra,
	LKML, LKP, linux-mm, Pekka Enberg, David Rientjes, Andrew Morton,
	Christoph Lameter
Subject: Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel
Message-ID: <20171019021437.GA3662@js1304-P5Q-DELUXE>
References: <20171010121513.GC5445@yexl-desktop>
	<20171011023106.izaulhwjcoam55jt@treble>
	<20171011170120.7flnk6r77dords7a@treble>
	<20171017073326.GA23865@js1304-P5Q-DELUXE>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 18, 2017 at 03:15:03PM +0200, Thomas Gleixner wrote:
> On Wed, 18 Oct 2017, Linus Torvalds wrote:
> > On Tue, Oct 17, 2017 at 3:33 AM, Joonsoo Kim wrote:
> > >
> > > It looks like a compiler bug. The code of slob_units() try to read two
> > > bytes at ffff88001c4afffe. It's valid. But the compiler generates
> > > wrong code that try to read four bytes.
> > >
> > > static slobidx_t slob_units(slob_t *s)
> > > {
> > >         if (s->units > 0)
> > >                 return s->units;
> > >         return 1;
> > > }
> > >
> > > s->units is defined as two bytes in this setup.
> > >
> > > Wrongly generated code for this part.
> > >
> > > 'mov 0x0(%rbp), %ebp'
> > >
> > > %ebp is four bytes.
> > >
> > > I guess that this wrong four bytes read cross over the valid memory
> > > boundary and this issue happend.
> >
> > Hmm. I can see why the compiler would do that (16-bit accesses are
> > slow), but it's definitely wrong.
> >
> > Does it work ok if that slob_units() code is written as
> >
> >   static slobidx_t slob_units(slob_t *s)
> >   {
> >      int units = READ_ONCE(s->units);
> >
> >      if (units > 0)
> >          return units;
> >      return 1;
> >   }
> >
> > which might be an acceptable workaround for now?
>
> Discussed exactly that with Peter Zijlstra yesterday, but we came to the
> conclusion that this is a whack a mole game. It might fix this slob issue,
> but what guarantees that we don't have the same problem in some other
> place? Just duct taping this particular instance makes me nervous.

I have checked that the above patch works fine, but I agree with Thomas.

> Joonsoo says:
>
> > gcc 4.8 and 4.9 fails to generate proper code. gcc 5.1 and
> > the latest version works fine.
>
> > I guess that this problem is related to the corner case of some
> > optimization feature since minor code change makes the result
> > different. And, with -O2, proper code is generated even if gcc 4.8 is
> > used.
>
> So it would be useful to figure out which optimization bit is causing that
> and blacklist it for the affected compiler versions.

I have tried that but could not find any clue. I compiled with -O2 and
then disabled individual options until the option list matched that of
-Os; the gcc man page gives a rough guide to the difference between the
two. However, I could not reproduce the issue this way.

Thanks.