From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S967039AbcCPUVA (ORCPT ); Wed, 16 Mar 2016 16:21:00 -0400
Received: from mx2.suse.de ([195.135.220.15]:59897 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S964807AbcCPUU4 (ORCPT ); Wed, 16 Mar 2016 16:20:56 -0400
Date: Wed, 16 Mar 2016 21:20:53 +0100
From: "Luis R. Rodriguez"
To: "Luis R. Rodriguez"
Cc: Yinghai Lu, Stuart Hayes, Prarit Bhargava, Thomas Gleixner,
        Ingo Molnar, "H. Peter Anvin", Linux Kernel Mailing List,
        the arch/x86 maintainers, Toshi Kani
Subject: Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
Message-ID: <20160316202053.GO1990@wotan.suse.de>
References: <55E477DE.2060106@gmail.com> <55E47B4D.1050103@gmail.com>
        <20150903024542.GS8051@wotan.suse.de> <55E83A3E.3030000@redhat.com>
        <55F6DDDF.70909@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 05, 2015 at 11:43:59AM -0800, Luis R. Rodriguez wrote:
> On Thu, Nov 5, 2015 at 11:14 AM, Yinghai Lu wrote:
> > On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes wrote:
> >>
> >> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
> >> isn't actually failing--it just gets ugly error messages. And the BIOS on the
> >> system I am working with had set up the MTRRs correctly.
> >
> > Please post boot log and /proc/mtrr for:
> > 1. without your patch
> > 2. without your patch and with disable_mtrr_cleanup in boot command line.
> > 3. with your patch.
>
> Stuart,
>
> to provide some context -- I reached out to Yinghai as he wrote the
> original mtrr cleanup code. The commit logs seem to read that a crash
> was possible on systems with > 4 GiB RAM with some types of BIOSes...
>
> The cleanup code seems to trigger when variable MTRRs do not exist
> that are UC, or when all variable MTRRs that exist are just UC + WB
> (Yinghai correct me if I'm wrong). The commit log in question
> (95ffa2438d0e9 "x86: mtrr cleanup for converting continuous to
> discrete layout, v8") was not very clear about the cause of the crash
> -- but I suppose the issue here was the BIOS on some systems might want
> to create some UC variable MTRRs early on and there were no UC MTRRs
> available, and I can only guess the cleanup exists as a hack for those
> BIOSes. Even if that was the case -- it's still not clear *why* the
> crash would happen but I suppose a driver mishap can happen without UC
> guarantees for some devices the BIOS may want to enable UC MTRR on.
>
> To be able to determine what we do upstream we need to understand the
> above first. We also need to understand if the cleanup might also be
> implicated by userspace drivers using /proc/mtrr, or if a proprietary
> driver exists that does use mtrr_add() directly even though PAT has
> been available for ages and all drivers are now properly converted.
>
> With clear answers to the above we'll be able to determine what the
> right course of action should be for this patch. For instance I'm
> inclined to strive to disable the complex cleanup code if we don't
> need it anymore, but if we do need it your patch makes sense. If the
> patch makes sense, though, are we going to have to keep updating
> the segment size *every time* as systems grow? That seems rather
> silly. And if PAT is prevalent why are vendors still adding MTRRs? The
> cleanup seems complex and a major hack for a fix for some BIOSes; I'd
> much rather identify the exact issue and only have a fix to address
> that case.

I never heard back... so let's take this up on the other thread I just raised.

  Luis