From: Linus Torvalds
To: Borislav Petkov
Cc: "Luis R. Rodriguez", Toshi Kani, Peter Zijlstra, Ingo Molnar,
    Peter Anvin, Denys Vlasenko, Borislav Petkov, Andrew Morton,
    Brian Gerst, Thomas Gleixner, linux-mm, Andy Lutomirski,
    Linux Kernel Mailing List, "linux-tip-commits@vger.kernel.org"
Date: Sat, 1 Aug 2015 10:03:16 -0700
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
In-Reply-To: <20150801164910.GA15407@nazgul.tnic>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com>
    <1432628901-18044-6-git-send-email-bp@alien8.de>
    <20150731131802.GW25159@twins.programming.kicks-ass.net>
    <20150731144452.GA8106@nazgul.tnic>
    <20150731150806.GX25159@twins.programming.kicks-ass.net>
    <20150731152713.GA9756@nazgul.tnic>
    <20150801142820.GU30479@wotan.suse.de>
    <20150801163311.GA15356@nazgul.tnic>
    <20150801164910.GA15407@nazgul.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Aug 1, 2015 at 9:49 AM, Borislav Petkov wrote:
>
> My simplistic mental picture while thinking of this is the IO range
> where you send the commands to the device and you don't really want to
> delay those but they should reach the device as they get issued.

Well, even for command streams, people often do go for a
write-combining approach, simply because it is *so* much more
efficient on the bus to buffer and burst things.
The interface is set up to not really "combine" things in the
over-writing sense, but just in the "combine continuous writes into
bigger buffers on the CPU, and then write it out as efficiently as
possible" sense.

Of course, the device (and the driver) has to be designed properly for
that, and it makes sense only with certain kinds of models, but it can
actually be much more efficient to make the device interface be
something like "write 32-byte command packets to a circular
write-combining buffer" than it is to do things other ways. Back in
the day, that was one of the most efficient ways to try to fill up
the PCI bandwidth.

There are other approaches too, of course, with the modern variation
tending to be "the device does all real accesses by reading over DMA,
and the only time you use IO accesses is for setup and as a 'start
your DMA transfers now' kind of interface".

But write-combining MMIO used to be a very common model for
high-performance IO not that long ago, because DMA didn't actually use
to be all that efficient at all (nasty behavior with caches and
snooping etc back before the memory controller was on-die and DMA
accesses snooped caches directly). So the "DMA is efficient even for
smaller things" thing is relatively recent.

                Linus
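
As an illustration of the "write 32-byte command packets to a circular
write-combining buffer" model described in the mail, here is a minimal
userspace sketch. It is not taken from any real driver. The names
cmd_packet, cmd_ring, ring_push, ring_kick and device_consume, the
packet layout and the 8-slot ring size are all hypothetical. A plain
byte array stands in for the device window; in a real Linux driver that
window would presumably be a write-combining mapping (for example one
obtained with ioremap_wc()), with a write barrier before the doorbell.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CMD_SIZE   32u  /* packet size named in the mail */
#define RING_SLOTS 8u   /* hypothetical ring depth; must be a power of two */

/* Hypothetical packet layout: one opcode word plus seven argument words. */
struct cmd_packet {
    uint32_t opcode;
    uint32_t args[7];
};
_Static_assert(sizeof(struct cmd_packet) == CMD_SIZE,
               "command packet must be exactly 32 bytes");

struct cmd_ring {
    /* Stand-in for the write-combining MMIO window (assumption). */
    uint8_t  window[RING_SLOTS * CMD_SIZE];
    uint32_t head;      /* free-running count of packets written by the CPU */
    uint32_t tail;      /* free-running count of packets consumed by the device */
    uint32_t doorbell;  /* stand-in for an uncached doorbell register */
};

static bool ring_full(const struct cmd_ring *r)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    return r->head - r->tail == RING_SLOTS;
}

/*
 * Copy one whole packet into the next slot. Writing the full 32 bytes
 * contiguously is what gives the CPU the chance to merge the stores
 * into a single burst on a real write-combining mapping.
 */
static bool ring_push(struct cmd_ring *r, const struct cmd_packet *pkt)
{
    assert(r != NULL && pkt != NULL);
    if (ring_full(r))
        return false;
    uint32_t slot = r->head % RING_SLOTS;
    memcpy(&r->window[slot * CMD_SIZE], pkt, CMD_SIZE);
    r->head++;
    return true;
}

/*
 * Tell the device how far it may read. A real driver would need a
 * write barrier here so that buffered packet data reaches the device
 * before the doorbell write does.
 */
static void ring_kick(struct cmd_ring *r)
{
    r->doorbell = r->head;
}

/* Simulated device side: consume everything up to the doorbell value. */
static uint32_t device_consume(struct cmd_ring *r)
{
    uint32_t consumed = 0;
    while (r->tail != r->doorbell) {
        r->tail++;
        consumed++;
    }
    return consumed;
}
```

A caller would fill the ring with ring_push(), call ring_kick() once per
batch rather than once per packet, and retry a failed push after the
device has advanced its tail. Batching the doorbell is the point of the
design: the packet data goes out in bursts, and only the kick is a
strictly ordered access.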