* Re: [Bug 10328] New: [regression] performance drop for glx
[not found] <bug-10328-27@http.bugzilla.kernel.org/>
@ 2008-03-25 22:28 ` Andrew Morton
2008-03-26 0:39 ` Suresh Siddha
0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2008-03-25 22:28 UTC (permalink / raw)
To: Siddha, Suresh B, Ingo Molnar, Thomas Gleixner
Cc: bugme-daemon, linux-kernel, Rafael J. Wysocki
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=10328
>
> Summary: [regression] performance drop for glx
> Product: Memory Management
> Version: 2.5
> KernelVersion: 2.6.25-rc6
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: akpm@osdl.org
> ReportedBy: bug-track@fisher-privat.net
>
>
> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
> related perfomance issues.
>
> I used glxgears for test.
> before this patch: 1281.005 FPS
> and after: 765.000 FPS
It nearly halved.
> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> still hase bad performance.
>
> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
>
That's
: commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
: Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
: Date: Wed Jan 30 13:33:43 2008 +0100
:
: x86: set strong uncacheable where UC is really desired
:
: Also use _PAGE_PWT for all the mappings which need uncache mapping.
: Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
: we now use PAT3 which is strong uncacheable.
:
: This makes it consistent with pgprot_noncached()
:
: Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
: Signed-off-by: Ingo Molnar <mingo@elte.hu>
: Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
:
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-03-25 22:28 ` [Bug 10328] New: [regression] performance drop for glx Andrew Morton
@ 2008-03-26 0:39 ` Suresh Siddha
2008-03-26 4:42 ` Arjan van de Ven
2008-03-26 5:29 ` Ingo Molnar
0 siblings, 2 replies; 9+ messages in thread
From: Suresh Siddha @ 2008-03-26 0:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Siddha, Suresh B, Ingo Molnar, Thomas Gleixner, bugme-daemon,
linux-kernel, Rafael J. Wysocki, venkatesh.pallipadi, arjan,
bug-track
On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
>
> > http://bugzilla.kernel.org/show_bug.cgi?id=10328
> >
> > Summary: [regression] performance drop for glx
> >
> > after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
> > related perfomance issues.
> >
> > I used glxgears for test.
> > before this patch: 1281.005 FPS
> > and after: 765.000 FPS
>
> It nearly halved.
>
> > latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> > still hase bad performance.
> >
> > I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
> >
>
> That's
>
> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
> : Date: Wed Jan 30 13:33:43 2008 +0100
> :
> : x86: set strong uncacheable where UC is really desired
> :
> : Also use _PAGE_PWT for all the mappings which need uncache mapping.
> : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
> : we now use PAT3 which is strong uncacheable.
> :
> : This makes it consistent with pgprot_noncached()
Alexey, Can you please try the appended patch?
Andrew, can you please push the appended patch for 2.6.25? Thanks.
---
fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.
The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
fb drivers to use this new ioremap_wc() API.
We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
---
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 4afaba0..794895c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
switch (mode) {
case IOR_MODE_UNCACHED:
default:
- prot = PAGE_KERNEL_NOCACHE;
+ /*
+ * FIXME: we will use UC MINUS for now, as video fb drivers
+ * depend on it. Upcoming ioremap_wc() will fix this behavior.
+ */
+ prot = PAGE_KERNEL_UC_MINUS;
break;
case IOR_MODE_CACHED:
prot = PAGE_KERNEL;
diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 174b877..9cf472a 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
#define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_UC_MINUS (__PAGE_KERNEL | _PAGE_PCD)
#define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
@@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
#define PAGE_KERNEL_RX MAKE_GLOBAL(__PAGE_KERNEL_RX)
#define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_UC_MINUS MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
#define PAGE_KERNEL_EXEC_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
#define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
#define PAGE_KERNEL_LARGE_EXEC MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-03-26 0:39 ` Suresh Siddha
@ 2008-03-26 4:42 ` Arjan van de Ven
2008-04-25 5:35 ` Alexey Fisher
2008-03-26 5:29 ` Ingo Molnar
1 sibling, 1 reply; 9+ messages in thread
From: Arjan van de Ven @ 2008-03-26 4:42 UTC (permalink / raw)
To: Suresh Siddha
Cc: Andrew Morton, Ingo Molnar, Thomas Gleixner, bugme-daemon,
linux-kernel, Rafael J. Wysocki, venkatesh.pallipadi, bug-track
Suresh Siddha wrote:
> On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
>> (switched to email. Please respond via emailed reply-to-all, not via the
>> bugzilla web interface).
>>
>> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
>> bugme-daemon@bugzilla.kernel.org wrote:
>>
>>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
>>>
>>> Summary: [regression] performance drop for glx
>>>
>>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
>>> related perfomance issues.
>>>
>>> I used glxgears for test.
>>> before this patch: 1281.005 FPS
>>> and after: 765.000 FPS
>> It nearly halved.
>>
>>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
>>> still hase bad performance.
>>>
>>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
>>>
>> That's
>>
>> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
>> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
>> : Date: Wed Jan 30 13:33:43 2008 +0100
>> :
>> : x86: set strong uncacheable where UC is really desired
>> :
>> : Also use _PAGE_PWT for all the mappings which need uncache mapping.
>> : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
>> : we now use PAT3 which is strong uncacheable.
>> :
>> : This makes it consistent with pgprot_noncached()
>
> Alexey, Can you please try the appended patch?
>
> Andrew, can you please push the appended patch for 2.6.25? Thanks.
> ---
>
> fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
> WC attribute. Recent changes in page attribute code made both
> ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
> breaks the graphics performance, as the effective memory type is UC instead
> of expected WC.
>
> The correct way to fix this is to add ioremap_wc() (which uses UC- in the
> absence of PAT kernel support and WC with PAT) and change all the
> fb drivers to use this new ioremap_wc() API.
>
> We can take this correct and longer route for post 2.6.25. For now,
> revert back to the UC- behavior for ioremap/ioremap_nocache.
I would still like to add an ioremap_wc() even in 2.6.25; even if it's for now
identical to ioremap_nocache(). Better get the right API in place as soon as possible
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-03-26 0:39 ` Suresh Siddha
2008-03-26 4:42 ` Arjan van de Ven
@ 2008-03-26 5:29 ` Ingo Molnar
2008-03-26 17:55 ` Suresh Siddha
1 sibling, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2008-03-26 5:29 UTC (permalink / raw)
To: Suresh Siddha
Cc: Andrew Morton, Thomas Gleixner, bugme-daemon, linux-kernel,
Rafael J. Wysocki, venkatesh.pallipadi, arjan, bug-track
* Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add
> with WC attribute. Recent changes in page attribute code made both
> ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-).
> This breaks the graphics performance, as the effective memory type is
> UC instead of expected WC.
>
> The correct way to fix this is to add ioremap_wc() (which uses UC- in
> the absence of PAT kernel support and WC with PAT) and change all the
> fb drivers to use this new ioremap_wc() API.
>
> We can take this correct and longer route for post 2.6.25. For now,
> revert back to the UC- behavior for ioremap/ioremap_nocache.
thanks Suresh, applied.
Ingo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-03-26 5:29 ` Ingo Molnar
@ 2008-03-26 17:55 ` Suresh Siddha
0 siblings, 0 replies; 9+ messages in thread
From: Suresh Siddha @ 2008-03-26 17:55 UTC (permalink / raw)
To: Ingo Molnar
Cc: Suresh Siddha, Andrew Morton, Thomas Gleixner, bugme-daemon,
linux-kernel, Rafael J. Wysocki, venkatesh.pallipadi, arjan,
bug-track
On Wed, Mar 26, 2008 at 06:29:11AM +0100, Ingo Molnar wrote:
>
> * Suresh Siddha <suresh.b.siddha@intel.com> wrote:
>
> > fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add
> > with WC attribute. Recent changes in page attribute code made both
> > ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-).
> > This breaks the graphics performance, as the effective memory type is
> > UC instead of expected WC.
> >
> > The correct way to fix this is to add ioremap_wc() (which uses UC- in
> > the absence of PAT kernel support and WC with PAT) and change all the
> > fb drivers to use this new ioremap_wc() API.
> >
> > We can take this correct and longer route for post 2.6.25. For now,
> > revert back to the UC- behavior for ioremap/ioremap_nocache.
>
> thanks Suresh, applied.
Well, we need to take care of set_memory_uc() aswell, as the previous version
didn't fix Alexey's issue.
Alexey, can you please test and ack this, before Andrew/Ingo can push relevant
bits to their trees. Thanks.
---
fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with
WC attribute. Recent changes in page attribute code made both
ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This
breaks the graphics performance, as the effective memory type is UC instead
of expected WC.
The correct way to fix this is to add ioremap_wc() (which uses UC- in the
absence of PAT kernel support and WC with PAT) and change all the
drivers to use this new ioremap_wc() API.
We can take this correct and longer route for post 2.6.25. For now,
revert back to the UC- behavior for ioremap/ioremap_nocache.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Alexey Fisher <bug-track@fisher-privat.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 4afaba0..794895c 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -137,7 +137,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
switch (mode) {
case IOR_MODE_UNCACHED:
default:
- prot = PAGE_KERNEL_NOCACHE;
+ /*
+ * FIXME: we will use UC MINUS for now, as video fb drivers
+ * depend on it. Upcoming ioremap_wc() will fix this behavior.
+ */
+ prot = PAGE_KERNEL_UC_MINUS;
break;
case IOR_MODE_CACHED:
prot = PAGE_KERNEL;
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 14e48b5..7b79f6b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -771,7 +771,7 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,
int set_memory_uc(unsigned long addr, int numpages)
{
return change_page_attr_set(addr, numpages,
- __pgprot(_PAGE_PCD | _PAGE_PWT));
+ __pgprot(_PAGE_PCD));
}
EXPORT_SYMBOL(set_memory_uc);
diff --git a/include/asm-x86/pgtable.h b/include/asm-x86/pgtable.h
index 174b877..9cf472a 100644
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -85,6 +85,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define __PAGE_KERNEL_RX (__PAGE_KERNEL_EXEC & ~_PAGE_RW)
#define __PAGE_KERNEL_EXEC_NOCACHE (__PAGE_KERNEL_EXEC | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_NOCACHE (__PAGE_KERNEL | _PAGE_PCD | _PAGE_PWT)
+#define __PAGE_KERNEL_UC_MINUS (__PAGE_KERNEL | _PAGE_PCD)
#define __PAGE_KERNEL_VSYSCALL (__PAGE_KERNEL_RX | _PAGE_USER)
#define __PAGE_KERNEL_VSYSCALL_NOCACHE (__PAGE_KERNEL_VSYSCALL | _PAGE_PCD | _PAGE_PWT)
#define __PAGE_KERNEL_LARGE (__PAGE_KERNEL | _PAGE_PSE)
@@ -101,6 +102,7 @@ extern pteval_t __PAGE_KERNEL, __PAGE_KERNEL_EXEC;
#define PAGE_KERNEL_EXEC MAKE_GLOBAL(__PAGE_KERNEL_EXEC)
#define PAGE_KERNEL_RX MAKE_GLOBAL(__PAGE_KERNEL_RX)
#define PAGE_KERNEL_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_NOCACHE)
+#define PAGE_KERNEL_UC_MINUS MAKE_GLOBAL(__PAGE_KERNEL_UC_MINUS)
#define PAGE_KERNEL_EXEC_NOCACHE MAKE_GLOBAL(__PAGE_KERNEL_EXEC_NOCACHE)
#define PAGE_KERNEL_LARGE MAKE_GLOBAL(__PAGE_KERNEL_LARGE)
#define PAGE_KERNEL_LARGE_EXEC MAKE_GLOBAL(__PAGE_KERNEL_LARGE_EXEC)
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-03-26 4:42 ` Arjan van de Ven
@ 2008-04-25 5:35 ` Alexey Fisher
2008-04-26 0:07 ` Suresh Siddha
0 siblings, 1 reply; 9+ messages in thread
From: Alexey Fisher @ 2008-04-25 5:35 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Suresh Siddha, Andrew Morton, Ingo Molnar, Thomas Gleixner,
bugme-daemon, linux-kernel, Rafael J. Wysocki,
venkatesh.pallipadi
[-- Attachment #1: Type: text/plain, Size: 2978 bytes --]
Am Dienstag, den 25.03.2008, 21:42 -0700 schrieb Arjan van de Ven:
> Suresh Siddha wrote:
> > On Tue, Mar 25, 2008 at 03:28:09PM -0700, Andrew Morton wrote:
> >> (switched to email. Please respond via emailed reply-to-all, not via the
> >> bugzilla web interface).
> >>
> >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> >> bugme-daemon@bugzilla.kernel.org wrote:
> >>
> >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> >>>
> >>> Summary: [regression] performance drop for glx
> >>>
> >>> after commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5 i expirience some grafik
> >>> related perfomance issues.
> >>>
> >>> I used glxgears for test.
> >>> before this patch: 1281.005 FPS
> >>> and after: 765.000 FPS
> >> It nearly halved.
> >>
> >>> latest tested commit a4083c9271e0a697278e089f2c0b9a95363ada0a
> >>> still hase bad performance.
> >>>
> >>> I use Pentium D with 2GB RAM, Grafick: i945G, ICH7
> >>>
> >> That's
> >>
> >> : commit 4138cc3418f5eaa7524ff8e927102863f1ba0ea5
> >> : Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
> >> : Date: Wed Jan 30 13:33:43 2008 +0100
> >> :
> >> : x86: set strong uncacheable where UC is really desired
> >> :
> >> : Also use _PAGE_PWT for all the mappings which need uncache mapping.
> >> : Instead of existing PAT2 which is UC- (and can be overwritten by MTRRs),
> >> : we now use PAT3 which is strong uncacheable.
> >> :
> >> : This makes it consistent with pgprot_noncached()
> >
> > Alexey, Can you please try the appended patch?
> >
> > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > ---
> >
This issue come again with commit:
commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
Date: Tue Mar 18 17:00:14 2008 -0700
x86: PAT infrastructure patch
Sets up pat_init() infrastructure.
PAT MSR has following setting.
PAT
|PCD
||PWT
|||
000 WB _PAGE_CACHE_WB
001 WC _PAGE_CACHE_WC
010 UC- _PAGE_CACHE_UC_MINUS
011 UC _PAGE_CACHE_UC
We are effectively changing WT from boot time setting to WC.
UC_MINUS is used to provide backward compatibility to existing /dev/mem
users(X).
reserve_memtype and free_memtype are new interfaces for maintaining
alias-free
mapping. It is currently implemented in a simple way with a linked list and
not optimized. reserve and free tracks the effective memory type, as a
result
of PAT and MTRR setting rather than what is actually requested in PAT.
pat_init piggy backs on mtrr_init as the rules for setting both pat and
mtrr
are same.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[-- Attachment #2: Dies ist ein digital signierter Nachrichtenteil --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-04-25 5:35 ` Alexey Fisher
@ 2008-04-26 0:07 ` Suresh Siddha
2008-04-26 11:43 ` Alexey Fisher
2008-04-28 16:34 ` Ingo Molnar
0 siblings, 2 replies; 9+ messages in thread
From: Suresh Siddha @ 2008-04-26 0:07 UTC (permalink / raw)
To: Alexey Fisher
Cc: Arjan van de Ven, Suresh Siddha, Andrew Morton, Ingo Molnar,
Thomas Gleixner, bugme-daemon, linux-kernel, Rafael J. Wysocki,
venkatesh.pallipadi, jbarnes, keithp, airlied
On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
> > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> > >> bugme-daemon@bugzilla.kernel.org wrote:
> > >>
> > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> > >>>
> > >>> Summary: [regression] performance drop for glx
> > >
> > > Alexey, Can you please try the appended patch?
> > >
> > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > > ---
> > >
>
>
> This issue come again with commit:
>
> commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
> Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
> Date: Tue Mar 18 17:00:14 2008 -0700
>
> x86: PAT infrastructure patch
>
> Sets up pat_init() infrastructure.
>
Alexey, Sorry. I missed posting a fix in the post PAT series for this issue. Can you
please try the appended patch and Ack?
copying X folks get their attention and their Ack on the patch and next steps outlined.
Thanks.
---
[patch] use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().
To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default non write-combining mapping request.
Next steps:
a) change all the video drivers using ioremap() or ioremap_nocache()
and adding WC MTTR using mttr_add() to ioremap_wc()
b) for strict usage, we can go back to strong uc semantics
for ioremap() and ioremap_nocache() after some grace period for
completing step-a.
c) user level X server needs to use the appropriate method for setting
up WC mapping (like using resourceX_wc sysfs file instead of
adding MTRR for WC and using /dev/mem or resourceX under /sys)
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 3a4baf9..c053977 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
/*
* Do not fallback to certain memory types with certain
* requested type:
- * - request is uncached, return cannot be write-back
- * - request is uncached, return cannot be write-combine
+ * - request is uc-, return cannot be write-back
+ * - request is uc-, return cannot be write-combine
* - request is write-combine, return cannot be write-back
*/
- if ((prot_val == _PAGE_CACHE_UC &&
+ if ((prot_val == _PAGE_CACHE_UC_MINUS &&
(new_prot_val == _PAGE_CACHE_WB ||
new_prot_val == _PAGE_CACHE_WC)) ||
(prot_val == _PAGE_CACHE_WC &&
@@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
default:
prot = PAGE_KERNEL_NOCACHE;
break;
+ case _PAGE_CACHE_UC_MINUS:
+ prot = PAGE_KERNEL_UC_MINUS;
+ break;
case _PAGE_CACHE_WC:
prot = PAGE_KERNEL_WC;
break;
@@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
*/
void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
{
- return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
+ /*
+ * Ideally, this should be:
+ * pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
+ *
+ * Till we fix all X drivers to use ioremap_wc(), we will use
+ * UC MINUS.
+ */
+ unsigned long val = _PAGE_CACHE_UC_MINUS;
+
+ return __ioremap(phys_addr, size, val);
}
EXPORT_SYMBOL(ioremap_nocache);
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index c29ebd0..ad3202c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -779,14 +779,20 @@ static inline int change_page_attr_clear(unsigned long addr, int numpages,
int _set_memory_uc(unsigned long addr, int numpages)
{
+ /*
+ * for now UC MINUS. see comments in ioremap_nocache()
+ */
return change_page_attr_set(addr, numpages,
- __pgprot(_PAGE_CACHE_UC));
+ __pgprot(_PAGE_CACHE_UC_MINUS));
}
int set_memory_uc(unsigned long addr, int numpages)
{
+ /*
+ * for now UC MINUS. see comments in ioremap_nocache()
+ */
if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
- _PAGE_CACHE_UC, NULL))
+ _PAGE_CACHE_UC_MINUS, NULL))
return -EINVAL;
return _set_memory_uc(addr, numpages);
diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c
index 94f6c73..8af0f0b 100644
--- a/arch/x86/pci/i386.c
+++ b/arch/x86/pci/i386.c
@@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
prot = pgprot_val(vma->vm_page_prot);
if (pat_wc_enabled && write_combine)
prot |= _PAGE_CACHE_WC;
+ else if (pat_wc_enabled)
+ /*
+ * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
+ * To avoid attribute conflicts, request UC MINUS here
+ * aswell.
+ */
+ prot |= _PAGE_CACHE_UC_MINUS;
else if (boot_cpu_data.x86 > 3)
prot |= _PAGE_CACHE_UC;
@@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma,
* - request is uncached, return cannot be write-combine
* - request is write-combine, return cannot be write-back
*/
- if ((flags == _PAGE_CACHE_UC &&
- (new_flags == _PAGE_CACHE_WB ||
- new_flags == _PAGE_CACHE_WC)) ||
+ if ((flags == _PAGE_CACHE_UC_MINUS &&
+ (new_flags == _PAGE_CACHE_WB)) ||
(flags == _PAGE_CACHE_WC &&
new_flags == _PAGE_CACHE_WB)) {
free_memtype(addr, addr+len);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-04-26 0:07 ` Suresh Siddha
@ 2008-04-26 11:43 ` Alexey Fisher
2008-04-28 16:34 ` Ingo Molnar
1 sibling, 0 replies; 9+ messages in thread
From: Alexey Fisher @ 2008-04-26 11:43 UTC (permalink / raw)
To: Suresh Siddha
Cc: Arjan van de Ven, Andrew Morton, Ingo Molnar, Thomas Gleixner,
bugme-daemon, linux-kernel, Rafael J. Wysocki,
venkatesh.pallipadi, jbarnes, keithp, airlied
[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]
Am Freitag, den 25.04.2008, 17:07 -0700 schrieb Suresh Siddha:
> On Fri, Apr 25, 2008 at 07:35:23AM +0200, Alexey Fisher wrote:
> > > >> On Tue, 25 Mar 2008 15:11:15 -0700 (PDT)
> > > >> bugme-daemon@bugzilla.kernel.org wrote:
> > > >>
> > > >>> http://bugzilla.kernel.org/show_bug.cgi?id=10328
> > > >>>
> > > >>> Summary: [regression] performance drop for glx
> > > >
> > > > Alexey, Can you please try the appended patch?
> > > >
> > > > Andrew, can you please push the appended patch for 2.6.25? Thanks.
> > > > ---
> > > >
> >
> >
> > This issue come again with commit:
> >
> > commit 2e5d9c857d4e6c9e7b7d8c8c86a68a7842d213d6
> > Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
> > Date: Tue Mar 18 17:00:14 2008 -0700
> >
> > x86: PAT infrastructure patch
> >
> > Sets up pat_init() infrastructure.
> >
>
> Alexey, Sorry. I missed posting a fix in the post PAT series for this issue. Can you
> please try the appended patch and Ack?
>
> copying X folks get their attention and their Ack on the patch and next steps outlined.
>
> Thanks.
> ---
>
> [patch] use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
>
> Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
> Once all the X drivers move to ioremap_wc(), we can go back to strong
> UC semantics for ioremap() and ioremap_nocache().
>
Thank you. This patch working.
Ack ?
[-- Attachment #2: Dies ist ein digital signierter Nachrichtenteil --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug 10328] New: [regression] performance drop for glx
2008-04-26 0:07 ` Suresh Siddha
2008-04-26 11:43 ` Alexey Fisher
@ 2008-04-28 16:34 ` Ingo Molnar
1 sibling, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2008-04-28 16:34 UTC (permalink / raw)
To: Suresh Siddha
Cc: Alexey Fisher, Arjan van de Ven, Andrew Morton, Thomas Gleixner,
bugme-daemon, linux-kernel, Rafael J. Wysocki,
venkatesh.pallipadi, jbarnes, keithp, airlied
* Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> Alexey, Sorry. I missed posting a fix in the post PAT series for this
> issue. Can you please try the appended patch and Ack?
>
> copying X folks get their attention and their Ack on the patch and
> next steps outlined.
i queued up the patch below - hand-merged to x86.git.
Ingo
----------->
Subject: x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()
From: Suresh Siddha <suresh.b.siddha@intel.com>
Date: Fri, 25 Apr 2008 17:07:22 -0700
Use UC_MINUS for ioremap(), ioremap_nocache() instead of strong UC.
Once all the X drivers move to ioremap_wc(), we can go back to strong
UC semantics for ioremap() and ioremap_nocache().
To avoid attribute aliasing issues, pci_mmap_page_range() will also
use UC_MINUS for default non write-combining mapping request.
Next steps:
a) change all the video drivers using ioremap() or ioremap_nocache()
and adding WC MTTR using mttr_add() to ioremap_wc()
b) for strict usage, we can go back to strong uc semantics
for ioremap() and ioremap_nocache() after some grace period for
completing step-a.
c) user level X server needs to use the appropriate method for setting
up WC mapping (like using resourceX_wc sysfs file instead of
adding MTRR for WC and using /dev/mem or resourceX under /sys)
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/mm/ioremap.c | 20 ++++++++++++++++----
arch/x86/mm/pageattr.c | 10 ++++++++--
arch/x86/pci/i386.c | 12 +++++++++---
3 files changed, 33 insertions(+), 9 deletions(-)
Index: linux-x86.q/arch/x86/mm/ioremap.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/ioremap.c
+++ linux-x86.q/arch/x86/mm/ioremap.c
@@ -176,11 +176,11 @@ static void __iomem *__ioremap(resource_
/*
* Do not fallback to certain memory types with certain
* requested type:
- * - request is uncached, return cannot be write-back
- * - request is uncached, return cannot be write-combine
+ * - request is uc-, return cannot be write-back
+ * - request is uc-, return cannot be write-combine
* - request is write-combine, return cannot be write-back
*/
- if ((prot_val == _PAGE_CACHE_UC &&
+ if ((prot_val == _PAGE_CACHE_UC_MINUS &&
(new_prot_val == _PAGE_CACHE_WB ||
new_prot_val == _PAGE_CACHE_WC)) ||
(prot_val == _PAGE_CACHE_WC &&
@@ -201,6 +201,9 @@ static void __iomem *__ioremap(resource_
default:
prot = PAGE_KERNEL_NOCACHE;
break;
+ case _PAGE_CACHE_UC_MINUS:
+ prot = PAGE_KERNEL_UC_MINUS;
+ break;
case _PAGE_CACHE_WC:
prot = PAGE_KERNEL_WC;
break;
@@ -255,7 +258,16 @@ static void __iomem *__ioremap(resource_
*/
void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
{
- return __ioremap(phys_addr, size, _PAGE_CACHE_UC);
+ /*
+ * Ideally, this should be:
+ * pat_wc_enabled ? _PAGE_CACHE_UC : _PAGE_CACHE_UC_MINUS;
+ *
+ * Till we fix all X drivers to use ioremap_wc(), we will use
+ * UC MINUS.
+ */
+ unsigned long val = _PAGE_CACHE_UC_MINUS;
+
+ return __ioremap(phys_addr, size, val);
}
EXPORT_SYMBOL(ioremap_nocache);
Index: linux-x86.q/arch/x86/mm/pageattr.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pageattr.c
+++ linux-x86.q/arch/x86/mm/pageattr.c
@@ -829,14 +829,20 @@ static inline int change_page_attr_clear
int _set_memory_uc(unsigned long addr, int numpages)
{
+ /*
+ * for now UC MINUS. see comments in ioremap_nocache()
+ */
return change_page_attr_set(&addr, numpages,
- __pgprot(_PAGE_CACHE_UC), 0);
+ __pgprot(_PAGE_CACHE_UC_MINUS), 0);
}
int set_memory_uc(unsigned long addr, int numpages)
{
+ /*
+ * for now UC MINUS. see comments in ioremap_nocache()
+ */
if (reserve_memtype(addr, addr + numpages * PAGE_SIZE,
- _PAGE_CACHE_UC, NULL))
+ _PAGE_CACHE_UC_MINUS, NULL))
return -EINVAL;
return _set_memory_uc(addr, numpages);
Index: linux-x86.q/arch/x86/pci/i386.c
===================================================================
--- linux-x86.q.orig/arch/x86/pci/i386.c
+++ linux-x86.q/arch/x86/pci/i386.c
@@ -301,6 +301,13 @@ int pci_mmap_page_range(struct pci_dev *
prot = pgprot_val(vma->vm_page_prot);
if (pat_wc_enabled && write_combine)
prot |= _PAGE_CACHE_WC;
+ else if (pat_wc_enabled)
+ /*
+ * ioremap() and ioremap_nocache() defaults to UC MINUS for now.
+ * To avoid attribute conflicts, request UC MINUS here
+ * aswell.
+ */
+ prot |= _PAGE_CACHE_UC_MINUS;
else if (boot_cpu_data.x86 > 3)
prot |= _PAGE_CACHE_UC;
@@ -319,9 +326,8 @@ int pci_mmap_page_range(struct pci_dev *
* - request is uncached, return cannot be write-combine
* - request is write-combine, return cannot be write-back
*/
- if ((flags == _PAGE_CACHE_UC &&
- (new_flags == _PAGE_CACHE_WB ||
- new_flags == _PAGE_CACHE_WC)) ||
+ if ((flags == _PAGE_CACHE_UC_MINUS &&
+ (new_flags == _PAGE_CACHE_WB)) ||
(flags == _PAGE_CACHE_WC &&
new_flags == _PAGE_CACHE_WB)) {
free_memtype(addr, addr+len);
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-04-28 16:35 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <bug-10328-27@http.bugzilla.kernel.org/>
2008-03-25 22:28 ` [Bug 10328] New: [regression] performance drop for glx Andrew Morton
2008-03-26 0:39 ` Suresh Siddha
2008-03-26 4:42 ` Arjan van de Ven
2008-04-25 5:35 ` Alexey Fisher
2008-04-26 0:07 ` Suresh Siddha
2008-04-26 11:43 ` Alexey Fisher
2008-04-28 16:34 ` Ingo Molnar
2008-03-26 5:29 ` Ingo Molnar
2008-03-26 17:55 ` Suresh Siddha
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.