* RE: Windows side agrees that lowmem corruption is a problem too
@ 2010-06-08 18:28 Yuhong Bao
  2010-06-08 19:06 ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Yuhong Bao @ 2010-06-08 18:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Yuhong Bao, mingo, gregkh, hpa


Adding mingo and gregkh to CC list.
> Remember the lowmem corruption problems that led to the code that displays
> this being added to Linux:
>   AMI BIOS detected: BIOS may corrupt low RAM, working around it.
> which was IMO way too broad. Good news, the Windows side agrees that this
> is a problem too:
> http://www.microsoft.com/whdc/system/platform/firmware/mem-corrupt.mspx
>
> Yuhong Bao

 		 	   		  

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 18:28 Windows side agrees that lowmem corruption is a problem too Yuhong Bao
@ 2010-06-08 19:06 ` H. Peter Anvin
  2010-06-08 19:08   ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2010-06-08 19:06 UTC (permalink / raw)
  To: Yuhong Bao; +Cc: linux-kernel, mingo, gregkh

On 06/08/2010 11:28 AM, Yuhong Bao wrote:
> 
> Adding mingo and gregkh to CC list.
>> Remember the lowmem corruption problems that led to the code that displays
>> this being added to Linux:
>>   AMI BIOS detected: BIOS may corrupt low RAM, working around it.
>> which was IMO way too broad. Good news, the Windows side agrees that this
>> is a problem too:
>> http://www.microsoft.com/whdc/system/platform/firmware/mem-corrupt.mspx
>>
>> Yuhong Bao
> 

Hardly "way too broad".  I'm starting to think we should enable it
unconditionally, given the number of machines which have exhibited that
problem.  As shown in the whitepaper, Vista/Win7 even avoid using
< 1 MB for a lot of things, presumably for this reason.

If it only was suspend, it would be one thing, but from what I've seen
it has been known to happen at other times too (e.g. HDMI cable insertion!)

	-hpa

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 19:06 ` H. Peter Anvin
@ 2010-06-08 19:08   ` Ingo Molnar
  2010-06-08 19:22     ` Ondrej Zary
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2010-06-08 19:08 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Yuhong Bao, linux-kernel, gregkh


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 06/08/2010 11:28 AM, Yuhong Bao wrote:
> > 
> > Adding mingo and gregkh to CC list.
> >> Remember the lowmem corruption problems that led to the code that displays
> >> this being added to Linux:
> >>   AMI BIOS detected: BIOS may corrupt low RAM, working around it.
> >> which was IMO way too broad. Good news, the Windows side agrees that this
> >> is a problem too:
> >> http://www.microsoft.com/whdc/system/platform/firmware/mem-corrupt.mspx
> >>
> >> Yuhong Bao
> > 
> 
> Hardly "way too broad".  I'm starting to think we should enable it
> unconditionally, given the number of machines which have exhibited that
> problem.  As shown in the whitepaper, Vista/Win7 even avoid using
> < 1 MB for a lot of things, presumably for this reason.
> 
> If it only was suspend, it would be one thing, but from what I've seen
> it has been known to happen at other times too (e.g. HDMI cable insertion!)

Yep, patterns of some silly OSD bitmap showed up in one of the corruptions - 
firmware displaying a 'you inserted a cable' kind of icon somewhere and 
messing up the SMM code or so ...

I agree that dis-using <1M by default is probably the sanest option.

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 19:08   ` Ingo Molnar
@ 2010-06-08 19:22     ` Ondrej Zary
  2010-06-08 19:31       ` Yuhong Bao
                         ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ondrej Zary @ 2010-06-08 19:22 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: H. Peter Anvin, Yuhong Bao, linux-kernel, gregkh

On Tuesday 08 June 2010 21:08:48 Ingo Molnar wrote:
> * H. Peter Anvin <hpa@zytor.com> wrote:
> > On 06/08/2010 11:28 AM, Yuhong Bao wrote:
> > > Adding mingo and gregkh to CC list.
> > >
> > >> Remember the lowmem corruption problems that led to the code that
> > >> displays this being added to Linux:
> > >>   AMI BIOS detected: BIOS may corrupt low RAM, working around it.
> > >> which was IMO way too broad. Good news, the Windows side agrees that
> > >> this is a problem too:
> > >> http://www.microsoft.com/whdc/system/platform/firmware/mem-corrupt.mspx
> > >>
> > >> Yuhong Bao
> >
> > Hardly "way too broad".  I'm starting to think we should enable it
> > unconditionally, given the number of machines which have exhibited that
> > problem.  As shown in the whitepaper, Vista/Win7 even avoid using
> > < 1 MB for a lot of things, presumably for this reason.
> >
> > If it only was suspend, it would be one thing, but from what I've seen
> > it has been known to happen at other times too (e.g. HDMI cable
> > insertion!)
>
> Yep, patterns of some silly OSD bitmap showed up in one of the corruptions -
> firmware displaying a 'you inserted a cable' kind of icon somewhere and
> messing up the SMM code or so ...
>
> I agree that dis-using <1M by default is probably the sanest option.

But please limit it to newer systems only (DMI present && year > 200?). There 
are many old machines running fine. Losing 1MB from 16MB is a bad thing.
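
The gate suggested here could look roughly like the sketch below. It is only an illustration: the function name, the use of None for "no DMI data", and the cutoff_year parameter are hypothetical, and the cutoff is left to the caller because the year itself is left open above.

```python
from typing import Optional


def reserve_low_memory(dmi_year: Optional[int], cutoff_year: int) -> bool:
    """Return True if the low-memory workaround should be applied.

    dmi_year is None when the machine exposes no DMI data (a convention
    made up for this sketch). cutoff_year is supplied by the caller
    because the thread does not settle on a year.
    """
    if dmi_year is None:
        # No DMI at all: treat it as an old machine and keep low RAM usable.
        return False
    # Only machines with a BIOS date newer than the cutoff lose low memory.
    return dmi_year > cutoff_year
```

With a gate like this, a small old machine without DMI keeps all of its low memory, which is the case the objection above is about.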

-- 
Ondrej Zary

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 19:22     ` Ondrej Zary
@ 2010-06-08 19:31       ` Yuhong Bao
  2010-06-08 20:31       ` H. Peter Anvin
  2010-06-08 21:56       ` Alan Cox
  2 siblings, 0 replies; 12+ messages in thread
From: Yuhong Bao @ 2010-06-08 19:31 UTC (permalink / raw)
  To: linux, mingo; +Cc: hpa, linux-kernel, gregkh


> But please limit it to newer systems only (DMI present && year > 200?). There
> are many old machines running fine. Losing 1MB from 16MB is a bad thing.

I'd just check the amount of extended memory available.

Yuhong Bao

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 19:22     ` Ondrej Zary
  2010-06-08 19:31       ` Yuhong Bao
@ 2010-06-08 20:31       ` H. Peter Anvin
  2010-06-08 20:49         ` Yuhong Bao
  2010-06-11  1:15         ` Robert Hancock
  2010-06-08 21:56       ` Alan Cox
  2 siblings, 2 replies; 12+ messages in thread
From: H. Peter Anvin @ 2010-06-08 20:31 UTC (permalink / raw)
  To: Ondrej Zary; +Cc: Ingo Molnar, Yuhong Bao, linux-kernel, gregkh

On 06/08/2010 12:22 PM, Ondrej Zary wrote:
>>
>> Yep, patterns of some silly OSD bitmap showed up in one of the corruptions -
>> firmware displaying a 'you inserted a cable' kind of icon somewhere and
>> messing up the SMM code or so ...
>>
>> I agree that dis-using <1M by default is probably the sanest option.
> 
> But please limit it to newer systems only (DMI present && year > 200?). There 
> are many old machines running fine. Losing 1MB from 16MB is a bad thing.
> 

Disusing 64K is something we can do unconditionally (especially since
we're only talking about 60K -- 15 pages -- of actually usable memory
anyway.)

Dropping all the low 0.6 MB (which is what it really is) is probably
unacceptable by default, but perhaps it makes sense to use it only for
ZONE_DMA or something.
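
The figures above can be checked with a little arithmetic. This sketch assumes 4 KiB x86 pages, that the first page (0-4K) is already unavailable to the allocator, and that conventional RAM ends at 640K (0xA0000). None of those assumptions is spelled out in the thread, and the names are illustrative.

```python
PAGE_SIZE = 4096   # bytes per page (assumption: 4 KiB x86 pages)
KIB = 1024
MIB = 1024 * KIB


def usable_pages(limit_bytes: int, already_reserved: int = PAGE_SIZE) -> int:
    """Whole pages below limit_bytes, not counting bytes already reserved."""
    return (limit_bytes - already_reserved) // PAGE_SIZE


# Low 64K minus the first page, which is assumed to be reserved already.
low_64k_pages = usable_pages(64 * KIB)
low_64k_kib = low_64k_pages * PAGE_SIZE // KIB

# Conventional memory below the 0xA0000 hole, expressed in MB.
conventional_mib = (640 * KIB) / MIB

print(low_64k_pages, low_64k_kib, conventional_mib)  # 15 60 0.625
```

So the unconditional reservation costs 15 pages, while the rest of the low megabyte that is actually RAM comes to about 0.6 MB.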

	-hpa

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 20:31       ` H. Peter Anvin
@ 2010-06-08 20:49         ` Yuhong Bao
  2010-06-11  1:15         ` Robert Hancock
  1 sibling, 0 replies; 12+ messages in thread
From: Yuhong Bao @ 2010-06-08 20:49 UTC (permalink / raw)
  To: hpa, linux; +Cc: mingo, linux-kernel, gregkh


> Disusing 64K is something we can do unconditionally (especially since
> we're only talking about 60K -- 15 pages -- of actually usable memory
> anyway.)

Unless you really have no extended memory, agreed.

> Dropping all the low 0.6 MB (which is what it really is) is probably
> unacceptable by default, but perhaps it makes sense to use it only for
> ZONE_DMA or something.

For example, for really old 8-bit ISA devices that can only address 20 bits
of address space and do not use the system 8237 DMA controller.

Yuhong Bao

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 19:22     ` Ondrej Zary
  2010-06-08 19:31       ` Yuhong Bao
  2010-06-08 20:31       ` H. Peter Anvin
@ 2010-06-08 21:56       ` Alan Cox
  2010-06-08 21:57         ` H. Peter Anvin
  2 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2010-06-08 21:56 UTC (permalink / raw)
  To: Ondrej Zary; +Cc: Ingo Molnar, H. Peter Anvin, Yuhong Bao, linux-kernel, gregkh

> > I agree that dis-using <1M by default is probably the sanest option.
> 
> But please limit it to newer systems only (DMI present && year > 200?). There 
> are many old machines running fine. Losing 1MB from 16MB is a bad thing.

Losing the low 1MB is a bad thing anyway for things like firmware flashing
and other weird crap that needs low pages (floppy controllers etc).

Losing 64K (but reporting corruption in it in a big scary way) is
probably sensible for distributions, but it's a config item so it's policy
so that wouldn't be a problem.

It has to be painful to the vendors so they get complaints, reports and
support call costs. Otherwise they won't have the correct incentives to
fix their mess.

Alan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 21:56       ` Alan Cox
@ 2010-06-08 21:57         ` H. Peter Anvin
  2010-06-09  1:08           ` Yuhong Bao
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2010-06-08 21:57 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ondrej Zary, Ingo Molnar, Yuhong Bao, linux-kernel, gregkh

On 06/08/2010 02:56 PM, Alan Cox wrote:
>>> I agree that dis-using <1M by default is probably the sanest option.
>>
>> But please limit it to newer systems only (DMI present && year > 200?). There 
>> are many old machines running fine. Losing 1MB from 16MB is a bad thing.
> 
> Losing the low 1MB is a bad thing anyway for things like firmware flashing
> and other weird crap that needs low pages (floppy controllers etc).
> 
> Losing 64K (but reporting corruption in it in a big scary way) is
> probably sensible for distributions, but it's a config item so it's policy
> so that wouldn't be a problem.
> 
> It has to be painful to the vendors so they get complaints, reports and
> support call costs. Otherwise they won't have the correct incentives to
> fix their mess.

We have already functionally lost 64K on all existing machines... I
think the current blacklist covers 90% or more of all systems in
existence, and we keep filling in the few holes that remain.

Adding the remaining half-megabyte of RAM really shouldn't be done
unconditionally, but as I said it could plausibly be reserved for
ZONE_DMA only.

	-hpa


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 21:57         ` H. Peter Anvin
@ 2010-06-09  1:08           ` Yuhong Bao
  0 siblings, 0 replies; 12+ messages in thread
From: Yuhong Bao @ 2010-06-09  1:08 UTC (permalink / raw)
  To: hpa, alan; +Cc: linux, mingo, linux-kernel, gregkh


>> It has to be painful to the vendors so they get complaints, reports and
>> support call costs. Otherwise they won't have the correct incentives to
>> fix their mess.

Notice that Windows 7 logs an event log entry when this is detected during
sleep, even though they don't use the low meg at all.

> We have already functionally lost 64K on all existing machines... I
> think the current blacklist covers 90% or more of all systems in
> existence, and we keep filling in the few holes that remain.

Which was why I said it was too broad.

If you really mean to do it unconditionally, just do it unconditionally.

Yuhong Bao
 		 	   		  

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Windows side agrees that lowmem corruption is a problem too
  2010-06-08 20:31       ` H. Peter Anvin
  2010-06-08 20:49         ` Yuhong Bao
@ 2010-06-11  1:15         ` Robert Hancock
  1 sibling, 0 replies; 12+ messages in thread
From: Robert Hancock @ 2010-06-11  1:15 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ondrej Zary, Ingo Molnar, Yuhong Bao, linux-kernel, gregkh

On 06/08/2010 02:31 PM, H. Peter Anvin wrote:
> On 06/08/2010 12:22 PM, Ondrej Zary wrote:
>>>
>>> Yep, patterns of some silly OSD bitmap showed up in one of the corruptions -
>>> firmware displaying a 'you inserted a cable' kind of icon somewhere and
>>> messing up the SMM code or so ...
>>>
>>> I agree that dis-using <1M by default is probably the sanest option.
>>
>> But please limit it to newer systems only (DMI present && year > 200?). There
>> are many old machines running fine. Losing 1MB from 16MB is a bad thing.
>>
>
> Disusing 64K is something we can do unconditionally (especially since
> we're only talking about 60K -- 15 pages -- of actually usable memory
> anyway.)
>
> Dropping all the low 0.6 MB (which is what it really is) is probably
> unacceptable by default, but perhaps it makes sense to use it only for
> ZONE_DMA or something.

According to the document, "Neither Windows Vista nor Windows 7 stores 
operating system code and data in the lowest 1 MB of physical memory, 
regardless of whether Windows is running on real or virtualized 
hardware", so doing the same in general might not be a bad thing (unless 
we have less than a certain amount of RAM).

They're also checksumming the low 1MB and writing an event log entry if 
corruption is detected after sleep events, so if WHQL tests start 
checking for that, maybe these bugs will start going away on new 
machines. Of course, on some machines the corruption apparently happens 
other times as well, so who knows..
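
A checksum-across-sleep check of the kind described in the document could be sketched as follows. This is a simulation, not how Windows or Linux implements it: the low 1 MB is modelled as a bytearray, CRC32 is an assumed choice of checksum (the algorithm is not named above), and the function names are made up for the example.

```python
import zlib

LOW_MEM_SIZE = 1024 * 1024  # the low 1 MB discussed above


def checksum(region: bytes) -> int:
    """CRC32 of the region (the algorithm is an assumption of this sketch)."""
    return zlib.crc32(region) & 0xFFFFFFFF


def corrupted_across_sleep(before: int, region: bytes) -> bool:
    """Compare the pre-sleep checksum with the region's contents after resume."""
    return checksum(region) != before


# Simulated low memory; a real check would read physical memory instead.
low_mem = bytearray(LOW_MEM_SIZE)
saved = checksum(low_mem)

# Simulate firmware scribbling on one byte during suspend/resume.
low_mem[0x8000] = 0xFF

if corrupted_across_sleep(saved, low_mem):
    print("low memory corruption detected after resume")
```

A check like this only catches corruption that happens between the two checksum points, which is why corruption at other times (such as the cable-insertion case mentioned earlier in the thread) would go unnoticed.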

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Windows side agrees that lowmem corruption is a problem too
@ 2010-06-08 18:12 Yuhong Bao
  0 siblings, 0 replies; 12+ messages in thread
From: Yuhong Bao @ 2010-06-08 18:12 UTC (permalink / raw)
  To: linux-kernel


Remember the lowmem corruption problems that led to the code that displays
this being added to Linux:

  AMI BIOS detected: BIOS may corrupt low RAM, working around it.

which was IMO way too broad. Good news, the Windows side agrees that this is
a problem too:
http://www.microsoft.com/whdc/system/platform/firmware/mem-corrupt.mspx

Yuhong Bao

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-06-11  1:15 UTC | newest]

Thread overview: 12+ messages
2010-06-08 18:28 Windows side agrees that lowmem corruption is a problem too Yuhong Bao
2010-06-08 19:06 ` H. Peter Anvin
2010-06-08 19:08   ` Ingo Molnar
2010-06-08 19:22     ` Ondrej Zary
2010-06-08 19:31       ` Yuhong Bao
2010-06-08 20:31       ` H. Peter Anvin
2010-06-08 20:49         ` Yuhong Bao
2010-06-11  1:15         ` Robert Hancock
2010-06-08 21:56       ` Alan Cox
2010-06-08 21:57         ` H. Peter Anvin
2010-06-09  1:08           ` Yuhong Bao
  -- strict thread matches above, loose matches on Subject: below --
2010-06-08 18:12 Yuhong Bao
