linux-kernel.vger.kernel.org archive mirror
* Abysmal HDD/USB write speed after sleep on a UEFI system
@ 2013-02-10 10:43 Artem S. Tashkinov
  0 siblings, 0 replies; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-02-10 10:43 UTC (permalink / raw)
  To: linux-kernel

Hello,

I have an ASUS P8P67 Pro motherboard and recently decided to switch to UEFI boot.
Maybe it's a coincidence, or maybe Linux kernel 3.7.6 (vanilla) has a serious bug, but after
waking up from sleep, write performance becomes intolerable.

On boot I have:

HDD write performance: ~120MB/sec
USB write performance: ~18MB/sec

After sleep:

HDD write performance: ~7MB/sec (i.e. 17 times slower)
USB write performance: ~0.5MB/sec (i.e. 36 times slower)
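
The slowdown factors quoted above can be rechecked from the approximate throughput figures. This is only a minimal Python sketch; the function name `slowdown` is illustrative and not part of the original report.

```python
def slowdown(before_mb_s: float, after_mb_s: float) -> float:
    """Return how many times slower the 'after' throughput is."""
    if after_mb_s <= 0:
        raise ValueError("after throughput must be positive")
    return before_mb_s / after_mb_s


# Approximate figures from the report: on boot vs. after sleep.
hdd = slowdown(120, 7)    # roughly 17x
usb = slowdown(18, 0.5)   # 36x
print(f"HDD: {hdd:.1f}x slower, USB: {usb:.1f}x slower")
```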

This is totally unacceptable, the computer becomes unusable.

I'm open to suggestions on how to debug this extremely serious problem.

P.S. Since I'm still using a 32-bit x86 kernel, it switches the x86-64 UEFI runtime off on boot:

[    0.000000] efi: EFI v2.31 by American Megatrends
[    0.000000] efi:  ACPI=0xdf385000  ACPI 2.0=0xdf385000  SMBIOS=0xdec28e98  MPS=0xfc9a0
[    0.000000] efi: No EFI runtime due to 32/64-bit mismatch with kernel
...
[    0.000000] efi: Setup done, disabling due to 32/64-bit mismatch

Best regards,

Artem

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
       [not found] <587312497.6453.1360650312498.JavaMail.mail@webmail01>
@ 2013-02-12 17:29 ` Linus Torvalds
  2013-02-12 18:29   ` Artem S. Tashkinov
  2013-07-10 17:25   ` hyphop
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2013-02-12 17:29 UTC (permalink / raw)
  To: Artem S. Tashkinov, Bjorn Helgaas; +Cc: Linux Kernel Mailing List

On Mon, Feb 11, 2013 at 10:25 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> Hello Linus,
>
> I've already posted a bug report (https://bugzilla.kernel.org/show_bug.cgi?id=53551),
> a message to LKML (http://lkml.indiana.edu/hypermail/linux/kernel/1302.1/00837.html)
> and so far I've received zero response even though the bug is quite critical as it prevents
> me from using suspend altogether.
>
> I wonder if you could tell me who is responsible for this problem and who I need to CC in
> bugzilla.

According to your bugzilla it doesn't really seem to be strictly
UEFI-specific, and it's hard to tell what subsystem is to blame.

A few things to try to pinpoint:

 (a) Is it *only* write performance that suffers, or is it other
performance too? Networking (DMA? Perhaps only writing *to* the
network?)? CPU?

 (b) the fact that it apparently happens with both SATA and USB
implies that it's neither, and is more likely something core like
memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).

 (c) can you find anything that changes over the suspend/resume? IOW,
look at things like "lspci -vvxxx" before-and-after, and see what
changed on the bridges leading to both things etc.
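
A before-and-after comparison like the one suggested in (c) could be scripted roughly as follows. This is a sketch, not a prescribed procedure: it assumes the output of "lspci -vvxxx" has already been saved as text before and after suspend, and the function name `diff_snapshots` is made up for illustration. The sample lines are taken from excerpts that appear later in this thread.

```python
import difflib


def diff_snapshots(before: str, after: str, label: str = "lspci") -> list:
    """Return unified-diff lines between two saved command outputs."""
    return list(difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{label}.before",
        tofile=f"{label}.after",
        lineterm="",
        n=0,  # no context lines, only what changed
    ))


# Sample input only; real snapshots would be read from saved files.
before = "Changed: MRL- PresDet- LinkState-\nAddress: 00000000fee0f00c  Data: 41e1"
after = "Changed: MRL- PresDet+ LinkState-\nAddress: 0000000000000000  Data: 0000"
for line in diff_snapshots(before, after):
    print(line)
```

In practice the two strings would come from files, for example by reading hypothetical "lspci.before.txt" and "lspci.after.txt" dumps captured on either side of the suspend.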

The performance drop sounds extreme enough that it sounds like caches
got disabled or something, but that should show up as CPU performance
in general being slow, not just writes to disk. But basically, I think
we need more clues about which sub-area is actually the culprit. My
*guess* would be some core PCI thing not being initialized, but I
don't see how you could even make PCI go that slow. Interrupt
problems? DMA failures? I have no idea.

Has it ever worked? Suspend on desktop motherboards used to be quite
spotty (nobody ever used it, manufacturers didn't care), but it
generally has gotten better since people use it more these days..

Added lkml and Bjorn to the participants, in case anybody has any ideas..

                Linus

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-12 17:29 ` Abysmal HDD/USB write speed after sleep on a UEFI system Linus Torvalds
@ 2013-02-12 18:29   ` Artem S. Tashkinov
  2013-02-12 19:32     ` Linus Torvalds
  2013-07-10 17:25   ` hyphop
  1 sibling, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-02-12 18:29 UTC (permalink / raw)
  To: torvalds; +Cc: bhelgaas, linux-kernel

Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>On Mon, Feb 11, 2013 at 10:25 PM, Artem S. Tashkinov wrote:
>> Hello Linus,
>>
>> I've already posted a bug report (https://bugzilla.kernel.org/show_bug.cgi?id=53551),
>> a message to LKML (http://lkml.indiana.edu/hypermail/linux/kernel/1302.1/00837.html)
>> and so far I've received zero response even though the bug is quite critical as it prevents
>> me from using suspend altogether.
>>
>> I wonder if you could tell me who is responsible for this problem and who I need to CC in
>> bugzilla.
>
>According to your bugzilla it doesn't really seem to be strictly
>UEFI-specific, and it's hard to tell what subsystem is to blame.
>
>A few things to try to pinpoint:
>
> (a) Is it *only* write performance that suffers, or is it other
>performance too? Networking (DMA? Perhaps only writing *to* the
>network?)? CPU?

I've tested hdparm -tT --direct and the output on boot and after suspend
is quite similar.

I've also checked my network read/write speed, and it's the same
~ 100MBit/sec (I have no 1Gbit computers on my network
unfortunately).

>
> (b) the fact that it apparently happens with both SATA and USB
>implies that it's neither, and is more likely something core like
>memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).

I've no idea, please, check my bug report where I've just added lots of
information including a diff between on boot and after suspend.

lspci outputs differ quite substantially, but the things that have changed
mean nothing to me - you'll want to see it for yourself. I see changes like:

-                       Changed: MRL- PresDet- LinkState-
+                       Changed: MRL- PresDet+ LinkState-

i.e. PresDet minus to PresDet plus.

-               Address: 00000000fee0f00c  Data: 41e1
+               Address: 0000000000000000  Data: 0000

-       Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- TAbort-

> (c) can you find anything that changes over the suspend/resume? IOW,
>look at things like "lspci -vvxxx" before-and-after, and see what
>changed on the bridges leading to both things etc.
>
>The performance drop sounds extreme enough that it sounds like caches
>got disabled or something, but that should show up as CPU performance
>in general being slow, not just writes to disk. But basically, I think
>we need more clues about which sub-area is actually the culprit. My
>*guess* would be some core PCI thing not being initialized, but I
>don't see how you could even make PCI go that slow. Interrupt
>problems? DMA failures? I have no idea.
>
>Has it ever worked? Suspend on desktop motherboards used to be quite
>spotty (nobody ever used it, manufacturers didn't care), but it
>generally has gotten better since people use it more these days..

I remember it used to work before, but I've never suspended more than once
during one boot session before (this time I did it out of pure curiosity) and
I've never run Linux from UEFI.

>
>Added lkml and Bjorn to the participants, in case anybody has any ideas..
>

I'll gladly provide any information you need.

Thanks a lot,

Artem

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-12 18:29   ` Artem S. Tashkinov
@ 2013-02-12 19:32     ` Linus Torvalds
  2013-02-12 20:13       ` Artem S. Tashkinov
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2013-02-12 19:32 UTC (permalink / raw)
  To: Artem S. Tashkinov; +Cc: Bjorn Helgaas, Linux Kernel Mailing List

On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>>
>>A few things to try to pinpoint:
>>
>> (a) Is it *only* write performance that suffers, or is it other
>>performance too? Networking (DMA? Perhaps only writing *to* the
>>network?)? CPU?
>
> I've tested hdparm -tT --direct and the output on boot and after suspend
> is quite similar.
>
> I've also checked my network read/write speed, and it's the same
> ~ 100MBit/sec (I have no 1Gbit computers on my network
> unfortunately).

Ok. So it really sounds like just USB and HD writes. Which is quite
odd, since they have basically nothing in common I can think of
(except the obvious block layer issues).

>> (b) the fact that it apparently happens with both SATA and USB
>>implies that it's neither, and is more likely something core like
>>memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).
>
> I've no idea, please, check my bug report where I've just added lots of
> information including a diff between on boot and after suspend.

I'm not seeing anything particularly interesting there.

Except why/how did the MSI address/data change for the SATA
controller? The irq itself hasn't changed.. There's probably some sane
reason for that too (it's an odd encoding, maybe they code for the
same thing), and there's nothing like that for USB, so...

And if it was irq problems, I'd expect you to see it more for reads
than for writes anyway. Along with a few messages about missed irqs
and whatever.

I'm stumped, and have no ideas. I can't even begin to guess how this
would happen. One thing to try is if it happens for all USB ports (you
have multiple controllers) and I assume performance doesn't come back
if you unplug and replug the USB disk..

                 Linus

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-12 19:32     ` Linus Torvalds
@ 2013-02-12 20:13       ` Artem S. Tashkinov
  2013-02-13  4:26         ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-02-12 20:13 UTC (permalink / raw)
  To: torvalds; +Cc: bhelgaas, linux-kernel

Feb 13, 2013 01:32:53 AM, Linus Torvalds wrote:
On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov wrote:
>> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>>>
>>>A few things to try to pinpoint:
>>>
>>> (a) Is it *only* write performance that suffers, or is it other
>>>performance too? Networking (DMA? Perhaps only writing *to* the
>>>network?)? CPU?
>>
>> I've tested hdparm -tT --direct and the output on boot and after suspend
>> is quite similar.
>>
>> I've also checked my network read/write speed, and it's the same
>> ~ 100MBit/sec (I have no 1Gbit computers on my network
>> unfortunately).
>
>Ok. So it really sounds like just USB and HD writes. Which is quite
>odd, since they have basically nothing in common I can think of
>(except the obvious block layer issues).
>
>>> (b) the fact that it apparently happens with both SATA and USB
>>>implies that it's neither, and is more likely something core like
>>>memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).
>>
>> I've no idea, please, check my bug report where I've just added lots of
>> information including a diff between on boot and after suspend.
>
>I'm not seeing anything particularly interesting there.
>
>Except why/how did the MSI address/data change for the SATA
>controller? The irq itself hasn't changed.. There's probably some sane
>reason for that too (it's an odd encoding, maybe they code for the
>same thing), and there's nothing like that for USB, so...
>
>And if it was irq problems, I'd expect you to see it more for reads
>than for writes anyway. Along with a few messages about missed irqs
>and whatever.
>
>I'm stumped, and have no ideas. I can't even begin to guess how this
>would happen. One thing to try is if it happens for all USB ports (you
>have multiple controllers) and I assume performance doesn't come back
>if you unplug and replug the USB disk..

I've just plugged and unplugged my USB stick into all available hubs
(including a USB3 one, that is xhci_hcd) and I've got the same write speed
on all of them - around 930KB/sec (quite a weird number - as if I'm on USB
1.1) - lsusb says I'm happily running ehci_hcd/2p, 480M and xhci_hcd/2p,
5000M.

The only pattern that I see here is that write speed to real devices degrades,
tmpfs write speed stays the same:

$ dd if=/dev/zero of=test bs=32M count=32
32+0 records in
32+0 records out
1073741824 bytes (1.1 GB) copied, 0.296323 s, 3.6 GB/s
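
The figures in this dd output can be rechecked with a few lines of Python. This is only a sketch of the arithmetic; the helper name `throughput_gb_s` is illustrative, and it assumes dd reports GB in decimal units (10^9 bytes) while bs=32M means 32 MiB.

```python
def throughput_gb_s(num_bytes: int, seconds: float) -> float:
    """Throughput in decimal gigabytes (10**9 bytes) per second."""
    return num_bytes / seconds / 1e9


# bs=32M count=32: 32 blocks of 32 MiB each.
total_bytes = 32 * 32 * 1024 * 1024
print(total_bytes)                                        # bytes copied
print(round(throughput_gb_s(total_bytes, 0.296323), 1))   # GB/s
```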

Best regards,

Artem

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-12 20:13       ` Artem S. Tashkinov
@ 2013-02-13  4:26         ` Bjorn Helgaas
  2013-02-19 16:22           ` Alan Stern
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-02-13  4:26 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: torvalds, linux-kernel, linux-pci, Rafael J. Wysocki, Alan Stern

[+cc linux-pci, Rafael, Alan]

[https://bugzilla.kernel.org/show_bug.cgi?id=53551]

On Tue, Feb 12, 2013 at 1:13 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> Feb 13, 2013 01:32:53 AM, Linus Torvalds wrote:
> On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov wrote:
>>> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>>>>
>>>>A few things to try to pinpoint:
>>>>
>>>> (a) Is it *only* write performance that suffers, or is it other
>>>>performance too? Networking (DMA? Perhaps only writing *to* the
>>>>network?)? CPU?
>>>
>>> I've tested hdparm -tT --direct and the output on boot and after suspend
>>> is quite similar.
>>>
>>> I've also checked my network read/write speed, and it's the same
>>> ~ 100MBit/sec (I have no 1Gbit computers on my network
>>> unfortunately).
>>
>>Ok. So it really sounds like just USB and HD writes. Which is quite
>>odd, since they have basically nothing in common I can think of
>>(except the obvious block layer issues).
>>
>>>> (b) the fact that it apparently happens with both SATA and USB
>>>>implies that it's neither, and is more likely something core like
>>>>memory speed (mtrr, caching) or PCI (DMA, burst sizes, whatever).
>>>
>>> I've no idea, please, check my bug report where I've just added lots of
>>> information including a diff between on boot and after suspend.
>>
>>I'm not seeing anything particularly interesting there.
>>
>>Except why/how did the MSI address/data change for the SATA
>>controller? The irq itself hasn't changed.. There's probably some sane
>>reason for that too (it's an odd encoding, maybe they code for the
>>same thing), and there's nothing like that for USB, so...
>>
>>And if it was irq problems, I'd expect you to see it more for reads
>>than for writes anyway. Along with a few messages about missed irqs
>>and whatever.
>>
>>I'm stumped, and have no ideas. I can't even begin to guess how this
>>would happen. One thing to try is if it happens for all USB ports (you
>>have multiple controllers) and I assume performance doesn't come back
>>if you unplug and replug the USB disk..
>
> I've just plugged and unplugged my USB stick into all available hubs
> (including a USB3 one, that is xhci_hcd) and I've got the same write speed
> on all of them - around 930KB/sec (quite a weird number - as if I'm on USB
> 1.1) - lsusb says I'm happily running ehci_hcd/2p, 480M and xhci_hcd/2p,
> 5000M.
>
> The only pattern that I see here is that write speed to real devices degrades,
> tmpfs write speed stays the same:
>
> $ dd if=/dev/zero of=test bs=32M count=32
> 32+0 records in
> 32+0 records out
> 1073741824 bytes (1.1 GB) copied, 0.296323 s, 3.6 GB/s

I'm sort of stumped here, too.  For the SATA controller, the only
PCI-related difference I see is the change in the MSI address, which
should just change the target CPU, which doesn't seem like it should
make this much difference.  But could you try this after the resume:

    $ sudo setpci -s00:1f.2 0x84.L=0xfee0400c

to set the MSI address back to the original value to see if it makes a
difference?
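
To see what the MSI address value encodes, the following Python sketch decodes the fields of an x86 MSI address. It assumes the usual Intel layout (bits 31:20 fixed at 0xFEE, bits 19:12 destination ID, bit 3 redirection hint, bit 2 destination mode); how the destination ID maps to actual CPUs depends on the APIC mode, which the sketch does not model. The function name `decode_msi_address` is illustrative. Both address values that appear in this thread are decoded.

```python
def decode_msi_address(addr: int) -> dict:
    """Split an x86 MSI address into its documented fields."""
    return {
        # Upper 12 bits must be 0xFEE for a valid x86 MSI address.
        "is_msi_window": ((addr >> 20) & 0xFFF) == 0xFEE,
        # Bits 19:12 select the destination (target APIC ID or mask).
        "dest_id": (addr >> 12) & 0xFF,
        # Bit 3: redirection hint; bit 2: destination mode.
        "redirection_hint": (addr >> 3) & 1,
        "dest_mode": (addr >> 2) & 1,
    }


for value in (0xFEE0400C, 0xFEE0F00C, 0x00000000):
    print(hex(value), decode_msi_address(value))
```

Under these assumptions, the two non-zero values differ only in the destination ID field, which matches the point that the change should only affect the target CPU.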

The XHCI controllers both have Unsupported Request errors logged.  I
assume these are related to the suspend/resume, and it seems like we
ought to either avoid them or clean them up somehow, but I don't know
enough about AER, and I don't know whether they would cause the
performance issue you're seeing.

There should be more AER logging than is decoded by lspci, so can you
also collect the output of "lspci -vvv -xxxx"?  That will include the
raw logging registers that lspci doesn't decode.

Bjorn

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-13  4:26         ` Bjorn Helgaas
@ 2013-02-19 16:22           ` Alan Stern
  2013-02-25 21:57             ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Alan Stern @ 2013-02-19 16:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Artem S. Tashkinov, torvalds, linux-kernel, linux-pci, Rafael J. Wysocki

On Tue, 12 Feb 2013, Bjorn Helgaas wrote:

> [+cc linux-pci, Rafael, Alan]
> 
> [https://bugzilla.kernel.org/show_bug.cgi?id=53551]
> 
> On Tue, Feb 12, 2013 at 1:13 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> > Feb 13, 2013 01:32:53 AM, Linus Torvalds wrote:
> > On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov wrote:
> >>> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
> >>>>
> >>>>A few things to try to pinpoint:
> >>>>
> >>>> (a) Is it *only* write performance that suffers, or is it other
> >>>>performance too? Networking (DMA? Perhaps only writing *to* the
> >>>>network?)? CPU?
> >>>
> >>> I've tested hdparm -tT --direct and the output on boot and after suspend
> >>> is quite similar.
> >>>
> >>> I've also checked my network read/write speed, and it's the same
> >>> ~ 100MBit/sec (I have no 1Gbit computers on my network
> >>> unfortunately).
> >>
> >>Ok. So it really sounds like just USB and HD writes. Which is quite
> >>odd, since they have basically nothing in common I can think of
> >>(except the obvious block layer issues).

There's a slight chance that we might get some ideas by comparing
usbmon traces showing disk activity before and after the
problem-causing suspend.
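
One way such traces might be compared is by measuring how long each URB takes from submission to completion. The Python sketch below assumes the usbmon text format, in which the first three whitespace-separated fields of each line are the URB tag, a timestamp in microseconds, and the event type (S for submission, C for completion); it ignores every other field and does not handle timestamp wraparound. The function name and the sample lines are made up for illustration and are not real trace data.

```python
def urb_latencies_us(trace_lines):
    """Return submit-to-completion latencies (microseconds) per URB."""
    submitted = {}   # URB tag -> submission timestamp
    latencies = []
    for line in trace_lines:
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # skip blank or malformed lines
        tag, timestamp, event = fields[0], int(fields[1]), fields[2]
        if event == "S":
            submitted[tag] = timestamp
        elif event == "C" and tag in submitted:
            latencies.append(timestamp - submitted.pop(tag))
    return latencies


# Hypothetical lines, shaped like usbmon text output.
sample = [
    "ffff0001 1000 S Bo:1:002:2 -115 512 =",
    "ffff0001 1450 C Bo:1:002:2 0 512 >",
]
print(urb_latencies_us(sample))
```

Running this over one trace captured before the suspend and one captured after would show whether individual transfers became slower or whether the gaps between them grew.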

Alan Stern


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-19 16:22           ` Alan Stern
@ 2013-02-25 21:57             ` Bjorn Helgaas
  2013-02-26  6:35               ` Artem S. Tashkinov
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-02-25 21:57 UTC (permalink / raw)
  To: Alan Stern
  Cc: Artem S. Tashkinov, torvalds, linux-kernel, linux-pci, Rafael J. Wysocki

On Tue, Feb 19, 2013 at 9:22 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 12 Feb 2013, Bjorn Helgaas wrote:
>
>> [+cc linux-pci, Rafael, Alan]
>>
>> [https://bugzilla.kernel.org/show_bug.cgi?id=53551]
>>
>> On Tue, Feb 12, 2013 at 1:13 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>> > Feb 13, 2013 01:32:53 AM, Linus Torvalds wrote:
>> > On Tue, Feb 12, 2013 at 10:29 AM, Artem S. Tashkinov wrote:
>> >>> Feb 12, 2013 11:30:20 PM, Linus Torvalds wrote:
>> >>>>
>> >>>>A few things to try to pinpoint:
>> >>>>
>> >>>> (a) Is it *only* write performance that suffers, or is it other
>> >>>>performance too? Networking (DMA? Perhaps only writing *to* the
>> >>>>network?)? CPU?
>> >>>
>> >>> I've tested hdparm -tT --direct and the output on boot and after suspend
>> >>> is quite similar.
>> >>>
>> >>> I've also checked my network read/write speed, and it's the same
>> >>> ~ 100MBit/sec (I have no 1Gbit computers on my network
>> >>> unfortunately).
>> >>
>> >>Ok. So it really sounds like just USB and HD writes. Which is quite
>> >>odd, since they have basically nothing in common I can think of
>> >>(except the obvious block layer issues).
>
> There's a slight chance that we might get some ideas by comparing
> usbmon traces showing disk activity before and after the
> problem-causing suspend.

Where are we at with this, Artem?  I assume it's still a problem.

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-25 21:57             ` Bjorn Helgaas
@ 2013-02-26  6:35               ` Artem S. Tashkinov
  2013-02-26 18:46                 ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-02-26  6:35 UTC (permalink / raw)
  To: bhelgaas; +Cc: stern, torvalds, linux-kernel, linux-pci, rjw

Feb 26, 2013 03:57:52 AM, Bjorn Helgaas wrote:
>
>Where are we at with this, Artem?  I assume it's still a problem.
>

Yes, it is, Bjorn.

In order to eliminate this problem I switched back to MBR yesterday, because
so far I haven't received any instructions or guidance as to how I can debug
it further. I'm absolutely sure USB write speed is just another manifestation of
it so I decided not to debug USB specifically (it just doesn't make too much
sense).

What I see is that something terribly wrong is going on but if Linus has no ideas
I, as an average Joe, don't have a slightest clue as to what I can do.

The bug report with necessary, but seemingly useless information, can be 
found here: https://bugzilla.kernel.org/show_bug.cgi?id=53551

If anyone comes up with new ideas I can quickly try UEFI again now that I
have two HDDs at my disposal (the old one is formatted as GPT, the new one is
MBR).

Best regards,

Artem

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-26  6:35               ` Artem S. Tashkinov
@ 2013-02-26 18:46                 ` Bjorn Helgaas
  2013-02-26 19:14                   ` Artem S. Tashkinov
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-02-26 18:46 UTC (permalink / raw)
  To: Artem S. Tashkinov; +Cc: stern, torvalds, linux-kernel, linux-pci, rjw

On Mon, Feb 25, 2013 at 11:35 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> Feb 26, 2013 03:57:52 AM, Bjorn Helgaas wrote:
>>
>>Where are we at with this, Artem?  I assume it's still a problem.
>>
>
> Yes, it is, Bjorn.
>
> In order to eliminate this problem I switched back to MBR yesterday, because
> so far I haven't received any instructions or guidance as to how I can debug
> it further. I'm absolutely sure USB write speed is just another manifestation of
> it so I decided not to debug USB specifically (it just doesn't make too much
> sense).
>
> What I see is that something terribly wrong is going on but if Linus has no ideas
> I, as an average Joe, don't have a slightest clue as to what I can do.
>
> The bug report with necessary, but seemingly useless information, can be
> found here: https://bugzilla.kernel.org/show_bug.cgi?id=53551
>
> If anyone comes up with new ideas I can quickly try UEFI again now that I
> have two HDDs at my disposal (the old one is formatted as GPT, the new one is
> MBR).

The ideas I saw are:

1) Figure out whether it ever worked.  If an older kernel worked
correctly and a newer one is broken, bisection is at least a
possibility.  You mentioned that it did work before (Feb 12), but in
the past you never suspended twice in one boot session, whereas maybe
you did when seeing the problem?

2) Try the "setpci" to set the MSI address back to the original value
to see if it makes a difference (see my Feb 12 message).

3) Collect "lspci -vvv -xxxx" output to investigate the XHCI
Unsupported Request errors.

4) Use usbmon to collect traces before and after the suspend.

I googled around a bit looking for similar reports.  I found lots of
suspend issues, mostly with Windows, but no leads yet.  It looks like
the board has been around for a while, so you would think we'd have
some other reports of a problem this bad.  But maybe it really is
related to UEFI and nobody really uses that yet?

Bjorn

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-26 18:46                 ` Bjorn Helgaas
@ 2013-02-26 19:14                   ` Artem S. Tashkinov
  2013-03-07  0:17                     ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-02-26 19:14 UTC (permalink / raw)
  To: bhelgaas; +Cc: stern, torvalds, linux-kernel, linux-pci, rjw

Feb 27, 2013 12:47:01 AM, Bjorn Helgaas wrote:
On Mon, Feb 25, 2013 at 11:35 PM, Artem S. Tashkinov wrote:
>> Feb 26, 2013 03:57:52 AM, Bjorn Helgaas wrote:
>>>
>>>Where are we at with this, Artem?  I assume it's still a problem.
>>>
>>
>> Yes, it is, Bjorn.
>>
>> In order to eliminate this problem I switched back to MBR yesterday, because
>> so far I haven't received any instructions or guidance as to how I can debug
>> it further. I'm absolutely sure USB write speed is just another manifestation of
>> it so I decided not to debug USB specifically (it just doesn't make too much
>> sense).
>>
>> What I see is that something terribly wrong is going on but if Linus has no ideas
>> I, as an average Joe, don't have a slightest clue as to what I can do.
>>
>> The bug report with necessary, but seemingly useless information, can be
>> found here: https://bugzilla.kernel.org/show_bug.cgi?id=53551
>>
>> If anyone comes up with new ideas I can quickly try UEFI again now that I
>> have two HDDs at my disposal (the old one is formatted as GPT, the new one is
>> MBR).
>
>The ideas I saw are:
>
>1) Figure out whether it ever worked.  If an older kernel worked
>correctly and a newer one is broken, bisection is at least a
>possibility.  You mentioned that it did work before (Feb 12), but in
>the past you never suspended twice in one boot session, whereas maybe
>you did when seeing the problem?

This is difficult to say since the first kernel I tried to run in UEFI mode was
3.7.x, so I've no idea if any previous ones ever worked.

>
>2) Try the "setpci" to set the MSI address back to the original value
>to see if it makes a difference (see my Feb 12 message).

I will try it soon and report back to you.

>
>3) Collect "lspci -vvv -xxxx" output to investigate the XHCI
>Unsupported Request errors.
>
>4) Use usbmon to collect traces before and after the suspend.

Likewise. Still I don't quite understand why you are persistent in your
desire to investigate USB controllers specifically - my problem affects
all storage devices that I have.

>
>I googled around a bit looking for similar reports.  I found lots of
>suspend issues, mostly with Windows, but no leads yet.  It looks like
>the board has been around for a while, so you would think we'd have
>some other reports of a problem this bad.  But maybe it really is
>related to UEFI and nobody really uses that yet?

99% of people around me don't use UEFI, and the ones who use it do
it because they want to run Hackintosh (it's quite complicated to run
a UEFI OS from a non-UEFI BIOS).

That's the main reason you don't see similar reports. UEFI so far hasn't
proven its supremacy and efficiency over BIOS. When 3TB and larger
HDDs become more widespread people will have to use UEFI. They will
simply have no choice (unless of course you have two HDDs, where one
is BIOS formatted to boot your system, and another one is GPT
partitioned in order to support > 2,2TB space).
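
The roughly 2.2 TB ceiling mentioned above follows from the MBR partition table format. A minimal Python sketch of the arithmetic, assuming 32-bit sector addresses in the MBR and 512-byte logical sectors (drives with larger logical sectors would give a different limit):

```python
SECTOR_BYTES = 512        # assumed logical sector size
LBA_BITS = 32             # width of the MBR sector address fields

max_bytes = (2 ** LBA_BITS) * SECTOR_BYTES
print(max_bytes)                    # addressable bytes
print(round(max_bytes / 1e12, 1))   # decimal terabytes
print(max_bytes / 2 ** 40)          # binary tebibytes
```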

Best regards,

Artem

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-26 19:14                   ` Artem S. Tashkinov
@ 2013-03-07  0:17                     ` Bjorn Helgaas
  2013-04-26 21:36                       ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-03-07  0:17 UTC (permalink / raw)
  To: Artem S. Tashkinov; +Cc: stern, torvalds, linux-kernel, linux-pci, rjw

On Tue, Feb 26, 2013 at 12:14 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> Feb 27, 2013 12:47:01 AM, Bjorn Helgaas wrote:
> On Mon, Feb 25, 2013 at 11:35 PM, Artem S. Tashkinov wrote:
>>> Feb 26, 2013 03:57:52 AM, Bjorn Helgaas wrote:
>>>>
>>>>Where are we at with this, Artem?  I assume it's still a problem.
>>>>
>>>
>>> Yes, it is, Bjorn.
>>>
>>> In order to eliminate this problem I switched back to MBR yesterday, because
>>> so far I haven't received any instructions or guidance as to how I can debug
>>> it further. I'm absolutely sure USB write speed is just another manifestation of
>>> it so I decided not to debug USB specifically (it just doesn't make too much
>>> sense).
>>>
>>> What I see is that something terribly wrong is going on but if Linus has no ideas
>>> I, as an average Joe, don't have a slightest clue as to what I can do.
>>>
>>> The bug report with necessary, but seemingly useless information, can be
>>> found here: https://bugzilla.kernel.org/show_bug.cgi?id=53551
>>>
>>> If anyone comes up with new ideas I can quickly try UEFI again now that I
>>> have two HDDs at my disposal (the old one is formatted as GPT, the new one is
>>> MBR).
>>
>>The ideas I saw are:
>>
>>1) Figure out whether it ever worked.  If an older kernel worked
>>correctly and a newer one is broken, bisection is at least a
>>possibility.  You mentioned that it did work before (Feb 12), but in
>>the past you never suspended twice in one boot session, whereas maybe
>>you did when seeing the problem?
>
> This is difficult to say since the first kernel I tried to run in UEFI mode was
> 3.7.x, so I've no idea if any previous ones ever worked.
>
>>
>>2) Try the "setpci" to set the MSI address back to the original value
>>to see if it makes a difference (see my Feb 12 message).
>
> I will try it soon and report back to you.
>
>>
>>3) Collect "lspci -vvv -xxxx" output to investigate the XHCI
>>Unsupported Request errors.
>>
>>4) Use usbmon to collect traces before and after the suspend.
>
> Likewise. Still I don't quite understand why you are persistent in your
> desire to investigate USB controllers specifically - my problem affects
> all storage devices that I have.

Well, in the absence of good ideas about what's going on, I guess we
have to pursue even the bad ideas that don't seem like they'd be
related :)  Speaking of bad ideas, any news on 2) and 3) above?

You mentioned in the bugzilla that Windows complains about MTRRs being
changed across the S4 sleep state transition.  I don't think Linux
looks for such a change.  You could try looking at /proc/mtrr before
and after the suspend/resume to see if anything changed there.  It
looks like there's even support for *writing* the MTRRs via
/proc/mtrr, so if anything did change, you could also try changing it
back.
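
A comparison of /proc/mtrr before and after the suspend/resume could look roughly like the Python sketch below. It assumes each register line starts with a name such as "reg00" followed by a colon, and it treats the rest of the line as an opaque string, so it does not depend on the exact field layout. The function names and the sample snapshots are hypothetical, not data from this system.

```python
def parse_mtrr(text: str) -> dict:
    """Map register names (e.g. 'reg00') to their normalized settings."""
    regs = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip().startswith("reg"):
            # Collapse runs of whitespace so padding changes are ignored.
            regs[name.strip()] = " ".join(rest.split())
    return regs


def changed_mtrrs(before: str, after: str) -> dict:
    """Return {register: (before, after)} for registers that differ."""
    b, a = parse_mtrr(before), parse_mtrr(after)
    return {
        reg: (b.get(reg), a.get(reg))
        for reg in sorted(set(b) | set(a))
        if b.get(reg) != a.get(reg)
    }


# Hypothetical snapshots for illustration only.
before = (
    "reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back\n"
    "reg01: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable\n"
)
after = (
    "reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back\n"
    "reg01: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: write-combining\n"
)
print(changed_mtrrs(before, after))
```

Real snapshots would come from reading /proc/mtrr once after boot and once after resume; an empty result would suggest the MTRRs are not what changed.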

Bjorn

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-03-07  0:17                     ` Bjorn Helgaas
@ 2013-04-26 21:36                       ` Bjorn Helgaas
  2013-04-27 10:10                         ` Artem S. Tashkinov
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-04-26 21:36 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Alan Stern, Linus Torvalds, linux-kernel, linux-pci, Rafael J. Wysocki

On Wed, Mar 6, 2013 at 5:17 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Feb 26, 2013 at 12:14 PM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>> Feb 27, 2013 12:47:01 AM, Bjorn Helgaas wrote:
>> On Mon, Feb 25, 2013 at 11:35 PM, Artem S. Tashkinov wrote:
>>>> Feb 26, 2013 03:57:52 AM, Bjorn Helgaas wrote:
>>>>>
>>>>>Where are we at with this, Artem?  I assume it's still a problem.
>>>>>
>>>>
>>>> Yes, it is, Bjorn.
>>>>
>>>> In order to eliminate this problem I switched back to MBR yesterday, because
>>>> so far I haven't received any instructions or guidance as to how I can debug
>>>> it further. I'm absolutely sure USB write speed is just another manifestation of
>>>> it so I decided not to debug USB specifically (it just doesn't make too much
>>>> sense).
>>>>
>>>> What I see is that something terribly wrong is going on but if Linus has no ideas
>>>> I, as an average Joe, don't have the slightest clue as to what I can do.
>>>>
>>>> The bug report with necessary, but seemingly useless information, can be
>>>> found here: https://bugzilla.kernel.org/show_bug.cgi?id=53551
>>>>
>>>> If anyone comes up with new ideas I can quickly try UEFI again now that I
>>>> have two HDDs at my disposal (the old one is formatted as GPT, the new one is
>>>> MBR).
>>>
>>>The ideas I saw are:
>>>
>>>1) Figure out whether it ever worked.  If an older kernel worked
>>>correctly and a newer one is broken, bisection is at least a
>>>possibility.  You mentioned that it did work before (Feb 12), but in
>>>the past you never suspended twice in one boot session, whereas maybe
>>>you did when seeing the problem?
>>
>> This is difficult to say since the first kernel I tried to run in UEFI mode was
>> 3.7.x, so I've no idea if any previous ones ever worked.
>>
>>>
>>>2) Try the "setpci" to set the MSI address back to the original value
>>>to see if it makes a difference (see my Feb 12 message).
>>
>> I will try it soon and report back to you.
>>
>>>
>>>3) Collect "lspci -vvv -xxxx" output to investigate the XHCI
>>>Unsupported Request errors.
>>>
>>>4) Use usbmon to collect traces before and after the suspend.
>>
>> Likewise. Still I don't quite understand why you are persistent in your
>> desire to investigate USB controllers specifically - my problem affects
>> all storage devices that I have.
>
> Well, in the absence of good ideas about what's going on, I guess we
> have to pursue even the bad ideas that don't seem like they'd be
> related :)  Speaking of bad ideas, any news on 2) and 3) above?
>
> You mentioned in the bugzilla that Windows complains about MTRRs being
> changed across the S4 sleep state transition.  I don't think Linux
> looks for such a change.  You could try looking at /proc/mtrr before
> and after the suspend/resume to see if anything changed there.  It
> looks like there's even support for *writing* the MTRRs via
> /proc/mtrr, so if anything did change, you could also try changing it
> back.

Did this problem ever get resolved?

Bjorn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-04-26 21:36                       ` Bjorn Helgaas
@ 2013-04-27 10:10                         ` Artem S. Tashkinov
  2013-04-30  4:47                           ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-04-27 10:10 UTC (permalink / raw)
  To: bhelgaas; +Cc: stern, torvalds, linux-kernel, linux-pci, rjw

>
>Did this problem ever get resolved?
>

Hello,

Unfortunately, no. Out of curiosity I've tried booting kernel
3.9-rc8 in UEFI mode but it exhibits the same problem.

Right after the boot:

[root@localhost ~]# dd if=/dev/zero of=test bs=64M count=3
3+0 records in
3+0 records out
201326592 bytes (201 MB) copied, 1.08544 s, 185 MB/s

After suspend/resume:

# dd if=/dev/zero of=test bs=64M count=3
3+0 records in
3+0 records out
201326592 bytes (201 MB) copied, 66.5392 s, 3.0 MB/s

That's for my primary SATA-3 HDD.
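
The two dd runs above can be cross-checked with a little arithmetic. This
is only a minimal sketch: it uses nothing but the byte count and timings
printed by dd (which reports MB as 10^6 bytes), and the function name is
made up for illustration.

```python
def throughput_mb_s(num_bytes: int, seconds: float) -> float:
    """Return throughput in MB/s, with 1 MB = 10**6 bytes (dd's convention)."""
    return num_bytes / seconds / 1e6


# bs=64M count=3 -> 3 * 64 MiB, the 201326592 bytes that dd reports
BYTES = 3 * 64 * 1024 * 1024

before = throughput_mb_s(BYTES, 1.08544)  # right after boot
after = throughput_mb_s(BYTES, 66.5392)   # after suspend/resume

print(f"before: {before:.1f} MB/s, after: {after:.1f} MB/s, "
      f"slowdown: {before / after:.0f}x")
```

With these inputs the slowdown works out to roughly 61 times, noticeably
worse than the 17 times reported for the HDD in the first message.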

Forgive me my impudence but I believe debugging the USB stack is
tangential to this problem. Something far deeper than USB support
breaks, but so far no one has come up with even the slightest clue of
what that might be.

And like I mentioned before this problem doesn't affect Windows - once
I suspended it seven times in a row and it kept on chugging happily.

According to hdparm nothing changes after suspend/resume:

Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, no device specific minimum
        R/W multiple sector transfer: Max = 16  Current = ?
        Advanced power management level: disabled
        Recommended acoustic management value: 208, current value: 0
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns

3MB/sec matches PIO mode 0 which is ridiculous and implausible given
that this HDD is attached via SATA.

Besides hdparm says that:

# hdparm -tT --direct /dev/sda

/dev/sda:
 Timing O_DIRECT cached reads:   862 MB in  2.00 seconds = 430.77 MB/sec
 Timing O_DIRECT disk reads:  520 MB in  3.01 seconds = 173.03 MB/sec

So, only writes are affected.

My dmesg is here: http://ompldr.org/vaThpcA/dmesg

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-04-27 10:10                         ` Artem S. Tashkinov
@ 2013-04-30  4:47                           ` Bjorn Helgaas
  2013-05-01  4:19                             ` Robert Hancock
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-04-30  4:47 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Alan Stern, Linus Torvalds, linux-kernel, linux-pci, Rafael J. Wysocki

On Sat, Apr 27, 2013 at 4:10 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>>
>>Did this problem ever get resolved?
>>
>
> Hello,
>
> Unfortunately, no. Out of curiosity I've tried booting kernel
> 3.9-rc8 in UEFI mode but it exhibits the same problem.
>
> Right after the boot:
>
> [root@localhost ~]# dd if=/dev/zero of=test bs=64M count=3
> 3+0 records in
> 3+0 records out
> 201326592 bytes (201 MB) copied, 1.08544 s, 185 MB/s
>
> After suspend/resume:
>
> # dd if=/dev/zero of=test bs=64M count=3
> 3+0 records in
> 3+0 records out
> 201326592 bytes (201 MB) copied, 66.5392 s, 3.0 MB/s
>
> That's for my primary SATA-3 HDD.
>
> Forgive me my impudence but I believe debugging the USB stack is
> tangential to this problem. Something far deeper than USB support
> breaks, but so far no one has come up with even the slightest clue of
> what that might be.

I tend to agree that it sounds like something deeper than USB is
broken.  I admit I'm just grasping at straws because I don't have any
good ideas yet.

Here are three easy things you can try:

1) Collect "lspci -vvv -xxxx" output before and after the
suspend/resume to investigate the XHCI Unsupported Request errors.

2) Collect the contents of /proc/mtrr before and after the suspend/resume.

3) After the suspend/resume, try the "setpci" to set the MSI address
back to the original value to see if it makes a difference (see my Feb
12 message).
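
For item 2), one way to make the comparison mechanical is to save the
contents of /proc/mtrr before suspending and diff them against a second
read after resume. A minimal sketch, assuming Python 3.9 or later is
available on the test machine; the helper names are invented here and are
not part of any existing tool:

```python
import difflib
from pathlib import Path


def read_snapshot(path: str = "/proc/mtrr") -> list[str]:
    """Return the current contents of /proc/mtrr as a list of lines."""
    return Path(path).read_text().splitlines()


def diff_snapshots(before: list[str], after: list[str]) -> list[str]:
    """Return unified-diff lines; an empty list means nothing changed."""
    return list(difflib.unified_diff(before, after,
                                     fromfile="before-suspend",
                                     tofile="after-resume",
                                     lineterm=""))
```

Call read_snapshot() once before the suspend and once after the resume,
then attach the output of diff_snapshots() (or both raw snapshots) to the
bugzilla entry.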

Bjorn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-04-30  4:47                           ` Bjorn Helgaas
@ 2013-05-01  4:19                             ` Robert Hancock
  2013-05-07 15:25                               ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Robert Hancock @ 2013-05-01  4:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Artem S. Tashkinov, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki

On 04/29/2013 10:47 PM, Bjorn Helgaas wrote:
> On Sat, Apr 27, 2013 at 4:10 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>>>
>>> Did this problem ever get resolved?
>>>
>>
>> Hello,
>>
>> Unfortunately, no. Out of curiosity I've tried booting kernel
>> 3.9-rc8 in UEFI mode but it exhibits the same problem.
>>
>> Right after the boot:
>>
>> [root@localhost ~]# dd if=/dev/zero of=test bs=64M count=3
>> 3+0 records in
>> 3+0 records out
>> 201326592 bytes (201 MB) copied, 1.08544 s, 185 MB/s
>>
>> After suspend/resume:
>>
>> # dd if=/dev/zero of=test bs=64M count=3
>> 3+0 records in
>> 3+0 records out
>> 201326592 bytes (201 MB) copied, 66.5392 s, 3.0 MB/s
>>
>> That's for my primary SATA-3 HDD.
>>
>> Forgive me my impudence but I believe debugging the USB stack is
>> tangential to this problem. Something far deeper than USB support
>> breaks, but so far no one has come up with even the slightest clue of
>> what that might be.
>
> I tend to agree that it sounds like something deeper than USB is
> broken.  I admit I'm just grasping at straws because I don't have any
> good ideas yet.
>
> Here are three easy things you can try:
>
> 1) Collect "lspci -vvv -xxxx" output before and after the
> suspend/resume to investigate the XHCI Unsupported Request errors.
>
> 2) Collect the contents of /proc/mtrr before and after the suspend/resume.
>
> 3) After the suspend/resume, try the "setpci" to set the MSI address
> back to the original value to see if it makes a difference (see my Feb
> 12 message).

I would suspect that Windows' complaint about the BIOS mucking up the 
MTRRs is likely the best hint. Likely Windows is detecting the problem 
and fixing it up on resume, thus it only complains about "reduced resume 
performance". If the MTRRs are messed up, then quite likely parts of RAM 
have become uncacheable, causing performance to get randomly slaughtered 
in various ways.

From looking at the code it's not clear if we are checking/restoring
the MTRR contents after resume. If not, maybe we should be.
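
To see whether any region is marked uncachable, the lines of /proc/mtrr
can be parsed and filtered by memory type. The sketch below assumes the
usual "regNN: base=0x... (...MB), size=...MB, count=N: type" line layout;
the function names are hypothetical. An uncachable range over the PCI MMIO
hole below 4GB is normal, so only ranges that overlap real RAM would point
at the problem described above.

```python
import re

# One line of /proc/mtrr, e.g.
# "reg01: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable"
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(?P<size>\d+)(?P<unit>[KM])B, count=\d+: (?P<type>.+)"
)


def parse_mtrr(lines):
    """Parse /proc/mtrr lines into (reg, base, size_bytes, type) tuples."""
    entries = []
    for line in lines:
        m = MTRR_LINE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a register line
        scale = 1024 if m["unit"] == "K" else 1024 * 1024
        entries.append((int(m["reg"]), int(m["base"], 16),
                        int(m["size"]) * scale, m["type"].strip()))
    return entries


def uncachable_ranges(entries):
    """Return (start, end) byte ranges whose type is 'uncachable'."""
    return [(base, base + size)
            for _, base, size, mem_type in entries
            if mem_type == "uncachable"]
```

Feeding it open("/proc/mtrr").read().splitlines() before and after the
resume would show directly whether a new uncachable range has appeared.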

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-01  4:19                             ` Robert Hancock
@ 2013-05-07 15:25                               ` Bjorn Helgaas
  2013-05-07 15:59                                 ` Artem S. Tashkinov
  2013-05-07 16:12                                 ` Phillip Susi
  0 siblings, 2 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2013-05-07 15:25 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Artem S. Tashkinov, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki, Phillip Susi

[+cc Phillip]

On Tue, Apr 30, 2013 at 9:19 PM, Robert Hancock <hancockrwd@gmail.com> wrote:
> On 04/29/2013 10:47 PM, Bjorn Helgaas wrote:
>>
>> On Sat, Apr 27, 2013 at 4:10 AM, Artem S. Tashkinov <t.artem@lycos.com>
>> wrote:
>>>>
>>>>
>>>> Did this problem ever get resolved?
>>>>
>>>
>>> Hello,
>>>
>>> Unfortunately, no. Out of curiosity I've tried booting kernel
>>> 3.9-rc8 in UEFI mode but it exhibits the same problem.
>>>
>>> Right after the boot:
>>>
>>> [root@localhost ~]# dd if=/dev/zero of=test bs=64M count=3
>>> 3+0 records in
>>> 3+0 records out
>>> 201326592 bytes (201 MB) copied, 1.08544 s, 185 MB/s
>>>
>>> After suspend/resume:
>>>
>>> # dd if=/dev/zero of=test bs=64M count=3
>>> 3+0 records in
>>> 3+0 records out
>>> 201326592 bytes (201 MB) copied, 66.5392 s, 3.0 MB/s
>>>
>>> That's for my primary SATA-3 HDD.
>>>
>>> Forgive me my impudence but I believe debugging the USB stack is
>>> tangential to this problem. Something far deeper than USB support
>>> breaks, but so far no one has come up with even the slightest clue of
>>> what that might be.
>>
>>
>> I tend to agree that it sounds like something deeper than USB is
>> broken.  I admit I'm just grasping at straws because I don't have any
>> good ideas yet.
>>
>> Here are three easy things you can try:
>>
>> 1) Collect "lspci -vvv -xxxx" output before and after the
>> suspend/resume to investigate the XHCI Unsupported Request errors.
>>
>> 2) Collect the contents of /proc/mtrr before and after the suspend/resume.
>>
>> 3) After the suspend/resume, try the "setpci" to set the MSI address
>> back to the original value to see if it makes a difference (see my Feb
>> 12 message).
>
>
> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
> is likely the best hint. Likely Windows is detecting the problem and fixing
> it up on resume, thus it only complains about "reduced resume performance".
> If the MTRRs are messed up, then quite likely parts of RAM have become
> uncacheable, causing performance to get randomly slaughtered in various
> ways.
>
> From looking at the code it's not clear if we are checking/restoring the
> MTRR contents after resume. If not, maybe we should be.

I agree; the MTRR warning is a good hint.  Artem?

Phillip, I cc'd you because you have similar hardware and your
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
slightly similar.  Have you seen anything like this "reduced
performance after resume" issue?  If so, can you collect /proc/mtrr
contents before and after suspending?

Bjorn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 15:25                               ` Bjorn Helgaas
@ 2013-05-07 15:59                                 ` Artem S. Tashkinov
  2013-05-07 16:27                                   ` Bjorn Helgaas
  2013-05-07 19:05                                   ` Robert Hancock
  2013-05-07 16:12                                 ` Phillip Susi
  1 sibling, 2 replies; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-05-07 15:59 UTC (permalink / raw)
  To: bhelgaas; +Cc: hancockrwd, stern, torvalds, linux-kernel, linux-pci, rjw, psusi

May 7, 2013 09:25:40 PM, 	Bjorn Helgaas  wrote:
> [+cc Phillip]
>
>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>> is likely the best hint. Likely Windows is detecting the problem and fixing
>> it up on resume, thus it only complains about "reduced resume performance".
>> If the MTRRs are messed up, then quite likely parts of RAM have become
>> uncacheable, causing performance to get randomly slaughtered in various
>> ways.
>>
>> From looking at the code it's not clear if we are checking/restoring the
>> MTRR contents after resume. If not, maybe we should be.
>
>I agree; the MTRR warning is a good hint.  Artem?
>
>Phillip, I cc'd you because you have similar hardware and your
>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>slightly similar.  Have you seen anything like this "reduced
>performance after resume" issue?  If so, can you collect /proc/mtrr
>contents before and after suspending?
>

As Robert Hancock correctly noted, the Linux kernel lacks the code to check
for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)

Likewise there's no code to see if RAM pages have become uncacheable - i.e.
I've no idea how to check it either.

According to /proc/mtrr nothing changes on resume - only Windows detects
the discrepancy between MTRR regions on resume. dmesg contains no warnings
or errors (aside from the usual ACPI SATA warnings - but they happen right at
boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
exhibits a similar performance degradation).

In short, there's little to nothing that I can check.

That bug report has nothing to do with my problem - my PC suspends and
resumes more or less correctly - everything works (albeit some parts don't
work as they should). That person also has a very outdated BIOS - 1904 from
08/15/2011. I wouldn't be surprised if a BIOS update solved his problem.

Best regards,

Artem

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 15:25                               ` Bjorn Helgaas
  2013-05-07 15:59                                 ` Artem S. Tashkinov
@ 2013-05-07 16:12                                 ` Phillip Susi
  1 sibling, 0 replies; 33+ messages in thread
From: Phillip Susi @ 2013-05-07 16:12 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Robert Hancock, Artem S. Tashkinov, Alan Stern, Linus Torvalds,
	linux-kernel, linux-pci, Rafael J. Wysocki

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 5/7/2013 11:25 AM, Bjorn Helgaas wrote:
> Phillip, I cc'd you because you have similar hardware and your 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report
> is slightly similar.  Have you seen anything like this "reduced 
> performance after resume" issue?  If so, can you collect
> /proc/mtrr contents before and after suspending?

Nope, not seen that issue.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJRiSgCAAoJEJrBOlT6nu75e0IH/1tqoTRyAMfVgcWTfhdcSAVi
kBnvTpfGqlwD1ThxF3AZ+kHFPykI7TNUEvPR+syBFIi6BLHDoCZJMyCnKWwrY3jW
62lpgBZPZNejK+Yms3wjt6bZs81g38FKhWqm/IGruo7u79j/CS6puUypQMZ7WkC4
8y3SjBfiVy3ncQAOr7akCJzCv4fgqY+vtpIOHOXknfUxwgHqVOo3Pa0rMeat2TrN
8KHLkzYjML7Z+vN9DvPnqnRYFwFmkRZl01wRITo3OaFFJFhH70uqnwZ3ES+H0oh5
OpAJqEZ1PqiSAn+P7nI8RJ8lt7c5bta6Mvv0ev4+aDOQxnL3AmipOMNGfI37Ubk=
=fiiM
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 15:59                                 ` Artem S. Tashkinov
@ 2013-05-07 16:27                                   ` Bjorn Helgaas
  2013-05-07 18:50                                     ` Artem S. Tashkinov
  2013-05-07 19:05                                   ` Robert Hancock
  1 sibling, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-05-07 16:27 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Robert Hancock, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki, psusi

On Tue, May 7, 2013 at 8:59 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> May 7, 2013 09:25:40 PM,        Bjorn Helgaas  wrote:
>> [+cc Phillip]
>>
>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>> it up on resume, thus it only complains about "reduced resume performance".
>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>> uncacheable, causing performance to get randomly slaughtered in various
>>> ways.
>>>
>>> From looking at the code it's not clear if we are checking/restoring the
>>> MTRR contents after resume. If not, maybe we should be.
>>
>>I agree; the MTRR warning is a good hint.  Artem?
>>
>>Phillip, I cc'd you because you have similar hardware and your
>>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>>slightly similar.  Have you seen anything like this "reduced
>>performance after resume" issue?  If so, can you collect /proc/mtrr
>>contents before and after suspending?
>>
>
> As Robert Hancock correctly noted, the Linux kernel lacks the code to check
> for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)
>
> Likewise there's no code to see if RAM pages have become uncacheable - i.e.
> I've no idea how to check it either.
>
> According to /proc/mtrr nothing changes on resume - only Windows detects
> the discrepancy between MTRR regions on resume. dmesg contains no warnings
> or errors (aside from the usual ACPI SATA warnings - but they happen right at
> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
> exhibits a similar performance degradation).
>
> In short, there's little to nothing that I can check.

I'm not trying to be ungrateful, but maybe you could actually collect
the info we've asked for and attach it to the bugzilla.  It's hard for
me to get excited about digging into this when all I see is "nothing
changes in MTRR" and "it's probably not X."  I really need some
concrete data to help rule things out and suggest other things to
investigate.

Maybe we won't be able to make progress on this until other people
start hitting similar issues and we can find patterns.

Bjorn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 16:27                                   ` Bjorn Helgaas
@ 2013-05-07 18:50                                     ` Artem S. Tashkinov
  2013-05-07 18:54                                       ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-05-07 18:50 UTC (permalink / raw)
  To: bhelgaas; +Cc: hancockrwd, stern, torvalds, linux-kernel, linux-pci, rjw, psusi

May 7, 2013 10:27:30 PM, Bjorn Helgaas wrote:
On Tue, May 7, 2013 at 8:59 AM, Artem S. Tashkinov  wrote:
>> May 7, 2013 09:25:40 PM,        Bjorn Helgaas  wrote:
>>> [+cc Phillip]
>>>
>>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>>> it up on resume, thus it only complains about "reduced resume performance".
>>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>>> uncacheable, causing performance to get randomly slaughtered in various
>>>> ways.
>>>>
>>>> From looking at the code it's not clear if we are checking/restoring the
>>>> MTRR contents after resume. If not, maybe we should be.
>>>
>>>I agree; the MTRR warning is a good hint.  Artem?
>>>
>>>Phillip, I cc'd you because you have similar hardware and your
>>>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>>>slightly similar.  Have you seen anything like this "reduced
>>>performance after resume" issue?  If so, can you collect /proc/mtrr
>>>contents before and after suspending?
>>>
>>
>> As Robert Hancock correctly noted, the Linux kernel lacks the code to check
>> for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)
>>
>> Likewise there's no code to see if RAM pages have become uncacheable - i.e.
>> I've no idea how to check it either.
>>
>> According to /proc/mtrr nothing changes on resume - only Windows detects
>> the discrepancy between MTRR regions on resume. dmesg contains no warnings
>> or errors (aside from the usual ACPI SATA warnings - but they happen right at
>> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
>> exhibits a similar performance degradation).
>>
>> In short, there's little to nothing that I can check.
>
>I'm not trying to be ungrateful, but maybe you could actually collect
>the info we've asked for and attach it to the bugzilla.  It's hard for
>me to get excited about digging into this when all I see is "nothing
>changes in MTRR" and "it's probably not X."  I really need some
>concrete data to help rule things out and suggest other things to
>investigate.
>
>Maybe we won't be able to make progress on this until other people
>start hitting similar issues and we can find patterns.

The pattern is very easy to spot - Linus once said that desktop PCs are
not meant to work properly with suspend. That's kinda strange to me
as I have yet to encounter a PC where Windows fails to work properly
after resume - maybe I'm lucky - who knows.

Considering that only a few people use Linux, that most Linux
users avoid UEFI, and that very few of them actually use suspend/resume,
it is easy to understand why such bug reports are vanishingly
rare.

Asus themselves could have easily debugged this issue if they were
slightly interested in fixing it, yet their policy is that they only support
Windows, and Linux is not their concern.

Best regards

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 18:50                                     ` Artem S. Tashkinov
@ 2013-05-07 18:54                                       ` Bjorn Helgaas
  0 siblings, 0 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2013-05-07 18:54 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Robert Hancock, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki, Phillip Susi

On Tue, May 7, 2013 at 11:50 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> May 7, 2013 10:27:30 PM, Bjorn Helgaas wrote:
> On Tue, May 7, 2013 at 8:59 AM, Artem S. Tashkinov  wrote:
>>> May 7, 2013 09:25:40 PM,        Bjorn Helgaas  wrote:
>>>> [+cc Phillip]
>>>>
>>>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>>>> it up on resume, thus it only complains about "reduced resume performance".
>>>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>>>> uncacheable, causing performance to get randomly slaughtered in various
>>>>> ways.
>>>>>
>>>>> From looking at the code it's not clear if we are checking/restoring the
>>>>> MTRR contents after resume. If not, maybe we should be.
>>>>
>>>>I agree; the MTRR warning is a good hint.  Artem?
>>>>
>>>>Phillip, I cc'd you because you have similar hardware and your
>>>>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>>>>slightly similar.  Have you seen anything like this "reduced
>>>>performance after resume" issue?  If so, can you collect /proc/mtrr
>>>>contents before and after suspending?
>>>>
>>>
>>> As Robert Hancock correctly noted, the Linux kernel lacks the code to check
>>> for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)
>>>
>>> Likewise there's no code to see if RAM pages have become uncacheable - i.e.
>>> I've no idea how to check it either.
>>>
>>> According to /proc/mtrr nothing changes on resume - only Windows detects
>>> the discrepancy between MTRR regions on resume. dmesg contains no warnings
>>> or errors (aside from the usual ACPI SATA warnings - but they happen right at
>>> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
>>> exhibits a similar performance degradation).
>>>
>>> In short, there's little to nothing that I can check.
>>
>>I'm not trying to be ungrateful, but maybe you could actually collect
>>the info we've asked for and attach it to the bugzilla.  It's hard for
>>me to get excited about digging into this when all I see is "nothing
>>changes in MTRR" and "it's probably not X."  I really need some
>>concrete data to help rule things out and suggest other things to
>>investigate.
>>
>>Maybe we won't be able to make progress on this until other people
>>start hitting similar issues and we can find patterns.
>
> The pattern is very easy to spot - Linus once said that desktop PCs are
> not meant to work properly with suspend. That's kinda strange to me
> as I have yet to encounter a PC where Windows fails to work properly
> after resume - maybe I'm lucky - who knows.
>
> Considering that only a few people use Linux, that most Linux
> users avoid UEFI, and that very few of them actually use suspend/resume,
> it is easy to understand why such bug reports are vanishingly
> rare.
>
> Asus themselves could have easily debugged this issue if they were
> slightly interested in fixing it, yet their policy is that they only support
> Windows, and Linux is not their concern.

I can't intuit what the problem is.  If you or others can collect
data, we can try to fix this.  Otherwise, we can't.

Bjorn

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 15:59                                 ` Artem S. Tashkinov
  2013-05-07 16:27                                   ` Bjorn Helgaas
@ 2013-05-07 19:05                                   ` Robert Hancock
  2013-05-07 20:20                                     ` Bjorn Helgaas
  1 sibling, 1 reply; 33+ messages in thread
From: Robert Hancock @ 2013-05-07 19:05 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: Bjorn Helgaas, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki, psusi

On Tue, May 7, 2013 at 9:59 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
> May 7, 2013 09:25:40 PM,        Bjorn Helgaas  wrote:
>> [+cc Phillip]
>>
>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>> it up on resume, thus it only complains about "reduced resume performance".
>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>> uncacheable, causing performance to get randomly slaughtered in various
>>> ways.
>>>
>>> From looking at the code it's not clear if we are checking/restoring the
>>> MTRR contents after resume. If not, maybe we should be.
>>
>>I agree; the MTRR warning is a good hint.  Artem?
>>
>>Phillip, I cc'd you because you have similar hardware and your
>>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>>slightly similar.  Have you seen anything like this "reduced
>>performance after resume" issue?  If so, can you collect /proc/mtrr
>>contents before and after suspending?
>>
>
> As Robert Hancock correctly noted, the Linux kernel lacks the code to check
> for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)
>
> Likewise there's no code to see if RAM pages have become uncacheable - i.e.
> I've no idea how to check it either.
>
> According to /proc/mtrr nothing changes on resume - only Windows detects
> the discrepancy between MTRR regions on resume. dmesg contains no warnings
> or errors (aside from the usual ACPI SATA warnings - but they happen right at
> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
> exhibits a similar performance degradation).

I'm not sure if reading /proc/mtrr actually reads the registers out of
the CPU each time, or whether we just return the cached values we read
out during initial boot-up. If the latter, then this output isn't
really useful as there's no guarantee the values are still intact.

>
> In short, there's little to nothing that I can check.
>
> That bug report has nothing to do with my problem - my PC suspends and
> resumes more or less correctly - everything works (albeit some parts don't
> work as they should). That person also has a very outdated BIOS - 1904 from
> 08/15/2011. I wouldn't be surprised if a BIOS update solved his problem.
>
> Best regards,
>
> Artem

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 19:05                                   ` Robert Hancock
@ 2013-05-07 20:20                                     ` Bjorn Helgaas
  2013-05-07 21:48                                       ` Patrik Jakobsson
  0 siblings, 1 reply; 33+ messages in thread
From: Bjorn Helgaas @ 2013-05-07 20:20 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Artem S. Tashkinov, Alan Stern, Linus Torvalds, linux-kernel,
	linux-pci, Rafael J. Wysocki, Phillip Susi

On Tue, May 7, 2013 at 12:05 PM, Robert Hancock <hancockrwd@gmail.com> wrote:
> On Tue, May 7, 2013 at 9:59 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>> May 7, 2013 09:25:40 PM,        Bjorn Helgaas  wrote:
>>> [+cc Phillip]
>>>
>>>> I would suspect that Windows' complaint about the BIOS mucking up the MTRRs
>>>> is likely the best hint. Likely Windows is detecting the problem and fixing
>>>> it up on resume, thus it only complains about "reduced resume performance".
>>>> If the MTRRs are messed up, then quite likely parts of RAM have become
>>>> uncacheable, causing performance to get randomly slaughtered in various
>>>> ways.
>>>>
>>>> From looking at the code it's not clear if we are checking/restoring the
>>>> MTRR contents after resume. If not, maybe we should be.
>>>
>>>I agree; the MTRR warning is a good hint.  Artem?
>>>
>>>Phillip, I cc'd you because you have similar hardware and your
>>>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1131468 report is
>>>slightly similar.  Have you seen anything like this "reduced
>>>performance after resume" issue?  If so, can you collect /proc/mtrr
>>>contents before and after suspending?
>>>
>>
>> As Robert Hancock correctly noted, the Linux kernel lacks the code to check
>> for MTRR changes after resume - I'm not enough of a kernel hacker to write such code ;-)
>>
>> Likewise there's no code to see if RAM pages have become uncacheable - i.e.
>> I've no idea how to check it either.
>>
>> According to /proc/mtrr nothing changes on resume - only Windows detects
>> the discrepancy between MTRR regions on resume. dmesg contains no warnings
>> or errors (aside from the usual ACPI SATA warnings - but they happen right at
>> boot - so I highly doubt the ACPI or SATA layers can be the culprit, since USB
>> exhibits a similar performance degradation).
>
> I'm not sure if reading /proc/mtrr actually reads the registers out of
> the CPU each time, or whether we just return the cached values we read
> out during initial boot-up. If the latter, then this output isn't
> really useful as there's no guarantee the values are still intact.

Good point.  From what I can tell, on Artem's system with "CPU0:
Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
generic_mtrr_ops, and generic_get_mtrr() appears to read from the
MSRs, so I think it should be useful.
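
To compare the MTRR layout across a suspend cycle, something like the
following sketch could be used.  It only assumes that /proc/mtrr is
readable; the function names, the MTRR_FILE override and the snapshot
paths are made up for illustration.

```shell
#!/bin/sh
# mtrr_snapshot OUT
# Copy the current MTRR listing to OUT.  MTRR_FILE is a hypothetical
# override (default /proc/mtrr) so the helper can be tried on a saved file.
mtrr_snapshot() {
    cat "${MTRR_FILE:-/proc/mtrr}" > "$1"
}

# mtrr_compare BEFORE AFTER
# Print a unified diff of two snapshots.
# Returns 0 if they are identical, 1 if they differ.
mtrr_compare() {
    if diff -u "$1" "$2"; then
        echo "MTRRs unchanged"
    else
        echo "MTRRs differ"
        return 1
    fi
}

# Usage:
#   mtrr_snapshot /tmp/mtrr.before
#   ... suspend and resume ...
#   mtrr_snapshot /tmp/mtrr.after
#   mtrr_compare /tmp/mtrr.before /tmp/mtrr.after
```

An empty diff would at least show that the values visible through
/proc/mtrr are the same before and after resume.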

Bjorn


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 20:20                                     ` Bjorn Helgaas
@ 2013-05-07 21:48                                       ` Patrik Jakobsson
  2013-05-07 22:02                                         ` Bjorn Helgaas
  0 siblings, 1 reply; 33+ messages in thread
From: Patrik Jakobsson @ 2013-05-07 21:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Robert Hancock, Artem S. Tashkinov, Alan Stern, Linus Torvalds,
	linux-kernel, linux-pci, Rafael J. Wysocki, Phillip Susi

On Tue, May 7, 2013 at 10:20 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> I'm not sure if reading /proc/mtrr actually reads the registers out of
>> the CPU each time, or whether we just return the cached values we read
>> out during initial boot-up. If the latter, then this output isn't
>> really useful as there's no guarantee the values are still intact.
>
> Good point.  From what I can tell, on Artem's system with "CPU0:
> Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
> generic_mtrr_ops, and generic_get_mtrr() appears to read from the
> MSRs, so I think it should be useful.

FWIW, that motherboard suffers from a PCI to PCIE bridge problem. It might
have been fixed by BIOS upgrades by now, but I'm not sure.

It might also suffer (depending on the revision) from the Sandy Bridge SATA
issue. If it is affected, the SATA controller is a ticking bomb.

I have a P8H67-V motherboard but I haven't seen any suspend-related issues.

If this is totally unrelated I'm sorry for wasting your time. Just thought it
might be good to know.

Thanks
Patrik Jakobsson


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 21:48                                       ` Patrik Jakobsson
@ 2013-05-07 22:02                                         ` Bjorn Helgaas
  2013-05-07 22:25                                           ` Patrik Jakobsson
  2013-05-08  8:31                                           ` Artem S. Tashkinov
  0 siblings, 2 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2013-05-07 22:02 UTC (permalink / raw)
  To: Patrik Jakobsson
  Cc: Robert Hancock, Artem S. Tashkinov, Alan Stern, Linus Torvalds,
	linux-kernel, linux-pci, Rafael J. Wysocki, Phillip Susi

On Tue, May 7, 2013 at 2:48 PM, Patrik Jakobsson
<patrik.r.jakobsson@gmail.com> wrote:
> On Tue, May 7, 2013 at 10:20 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> I'm not sure if reading /proc/mtrr actually reads the registers out of
>>> the CPU each time, or whether we just return the cached values we read
>>> out during initial boot-up. If the latter, then this output isn't
>>> really useful as there's no guarantee the values are still intact.
>>
>> Good point.  From what I can tell, on Artem's system with "CPU0:
>> Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
>> generic_mtrr_ops, and generic_get_mtrr() appears to read from the
>> MSRs, so I think it should be useful.
>
> FWIW, that motherboard suffers from a PCI to PCIE bridge problem. It might
> have been fixed by bios upgrades by now but not sure.
>
> It might also suffer (depending on the revision) from the Sandy bridge SATA
> issue. So if affected, SATA controller is a ticking bomb.
>
> I have a P8H67-V motherboard but I haven't seen any suspend related issues.
>
> If this is totally unrelated I'm sorry for wasting your time. Just thought it
> might be good to know.

Thanks for chiming in.  I'm not familiar with either of the issues you
mentioned.  Do you have any references where I could read up on them?

Artem's system has a PCIe-to-PCI bridge (not a PCI-to-PCIe bridge) at
05:00.0, but it leads to [bus 06] and there's nothing on bus 06, so I
don't think that's the problem.

And the issue affects both USB and a hard drive, so I suspect it's
more than just SATA.  Artem, did you identify the PCI devices leading
to your USB and hard drive?  I can't remember if I've actually seen
that.

Bjorn


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 22:02                                         ` Bjorn Helgaas
@ 2013-05-07 22:25                                           ` Patrik Jakobsson
  2013-05-08  8:37                                             ` Artem S. Tashkinov
  2013-05-08  8:31                                           ` Artem S. Tashkinov
  1 sibling, 1 reply; 33+ messages in thread
From: Patrik Jakobsson @ 2013-05-07 22:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Robert Hancock, Artem S. Tashkinov, Alan Stern, Linus Torvalds,
	linux-kernel, linux-pci, Rafael J. Wysocki, Phillip Susi

On Wed, May 8, 2013 at 12:02 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, May 7, 2013 at 2:48 PM, Patrik Jakobsson
> <patrik.r.jakobsson@gmail.com> wrote:
>> On Tue, May 7, 2013 at 10:20 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>>> I'm not sure if reading /proc/mtrr actually reads the registers out of
>>>> the CPU each time, or whether we just return the cached values we read
>>>> out during initial boot-up. If the latter, then this output isn't
>>>> really useful as there's no guarantee the values are still intact.
>>>
>>> Good point.  From what I can tell, on Artem's system with "CPU0:
>>> Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
>>> generic_mtrr_ops, and generic_get_mtrr() appears to read from the
>>> MSRs, so I think it should be useful.
>>
>> FWIW, that motherboard suffers from a PCI to PCIE bridge problem. It might
>> have been fixed by bios upgrades by now but not sure.
>>
>> It might also suffer (depending on the revision) from the Sandy bridge SATA
>> issue. So if affected, SATA controller is a ticking bomb.
>>
>> I have a P8H67-V motherboard but I haven't seen any suspend related issues.
>>
>> If this is totally unrelated I'm sorry for wasting your time. Just thought it
>> might be good to know.
>
> Thanks for chiming in.  I'm not familiar with either of the issues you
> mentioned.  Do you have any references where I could read up on them?

I think this is the official statement from Intel on the SATA issue:
http://newsroom.intel.com/community/intel_newsroom/blog/2011/01/31/intel-identifies-chipset-design-error-implementing-solution

And here's a link to a discussion about the PCIe-to-PCI bridge stuff:
https://lkml.org/lkml/2012/1/30/216

> Artem's system has a PCIe-to-PCI bridge (not a PCI-to-PCIe bridge) at
> 05:00.0, but it leads to [bus 06] and there's nothing on bus 06, so I
> don't think that's the problem.

I meant what you said ;) and yes, it seems unrelated. Both my P8H67 and a
P8P67 I've built behave nicely if nothing is connected.

> And the issue affects both USB and a hard drive, so I suspect it's
> more than just SATA.  Artem, did you identify the PCI devices leading
> to your USB and hard drive?  I can't remember if I've actually seen
> that.


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 22:02                                         ` Bjorn Helgaas
  2013-05-07 22:25                                           ` Patrik Jakobsson
@ 2013-05-08  8:31                                           ` Artem S. Tashkinov
  1 sibling, 0 replies; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-05-08  8:31 UTC (permalink / raw)
  To: bhelgaas
  Cc: patrik.r.jakobsson, hancockrwd, stern, torvalds, linux-kernel,
	linux-pci, rjw, psusi

May 8, 2013 04:03:18 AM, Bjorn Helgaas wrote:
> On Tue, May 7, 2013 at 2:48 PM, Patrik Jakobsson wrote:
>> On Tue, May 7, 2013 at 10:20 PM, Bjorn Helgaas wrote:
>>>> I'm not sure if reading /proc/mtrr actually reads the registers out of
>>>> the CPU each time, or whether we just return the cached values we read
>>>> out during initial boot-up. If the latter, then this output isn't
>>>> really useful as there's no guarantee the values are still intact.
>>>
>>> Good point.  From what I can tell, on Artem's system with "CPU0:
>>> Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
>>> generic_mtrr_ops, and generic_get_mtrr() appears to read from the
>>> MSRs, so I think it should be useful.
>>
>> FWIW, that motherboard suffers from a PCI to PCIE bridge problem. It might
>> have been fixed by bios upgrades by now but not sure.
>>
>> It might also suffer (depending on the revision) from the Sandy bridge SATA
>> issue. So if affected, SATA controller is a ticking bomb.
>>
>> I have a P8H67-V motherboard but I haven't seen any suspend related issues.
>>
>> If this is totally unrelated I'm sorry for wasting your time. Just thought it
>> might be good to know.
>
>Thanks for chiming in.  I'm not familiar with either of the issues you
>mentioned.  Do you have any references where I could read up on them?
>
>Artem's system has a PCIe-to-PCI bridge (not a PCI-to-PCIe bridge) at
>05:00.0, but it leads to [bus 06] and there's nothing on bus 06, so I
>don't think that's the problem.
>
>And the issue affects both USB and a hard drive, so I suspect it's
>more than just SATA.  Artem, did you identify the PCI devices leading
>to your USB and hard drive?  I can't remember if I've actually seen
>that.

I posted my lspci information here https://bugzilla.kernel.org/show_bug.cgi?id=53551

If that's not enough, please tell me how I can collect this information.

The SATA issue is discussed here: https://bugzilla.kernel.org/show_bug.cgi?id=43229

According to Intel and Linux kernel developers it poses no threat.

Best regards,

Artem


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-07 22:25                                           ` Patrik Jakobsson
@ 2013-05-08  8:37                                             ` Artem S. Tashkinov
  2013-05-08  8:54                                               ` Patrik Jakobsson
  2013-05-08 13:43                                               ` Phillip Susi
  0 siblings, 2 replies; 33+ messages in thread
From: Artem S. Tashkinov @ 2013-05-08  8:37 UTC (permalink / raw)
  To: patrik.r.jakobsson
  Cc: bhelgaas, hancockrwd, stern, torvalds, linux-kernel, linux-pci,
	rjw, psusi

May 8, 2013 04:25:43 AM, Patrik Jakobsson wrote:
> On Wed, May 8, 2013 at 12:02 AM, Bjorn Helgaas wrote:
>> On Tue, May 7, 2013 at 2:48 PM, Patrik Jakobsson wrote:
>>> On Tue, May 7, 2013 at 10:20 PM, Bjorn Helgaas wrote:
>>>>> I'm not sure if reading /proc/mtrr actually reads the registers out of
>>>>> the CPU each time, or whether we just return the cached values we read
>>>>> out during initial boot-up. If the latter, then this output isn't
>>>>> really useful as there's no guarantee the values are still intact.
>>>>
>>>> Good point.  From what I can tell, on Artem's system with "CPU0:
>>>> Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz," we would be using
>>>> generic_mtrr_ops, and generic_get_mtrr() appears to read from the
>>>> MSRs, so I think it should be useful.
>>>
>>> FWIW, that motherboard suffers from a PCI to PCIE bridge problem. It might
>>> have been fixed by bios upgrades by now but not sure.
>>>
>>> It might also suffer (depending on the revision) from the Sandy bridge SATA
>>> issue. So if affected, SATA controller is a ticking bomb.
>>>
>>> I have a P8H67-V motherboard but I haven't seen any suspend related issues.
>>>
>>> If this is totally unrelated I'm sorry for wasting your time. Just thought it
>>> might be good to know.
>>
>> Thanks for chiming in.  I'm not familiar with either of the issues you
>> mentioned.  Do you have any references where I could read up on them?
>
>I think this is the official statement from Intel on the SATA issue:
>http://newsroom.intel.com/community/intel_newsroom/blog/2011/01/31/intel-identifies-chipset-design-error-implementing-solution

My motherboard is the fixed B3 revision, so this issue doesn't affect me.
Besides, that SATA port degradation issue is present all the time; it has
nothing to do with suspend.

>
>And here's a link to a discussion about the PCIe-to-PCI bridge stuff:
>https://lkml.org/lkml/2012/1/30/216
>
>> Artem's system has a PCIe-to-PCI bridge (not a PCI-to-PCIe bridge) at
>> 05:00.0, but it leads to [bus 06] and there's nothing on bus 06, so I
>> don't think that's the problem.
>
>I meant what you said ;) and yes, it seems unrelated. Both my P8H67 and a
>P8P67 I've built behave nicely if nothing is connected.

Have you tried suspending more than three times? Without UEFI boot, this
bug emerges only on the third or even fourth resume attempt. With UEFI
boot, it is triggered immediately on the first resume.

>> And the issue affects both USB and a hard drive, so I suspect it's
>> more than just SATA.  Artem, did you identify the PCI devices leading
>> to your USB and hard drive?  I can't remember if I've actually seen
>> that.


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-08  8:37                                             ` Artem S. Tashkinov
@ 2013-05-08  8:54                                               ` Patrik Jakobsson
  2013-05-08 13:43                                               ` Phillip Susi
  1 sibling, 0 replies; 33+ messages in thread
From: Patrik Jakobsson @ 2013-05-08  8:54 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: bhelgaas, hancockrwd, stern, torvalds, linux-kernel, linux-pci,
	rjw, psusi

On Wed, May 8, 2013 at 10:37 AM, Artem S. Tashkinov <t.artem@lycos.com> wrote:
>>I think this is the official statement from Intel on the SATA issue:
>>http://newsroom.intel.com/community/intel_newsroom/blog/2011/01/31/intel-identifies-chipset-design-error-implementing-solution
>
> My motherboard has a new fixed B3 revision so this issue doesn't affect me.
> Besides this SATA ports degradation issue is constantly present - it has no
> relationship to suspend.

Yes, Rev. B3 should be fine.

>>And here's a link to a discussion about the PCIe-to-PCI bridge stuff:
>>https://lkml.org/lkml/2012/1/30/216
>>
>>> Artem's system has a PCIe-to-PCI bridge (not a PCI-to-PCIe bridge) at
>>> 05:00.0, but it leads to [bus 06] and there's nothing on bus 06, so I
>>> don't think that's the problem.
>>
>>I meant what you said ;) and yes, it seems unrelated. Both my P8H67 and a
>>P8P67 I've built behave nicely if nothing is connected.
>
> Have you tried suspending more than three times? In the absence of UEFI
> boot this bug emerges only on a third or even fourth resume attempt. UEFI
> boot triggers it immediately on a first resume though.

I haven't enabled UEFI boot but did ~10 suspend/resume cycles with no issues.
I'm on 3.9-rc5 if that makes a difference. I'll do some more testing with
various kernel versions to see if I can trigger it.
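
For anyone who wants to repeat this kind of test, repeated cycles could
be scripted roughly as below.  This is only a sketch: it assumes
util-linux's rtcwake is installed, that it is run as root, and that the
RTC alarm can wake the machine; the function name suspend_cycles and the
30 second default are arbitrary.

```shell
#!/bin/sh
# suspend_cycles N [SECONDS]
# Suspend to RAM N times, waking via the RTC alarm SECONDS later each
# time (30 by default).  Stops early if rtcwake reports a failure.
suspend_cycles() {
    cycles=$1
    secs=${2:-30}
    i=1
    while [ "$i" -le "$cycles" ]; do
        echo "cycle $i of $cycles"
        rtcwake -m mem -s "$secs" || return 1
        i=$((i + 1))
    done
}

# Usage (as root): run four cycles, then repeat the write test.
#   suspend_cycles 4
```

A write test can be rerun after each batch of cycles to see at which
resume the slowdown first appears.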

-Patrik


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-05-08  8:37                                             ` Artem S. Tashkinov
  2013-05-08  8:54                                               ` Patrik Jakobsson
@ 2013-05-08 13:43                                               ` Phillip Susi
  1 sibling, 0 replies; 33+ messages in thread
From: Phillip Susi @ 2013-05-08 13:43 UTC (permalink / raw)
  To: Artem S. Tashkinov
  Cc: patrik.r.jakobsson, bhelgaas, hancockrwd, stern, torvalds,
	linux-kernel, linux-pci, rjw

On 5/8/2013 4:37 AM, Artem S. Tashkinov wrote:
> Have you tried suspending more than three times? In the absence of
> UEFI boot this bug emerges only on a third or even fourth resume
> attempt. UEFI boot triggers it immediately on a first resume
> though.

I suspend my P8P67 every night.  One thing that does come to mind now
though, is that when I first built it, there was a problem involving
suspend and the firewire driver, but IIRC, it manifested as a failure
to suspend with an error in dmesg, so I disabled the firewire
controller since I have no use for it.


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-02-12 17:29 ` Abysmal HDD/USB write speed after sleep on a UEFI system Linus Torvalds
  2013-02-12 18:29   ` Artem S. Tashkinov
@ 2013-07-10 17:25   ` hyphop
  2013-07-10 20:50     ` Bjorn Helgaas
  1 sibling, 1 reply; 33+ messages in thread
From: hyphop @ 2013-07-10 17:25 UTC (permalink / raw)
  To: linux-kernel

Hello,

I have the same problem: low write speed after system sleep.

Kernel 3.9.9.

I see it on both SATA HDDs and USB disks.

I tried another test.

Before sleep I created the file /tmp/test (/tmp is mounted as tmpfs with
size=8G; I have 16G of memory in my system):

dd if=/dev/zero bs=1M count=1000 of=/tmp/test
 ~ 3.2 GB/s
cryptsetup luksFormat /tmp/test
cryptsetup luksOpen /tmp/test test

dd if=/dev/zero bs=1M count=1000 of=/dev/mapper/test
 ~ 465 GB/s, good speed

After sleep:

dd if=/dev/zero bs=1M count=1000 of=/dev/mapper/test
 ~ 5 MB/s, very slow

But if I write directly to /tmp, I see:

dd if=/dev/zero bs=1M count=1000 of=/tmp/test2
 ~ 3.2 GB/s

So this does not look like a hardware problem (not SATA or USB); I think it
is a kernel bug. I don't have this problem on the previous kernel, 3.4.
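
The dd runs above can be wrapped in a small helper so that the same
measurement is repeated before and after sleep.  This is a sketch that
assumes GNU dd; the function name write_test is made up, and conv=fsync
is an addition to the commands above that makes dd flush the data to the
target before it reports a rate.

```shell
#!/bin/sh
# write_test FILE [COUNT_MB]
# Write COUNT_MB mebibytes of zeros (1000 by default) to FILE and print
# the last line of dd's report, which GNU dd uses for the transfer rate.
write_test() {
    target=$1
    count=${2:-1000}
    dd if=/dev/zero of="$target" bs=1M count="$count" conv=fsync 2>&1 | tail -n 1
}

# Usage:
#   write_test /dev/mapper/test
#   ... suspend and resume ...
#   write_test /dev/mapper/test
```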

Best regards, 

Tema






--
View this message in context: http://linux-kernel.2935.n7.nabble.com/Abysmal-HDD-USB-write-speed-after-sleep-on-a-UEFI-system-tp598586p681610.html
Sent from the Linux Kernel mailing list archive at Nabble.com.


* Re: Abysmal HDD/USB write speed after sleep on a UEFI system
  2013-07-10 17:25   ` hyphop
@ 2013-07-10 20:50     ` Bjorn Helgaas
  0 siblings, 0 replies; 33+ messages in thread
From: Bjorn Helgaas @ 2013-07-10 20:50 UTC (permalink / raw)
  To: hyphop
  Cc: linux-kernel, Artem S. Tashkinov, Phillip Susi, Patrik Jakobsson,
	Robert Hancock, Alan Stern, Linus Torvalds, linux-pci,
	Rafael J. Wysocki, hendrik.haddorp

[+cc previous cc list from lkml]

On Wed, Jul 10, 2013 at 11:25 AM, hyphop <email2tema@gmail.com> wrote:
> hello
> i have same problem. low write speed after system sleep
>
> kernel 3.9.9
>
> i can see it HDD SATA & USB disks to
>
> i try to make another test
>
> before sleep i make  file /tmp/test ( /tmp mounted as tmpfs size=8G, i have
> 16G memory in my system )
>
> dd if=/dev/zero bs=1M count=1000 of=/tmp/test
>  ~ 3,2 GB/s
> cryptsetup luksFormat /tmp/test
> cryptsetup luksOpen /tmp/test test
>
> dd if=/dev/zero bs=1M count=1000 of=/dev/mapper/test
>  ~ 465 GB/s good speed
>
> after sleep
>
> dd if=/dev/zero bs=1M count=1000 of=/dev/mapper/test
>  ~ 5MB/s ooops (((( very slow
>
> but if write directly in /tmp i can see
>
> dd if=/dev/zero bs=1M count=1000 of=/tmp/test2
>  ~ 3,2 GB/s
>
> I can see this is not hardware problem (NOT SATA OR USB) i think is kernel
> BUG, i i dont have this problem on previous kernel 3.4

Thanks for this report.  Artem collected some of his info here:
https://bugzilla.kernel.org/show_bug.cgi?id=53551 .  Hendrik Haddorp
also reported seeing this issue there.

Artem reported that Windows complains "The system firmware has changed
the processor's memory type range registers (MTRRs) across a sleep
state transition (S4). This can result in reduced resume performance."
 If you have Windows on your system, does it complain the same way?

Can you collect and attach complete dmesg and "lspci -vvv" logs, from
both working and broken kernels, to the bugzilla?  Collect lspci logs
both before and after the sleep; I think Artem saw some differences
between those, and I'm not sure we completely ruled those out.
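
One way to gather these logs is sketched below.  It assumes dmesg and
lspci (from pciutils) are available; the function name collect_logs, the
directory and the before/after labels are only examples.

```shell
#!/bin/sh
# collect_logs DIR LABEL
# Save dmesg and "lspci -vvv" output as DIR/dmesg.LABEL and
# DIR/lspci.LABEL, creating DIR if needed.
collect_logs() {
    dir=$1
    label=$2
    mkdir -p "$dir" || return 1
    dmesg > "$dir/dmesg.$label"
    lspci -vvv > "$dir/lspci.$label"
}

# Usage (lspci -vvv shows the most detail when run as root):
#   collect_logs /tmp/pci-logs before
#   ... suspend and resume ...
#   collect_logs /tmp/pci-logs after
#   diff -u /tmp/pci-logs/lspci.before /tmp/pci-logs/lspci.after
```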

If anybody can reproduce this reliably enough to bisect it, that would
be a huge help.

Bjorn


Thread overview: 33+ messages
     [not found] <587312497.6453.1360650312498.JavaMail.mail@webmail01>
2013-02-12 17:29 ` Abysmal HDD/USB write speed after sleep on a UEFI system Linus Torvalds
2013-02-12 18:29   ` Artem S. Tashkinov
2013-02-12 19:32     ` Linus Torvalds
2013-02-12 20:13       ` Artem S. Tashkinov
2013-02-13  4:26         ` Bjorn Helgaas
2013-02-19 16:22           ` Alan Stern
2013-02-25 21:57             ` Bjorn Helgaas
2013-02-26  6:35               ` Artem S. Tashkinov
2013-02-26 18:46                 ` Bjorn Helgaas
2013-02-26 19:14                   ` Artem S. Tashkinov
2013-03-07  0:17                     ` Bjorn Helgaas
2013-04-26 21:36                       ` Bjorn Helgaas
2013-04-27 10:10                         ` Artem S. Tashkinov
2013-04-30  4:47                           ` Bjorn Helgaas
2013-05-01  4:19                             ` Robert Hancock
2013-05-07 15:25                               ` Bjorn Helgaas
2013-05-07 15:59                                 ` Artem S. Tashkinov
2013-05-07 16:27                                   ` Bjorn Helgaas
2013-05-07 18:50                                     ` Artem S. Tashkinov
2013-05-07 18:54                                       ` Bjorn Helgaas
2013-05-07 19:05                                   ` Robert Hancock
2013-05-07 20:20                                     ` Bjorn Helgaas
2013-05-07 21:48                                       ` Patrik Jakobsson
2013-05-07 22:02                                         ` Bjorn Helgaas
2013-05-07 22:25                                           ` Patrik Jakobsson
2013-05-08  8:37                                             ` Artem S. Tashkinov
2013-05-08  8:54                                               ` Patrik Jakobsson
2013-05-08 13:43                                               ` Phillip Susi
2013-05-08  8:31                                           ` Artem S. Tashkinov
2013-05-07 16:12                                 ` Phillip Susi
2013-07-10 17:25   ` hyphop
2013-07-10 20:50     ` Bjorn Helgaas
2013-02-10 10:43 Artem S. Tashkinov
