* AMD FX CPU bug, not fixed by latest microcode?
@ 2012-06-10 19:24 Boszormenyi Zoltan
  2012-06-11  7:52 ` Clemens Ladisch
  2012-06-11  8:43 ` Borislav Petkov
  0 siblings, 2 replies; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-10 19:24 UTC (permalink / raw)
  To: linux-kernel

Hi,

I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
memtest86+ shows no problems.

Still, I get occasional crashes and signal 11 during kernel compilation even
with single-job make. Sometimes the compiler bails out with a strange
error message, like "stray \NNN character in the source". When re-running
make, the error doesn't happen in the same file, and the source file doesn't
contain the offending character when inspected with
an editor or hexdump.

Now, a few minutes ago I was able to catch this bug when I copied the
kernel GIT tree to apply a patch manually and did "git commit -a".
Strangely, the commit contained one extra file that I didn't touch.
git diff showed this for the extra file:

==============================
--- a/drivers/usb/gadget/fsl_usb2_udc.h
+++ b/drivers/usb/gadget/fsl_usb2_udc.h
@@ -427,7 +427,7 @@ struct ep_td_struct {
  #define  DTD_ADDR_MASK                        0xFFFFFFE0
  #define  DTD_PACKET_SIZE                      0x7FFF0000
  #define  DTD_LENGTH_BIT_POS                   16
-#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | \
+#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | ^Z
                                                 DTD_STATUS_DATA_BUFF_ERR | \
                                                 DTD_STATUS_TRANSACTION_ERR)
  /* Alignment requirements; must be a power of two */
==============================

The "^Z" is a 0-character in the file and is not present in the
original source tree, only in the copy.
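A quick way to see exactly which bytes differ between the original file and the copy is a byte-by-byte comparison. The following is only a minimal sketch; the function name `differing_bytes` is made up for illustration, and it assumes both files fit comfortably in memory.

```python
def differing_bytes(path_a, path_b):
    """Return (offset, byte_in_a, byte_in_b) for every position where the files differ.

    Only the common prefix length is compared; a size difference would have
    to be checked separately.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

Printing each tuple as hex (for example with `f"{offset:#x}: {x:#04x} -> {y:#04x}"`) gives the offset and both byte values in one line per difference.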

Similar errors happened while copying large files on the same
machine, but it seems that reading a large enough total amount
of data is enough to trigger them.

The mainboard has the latest (UEFI) firmware flashed which
contains the latest AMD microcode, so microcode_ctl doesn't
need to apply it anymore. Previously, I used amd-ucode-2012-01-17.tar
from www.amd64.org/support/microcode.html which is now
part of microcode_ctl in Fedora.

Since the error happens during compiling a source file and not only
copying, the bug seems to happen during *reading* data.

Does anyone know whether it's a known problem in AMD FX CPUs?
Does AMD have a newer microcode to fix this bug, or should I apply
for warranty?

Thanks in advance,
Zoltán Böszörményi



* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-10 19:24 AMD FX CPU bug, not fixed by latest microcode? Boszormenyi Zoltan
@ 2012-06-11  7:52 ` Clemens Ladisch
  2012-06-11  8:13   ` Boszormenyi Zoltan
  2012-06-11  8:43 ` Borislav Petkov
  1 sibling, 1 reply; 14+ messages in thread
From: Clemens Ladisch @ 2012-06-11  7:52 UTC (permalink / raw)
  To: Boszormenyi Zoltan; +Cc: linux-kernel

Boszormenyi Zoltan wrote:
> I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
> with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
>
> I get occasional crashes and signal 11 during kernel compilation even
> with single-job make. Sometimes the compiler jumps out with a strange
> error message, like "stray \NNN character in the source". When re-running
> make, the error doesn't happen in the same file and the source file doesn't
> contain the character being complained about when inspecting with
> an editor or hexdump.
>
> Now, a few minutes ago I was able to catch this bug when I copied the
> kernel GIT tree to apply a patch manually and did "git commit -a".
> Strangely, the commit contained one extra file that I didn't touch.
> git diff showed this for the extra file:
>
> ==============================
> --- a/drivers/usb/gadget/fsl_usb2_udc.h
> +++ b/drivers/usb/gadget/fsl_usb2_udc.h
> @@ -427,7 +427,7 @@ struct ep_td_struct {
>  #define  DTD_ADDR_MASK                        0xFFFFFFE0
>  #define  DTD_PACKET_SIZE                      0x7FFF0000
>  #define  DTD_LENGTH_BIT_POS                   16
> -#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | \
> +#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | ^Z
>                                                 DTD_STATUS_DATA_BUFF_ERR | \
>                                                 DTD_STATUS_TRANSACTION_ERR)
>  /* Alignment requirements; must be a power of two */
> ==============================
>
> The "^Z" is a 0-character in the file and is not present in the
> original source tree, only in the copy.

Is it always a zero, or are there other invalid characters?
(The (number of) changed bits might tell something.)

> Similar errors happened during copying large files on the same
> machine but it seems it's enough to trigger if the total amount
> of data read is large enough.

Does "large enough" mean "large enough so that they are not in the file
cache"?

All caches and your memory are ECC protected, so I think it is unlikely
that the problem is with these.  If I had to guess, I'd point to your
disk (firmware) or the SATA controller.  (A bad or loose SATA cable
would throw CRC errors into the kernel log.  Are there any?)

What is the exact offset of the changed byte in the file?  (It might be
at a cacheline, sector, or page boundary.)

> Does anyone know whether it's a known problem in AMD FX CPUs?

http://support.amd.com/us/Processor_TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf


Regards,
Clemens


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11  7:52 ` Clemens Ladisch
@ 2012-06-11  8:13   ` Boszormenyi Zoltan
  2012-06-11 10:21     ` Clemens Ladisch
  0 siblings, 1 reply; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-11  8:13 UTC (permalink / raw)
  To: Clemens Ladisch; +Cc: linux-kernel

On 2012-06-11 09:52, Clemens Ladisch wrote:
> Boszormenyi Zoltan wrote:
>> I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
>> with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
>>
>> I get occasional crashes and signal 11 during kernel compilation even
>> with single-job make. Sometimes the compiler jumps out with a strange
>> error message, like "stray \NNN character in the source". When re-running
>> make, the error doesn't happen in the same file and the source file doesn't
>> contain the character being complained about when inspecting with
>> an editor or hexdump.
>>
>> Now, a few minutes ago I was able to catch this bug when I copied the
>> kernel GIT tree to apply a patch manually and did "git commit -a".
>> Strangely, the commit contained one extra file that I didn't touch.
>> git diff showed this for the extra file:
>>
>> ==============================
>> --- a/drivers/usb/gadget/fsl_usb2_udc.h
>> +++ b/drivers/usb/gadget/fsl_usb2_udc.h
>> @@ -427,7 +427,7 @@ struct ep_td_struct {
>>   #define  DTD_ADDR_MASK                        0xFFFFFFE0
>>   #define  DTD_PACKET_SIZE                      0x7FFF0000
>>   #define  DTD_LENGTH_BIT_POS                   16
>> -#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | \
>> +#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | ^Z
>>                                                  DTD_STATUS_DATA_BUFF_ERR | \
>>                                                  DTD_STATUS_TRANSACTION_ERR)
>>   /* Alignment requirements; must be a power of two */
>> ==============================
>>
>> The "^Z" is a 0-character in the file and is not present in the
>> original source tree, only in the copy.

Actually, the "^Z" there is 0x1a. It should be 0x5c, the backslash character.
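To answer the question about the changed bits for this one byte, the two values can be XORed and the set bits counted. This is just a small sketch; the helper name `changed_bits` is made up for illustration.

```python
def changed_bits(expected, actual):
    """Return the XOR of two byte values and the number of bits that differ."""
    diff = expected ^ actual
    return diff, bin(diff).count("1")


# 0x5c is the backslash that should be there, 0x1a is the ^Z found in the copy.
diff, count = changed_bits(0x5C, 0x1A)
print(f"xor=0x{diff:02x} bits_flipped={count}")  # prints "xor=0x46 bits_flipped=3"
```

So three bits differ in this byte, which means it is not a single-bit flip.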

> Is it always a zero, or other invalids characters?
> (The (number of) changed bits might tell something.)

IIRC, GCC has a different error for a 0-character and "stray \NNN character"
(that's not inside a string literal) and both happened at some time.
Sorry, I didn't bother to make a note of the error messages.

>
>> Similar errors happened during copying large files on the same
>> machine but it seems it's enough to trigger if the total amount
>> of data read is large enough.
> Does "large enough" mean "large enough so that they are not in the file
> cache"?
>
> All caches and your memory are ECC protected,

Unfortunately, the memory is not ECC memory. "Large enough" means the data
is usually not in the file system cache.

>   so I think it is unlikely
> that the problem is with these.  If I had to guess, I'd point to your
> disk (firmware) or the SATA controller.  (A bad or loose SATA cable
> would throw CRC errors into the kernel log.  Are there any?)

The disks (8 of them) are attached to 3ware 9650SE-8LPML in RAID10.
tw_cli reports no problems.

> What is the exact offset of the changed byte in the file?  (It might be
> at a cacheline, sector, or page boundary.)

The bad character is at offset 0x4b74.

>> Does anyone know whether it's a known problem in AMD FX CPUs?
> http://support.amd.com/us/Processor_TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf

Thanks, but I have seen this file already. The "no fix planned" for every
erratum is saddening...

>
>
> Regards,
> Clemens
>



* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-10 19:24 AMD FX CPU bug, not fixed by latest microcode? Boszormenyi Zoltan
  2012-06-11  7:52 ` Clemens Ladisch
@ 2012-06-11  8:43 ` Borislav Petkov
  2012-06-11  9:49   ` Borislav Petkov
  1 sibling, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2012-06-11  8:43 UTC (permalink / raw)
  To: Boszormenyi Zoltan; +Cc: linux-kernel, Andreas Herrmann

(leaving in the full text)

On Sun, Jun 10, 2012 at 09:24:13PM +0200, Boszormenyi Zoltan wrote:
> Hi,
> 
> I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
> with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
> memtest86+ show no problems.

Did you have the same issue with Fedora 16? Also, could you test with
another distro whether the same thing happens?

> Still, I get occasional crashes and signal 11 during kernel
> compilation even with single-job make. Sometimes the compiler jumps
> out with a strange error message, like "stray \NNN character in the
> source".

Is that the same ^Z character as below, i.e. the character with ASCII
code 26 (octal \032)? Or do you get different characters each time?

> When re-running
> make, the error doesn't happen in the same file and the source file doesn't
> contain the character being complained about when inspecting with
> an editor or hexdump.
> 
> Now, a few minutes ago I was able to catch this bug when I copied the
> kernel GIT tree to apply a patch manually and did "git commit -a".
> Strangely, the commit contained one extra file that I didn't touch.
> git diff showed this for the extra file:
> 
> ==============================
> --- a/drivers/usb/gadget/fsl_usb2_udc.h
> +++ b/drivers/usb/gadget/fsl_usb2_udc.h
> @@ -427,7 +427,7 @@ struct ep_td_struct {
>  #define  DTD_ADDR_MASK                        0xFFFFFFE0
>  #define  DTD_PACKET_SIZE                      0x7FFF0000
>  #define  DTD_LENGTH_BIT_POS                   16
> -#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | \
> +#define  DTD_ERROR_MASK                       (DTD_STATUS_HALTED | ^Z
>                                                 DTD_STATUS_DATA_BUFF_ERR | \
>                                                 DTD_STATUS_TRANSACTION_ERR)
>  /* Alignment requirements; must be a power of two */
> ==============================
> 
> The "^Z" is a 0-character in the file and is not present in the
> original source tree, only in the copy.
> 
> Similar errors happened during copying large files on the same
> machine but it seems it's enough to trigger if the total amount
> of data read is large enough.
> 
> The mainboard has the latest (UEFI) firmware flashed which
> contains the latest AMD microcode, so microcode_ctl doesn't
> need to apply it anymore. Previously, I used amd-ucode-2012-01-17.tar
> from www.amd64.org/support/microcode.html which is now
> part of microcode_ctl in Fedora.

Can you send /proc/cpuinfo?

Also, a dmesg from a recent kernel?

> Since the error happens during compiling a source file and not only
> copying, the bug seems to happens during *reading* data.
> 
> Does anyone know whether it's a known problem in AMD FX CPUs?
> Does AMD have a newer microcode to fix this bug, or should I apply
> for warranty?

Thanks.

-- 
Regards/Gruss,
    Boris.


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11  8:43 ` Borislav Petkov
@ 2012-06-11  9:49   ` Borislav Petkov
  2012-06-11 11:05     ` Johannes Stezenbach
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2012-06-11  9:49 UTC (permalink / raw)
  To: Boszormenyi Zoltan, linux-kernel, Andreas Herrmann

On Mon, Jun 11, 2012 at 10:43:18AM +0200, Borislav Petkov wrote:
> On Sun, Jun 10, 2012 at 09:24:13PM +0200, Boszormenyi Zoltan wrote:
> > I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
> > with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
> > memtest86+ show no problems.

One other thing: if there's an option in the BIOS to disable the IOMMU,
can you do that and try reproducing the issue with IOMMU disabled?

Thanks.

-- 
Regards/Gruss,
    Boris.


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11  8:13   ` Boszormenyi Zoltan
@ 2012-06-11 10:21     ` Clemens Ladisch
  2012-06-11 10:57       ` Boszormenyi Zoltan
  0 siblings, 1 reply; 14+ messages in thread
From: Clemens Ladisch @ 2012-06-11 10:21 UTC (permalink / raw)
  To: Boszormenyi Zoltan; +Cc: linux-kernel

Boszormenyi Zoltan wrote:
> On 2012-06-11 09:52, Clemens Ladisch wrote:
>>> Similar errors happened during copying large files on the same
>>> machine but it seems it's enough to trigger if the total amount
>>> of data read is large enough.
>>
>> Does "large enough" mean "large enough so that they are not in the file
>> cache"?
>>
> "Large enough" means it's usually not in file system cache

If you could see a change while it's in the cache, you could rule
out the disks.
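One way to look for a change while the data is cached is to hash the same file many times in a row and compare the digests. The sketch below makes two assumptions: the file is small enough relative to free RAM that repeated reads are served from the page cache, and the helper names `sha256_of` and `distinct_digests` are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in one buffer."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def distinct_digests(path, rounds=20):
    """Read the same file `rounds` times and return the set of digests seen."""
    return {sha256_of(path) for _ in range(rounds)}
```

If `distinct_digests` returns more than one digest for a file that nobody is writing to, the data changed between reads. If those reads really came from the cache, that points away from the disks.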

>> All caches and your memory are ECC protected,
>
> Unfortunately the memory is not with ECC.

Sorry, I misread your mail.

This means that you cannot rule out bad memory.

>> so I think it is unlikely
>> that the problem is with these.  If I had to guess, I'd point to your
>> disk (firmware) or the SATA controller.  (A bad or loose SATA cable
>> would throw CRC errors into the kernel log.  Are there any?)
>
> The disks (8 of them) are attached to 3ware 9650SE-8LPML in RAID10.
> tw_cli reports no problems.

Could you check whether the same happens with some disk connected to
the on-board SATA controller?  Or while copying around lots of data
inside a RAM disk?
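The RAM disk test could be automated along the following lines. This is a sketch only: `copy_and_verify` is a made-up helper, and `work_dir` is assumed to be a directory on a RAM-backed filesystem (for example a tmpfs mount, if the system has one) with enough free space for one copy.

```python
import filecmp
import os
import shutil


def copy_and_verify(src, work_dir, rounds=10):
    """Copy src into work_dir repeatedly and count copies that differ from src."""
    mismatches = 0
    for i in range(rounds):
        dst = os.path.join(work_dir, f"copy_{i}")
        shutil.copyfile(src, dst)
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting size and modification time.
        if not filecmp.cmp(src, dst, shallow=False):
            mismatches += 1
        os.remove(dst)
    return mismatches
```

A non-zero result with both the source and `work_dir` in RAM would mean the corruption happens without the RAID controller or the disks being involved.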

>> What is the exact offset of the changed byte in the file?  (It might be
>> at a cacheline, sector, or page boundary.)
>
> The bad character is at offset 0x4b74.

That's completely random (not on any such boundary), i.e., probably a
hardware error.
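The boundary check for this offset is simple arithmetic. The sizes below (64-byte cacheline, 512-byte sector, 4096-byte page) are typical values assumed for illustration, not something reported in this thread.

```python
OFFSET = 0x4B74  # offset of the bad byte, as reported above

# Assumed typical sizes; the real values depend on the hardware.
BOUNDARIES = {"cacheline": 64, "sector": 512, "page": 4096}


def boundary_remainders(offset, boundaries=BOUNDARIES):
    """Return offset modulo each boundary size; 0 would mean 'exactly on a boundary'."""
    return {name: offset % size for name, size in boundaries.items()}


print(boundary_remainders(OFFSET))
# prints {'cacheline': 52, 'sector': 372, 'page': 2932}
```

None of the remainders is 0 (first byte of a unit) or size minus 1 (last byte of a unit), so with these assumed sizes the offset does not sit at the edge of any of them.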

>> http://support.amd.com/us/Processor_TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf
>
> The "no fix planned" for every errata is saddening...

It's good news, because none of them actually matter.


Regards,
Clemens


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11 10:21     ` Clemens Ladisch
@ 2012-06-11 10:57       ` Boszormenyi Zoltan
  0 siblings, 0 replies; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-11 10:57 UTC (permalink / raw)
  To: Clemens Ladisch; +Cc: linux-kernel

On 2012-06-11 12:21, Clemens Ladisch wrote:
> All caches and your memory are ECC protected,
>> Unfortunately the memory is not with ECC.
> Sorry, I misread your mail.
>
> This means that you cannot rule out bad memory.

And there were two storms recently with lightning that struck nearby. :-(
I will retest with memtest86+.

Thanks to everyone who replied.

Best regards,
Zoltán Böszörményi



* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11  9:49   ` Borislav Petkov
@ 2012-06-11 11:05     ` Johannes Stezenbach
  2012-06-13  7:30       ` Boszormenyi Zoltan
  0 siblings, 1 reply; 14+ messages in thread
From: Johannes Stezenbach @ 2012-06-11 11:05 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Boszormenyi Zoltan, linux-kernel, Andreas Herrmann

On Mon, Jun 11, 2012 at 11:49:05AM +0200, Borislav Petkov wrote:
> On Mon, Jun 11, 2012 at 10:43:18AM +0200, Borislav Petkov wrote:
> > On Sun, Jun 10, 2012 at 09:24:13PM +0200, Boszormenyi Zoltan wrote:
> > > I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
> > > with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
> > > memtest86+ show no problems.
> 
> Ohe other thing: if there's an option in the BIOS to disable the IOMMU,
> can you do that and try reproducing the issue with IOMMU disabled?

Maybe not related, but some months ago I had bad memory in my Intel Core i5
based system. It caused rare crashes, which usually showed up as g++ ICEs
when compiling a mid-sized C++ project, while compiling a kernel with
make -j4 showed no problem.  memtest86+ didn't show the issue either,
so I tried memtest86-4.0a, which claims to find more errors
due to SMP support.  An overnight run left me with a screen
full of garbage and a crashed memtest86-4.0a.  I replaced
the RAM anyway and the box has been stable since then.

memtest86-4.0a is at
http://memtest86.com/

The page claims:
  With a single CPU it is not possible to drive multi-channel memory
  controllers at full speed making it impossible to detect some types of errors

Maybe someone knowledgeable could comment on whether this is true.


Johannes


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-11 11:05     ` Johannes Stezenbach
@ 2012-06-13  7:30       ` Boszormenyi Zoltan
  2012-06-13 15:57         ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-13  7:30 UTC (permalink / raw)
  To: Johannes Stezenbach; +Cc: Borislav Petkov, linux-kernel, Andreas Herrmann

On 2012-06-11 13:05, Johannes Stezenbach wrote:
> On Mon, Jun 11, 2012 at 11:49:05AM +0200, Borislav Petkov wrote:
>> On Mon, Jun 11, 2012 at 10:43:18AM +0200, Borislav Petkov wrote:
>>> On Sun, Jun 10, 2012 at 09:24:13PM +0200, Boszormenyi Zoltan wrote:
>>>> I have an AMD FX-8120 boxed CPU in an ASUS M5A99X-EVO mainboard
>>>> with 32GB DDR3/1600 memory, running Fedora 17, upgraded from 16.
>>>> memtest86+ show no problems.
>> Ohe other thing: if there's an option in the BIOS to disable the IOMMU,
>> can you do that and try reproducing the issue with IOMMU disabled?
> Maybe not related, but I had bad memory in my Intel Core-i5
> based system some months ago which resulted in rare crashes,
> usually manifested itself as g++ ICEs when compiling a
> mid-sized  C++ project -- compiling a kernel with make -p4 showed
> no problem.  Also memtest86+ didn't show the issue,
> so I tried memtest86-4.0a which claims to find more errors
> due to SMP support.  An overnight run left me with a screen
> full of garbage and a crashed memtest86-4.0.  I replaced
> the RAM anyway and the box was stable since then.
>
> memtest86-4.0a is at
> http://memtest86.com/
>
> The page claims:
>    With a single CPU it is not possible to drive multi-channel memory
>    controllers at full speed making it impossible to detect some types of errors
>
> Maybe someone knowledgable could comment if this is true.

memtest86-4.0a locked up on my machine, but memtest86+ 4.20 detected
12 different addresses with faulty bits in the lower 16GB. I am applying
for warranty.

With only two modules installed, "make -j8" succeeded many times.

Thanks to everyone who tried to help.

Best regards,
Zoltán Böszörményi



* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-13  7:30       ` Boszormenyi Zoltan
@ 2012-06-13 15:57         ` Borislav Petkov
  2012-06-13 18:26           ` Boszormenyi Zoltan
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2012-06-13 15:57 UTC (permalink / raw)
  To: Boszormenyi Zoltan; +Cc: Johannes Stezenbach, linux-kernel, Andreas Herrmann

On Wed, Jun 13, 2012 at 09:30:03AM +0200, Boszormenyi Zoltan wrote:
> This one locked up on my machine but memtest86+ 4.20 detected 12
> different addresses with faulty bits in the lower 16GB. Applying for
> warranty.

If you have multiple DIMMs, you can also take out the ones which back the
lower 16GB and rerun memtest to confirm they really are the culprit.

-- 
Regards/Gruss,
    Boris.


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-13 15:57         ` Borislav Petkov
@ 2012-06-13 18:26           ` Boszormenyi Zoltan
  2012-06-13 22:06             ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-13 18:26 UTC (permalink / raw)
  To: Borislav Petkov, Johannes Stezenbach, linux-kernel, Andreas Herrmann

On 2012-06-13 17:57, Borislav Petkov wrote:
> On Wed, Jun 13, 2012 at 09:30:03AM +0200, Boszormenyi Zoltan wrote:
>> This one locked up on my machine but memtest86+ 4.20 detected 12
>> different addresses with faulty bits in the lower 16GB. Applying for
>> warranty.
> If you have multiple DIMMs, you can also take out the DIMM which
> contains the 16GB and rerun memtest to confirm it really is the culprit.
>

I did exactly that: the remaining two 8GB modules show no faults
according to memtest86+, and the machine is stable with "make -j8"
in the kernel tree. Neither Thunderbird nor Firefox has crashed for a day;
these are the usual victims when the bad memory addresses get hit.



* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-13 18:26           ` Boszormenyi Zoltan
@ 2012-06-13 22:06             ` Borislav Petkov
  2012-06-14  4:23               ` Boszormenyi Zoltan
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2012-06-13 22:06 UTC (permalink / raw)
  To: Boszormenyi Zoltan; +Cc: Johannes Stezenbach, linux-kernel, Andreas Herrmann

On Wed, Jun 13, 2012 at 08:26:07PM +0200, Boszormenyi Zoltan wrote:
> I did exactly that, the remaining two 8GB modules don't have faults
> according to memtest86+ and the machine is stable with "make -j8" in
> the kernel tree. Neither thunderbird nor firefox crashed for a day,
> these are the usual victims when hitting the bad memory address.

Cool.

I guess you could try to limit it even further by taking a known-good
8GB module and pairing it with one of the "bad" ones to see whether one
of the "bad" 8GB modules is faulty or both of them are.

Or you could stop wasting time and go buy two new ones :-)

-- 
Regards/Gruss,
    Boris.


* Re: AMD FX CPU bug, not fixed by latest microcode?
  2012-06-13 22:06             ` Borislav Petkov
@ 2012-06-14  4:23               ` Boszormenyi Zoltan
  0 siblings, 0 replies; 14+ messages in thread
From: Boszormenyi Zoltan @ 2012-06-14  4:23 UTC (permalink / raw)
  To: Borislav Petkov, Johannes Stezenbach, linux-kernel, Andreas Herrmann

On 2012-06-14 00:06, Borislav Petkov wrote:
> On Wed, Jun 13, 2012 at 08:26:07PM +0200, Boszormenyi Zoltan wrote:
>> I did exactly that, the remaining two 8GB modules don't have faults
>> according to memtest86+ and the machine is stable with "make -j8" in
>> the kernel tree. Neither thunderbird nor firefox crashed for a day,
>> these are the usual victims when hitting the bad memory address.
> Cool.
>
> I guess you could try to limit it even further by taking a known-good
> 8GB module and pairing it with one of the "bad" ones to see whether one
> of the "bad" 8GB modules is faulty or both of them are.
>
> Or you could stop wasting time and go buy two new ones :-)

Yesterday I took the bad pair back to the shop with a screenshot
showing the memtest86+ results, and they accepted it for warranty.
I will get a new pair of RAM modules for no extra fee.



* Re: AMD FX CPU bug, not fixed by latest microcode?
@ 2012-06-11  3:45 Rus
  0 siblings, 0 replies; 14+ messages in thread
From: Rus @ 2012-06-11  3:45 UTC (permalink / raw)
  To: zboszor; +Cc: linux-kernel

:Does anyone know whether it's a known problem in AMD FX CPUs?
:Does AMD have a newer microcode to fix this bug, or should I apply
:for warranty?

This is not related to AMD microcode. The main problem is that ASUS
does not test their motherboards against the supported CPU list. The next
problem is the m/b BIOS, and the last problem is that ASUS ignores Linux
users and does not officially support Linux on their m/b. I'm having the same
problems with an FX-8150 and an M5A97-PRO m/b. This m/b does not work by
default with FX-* CPUs because the BIOS selects the wrong power mode for
the CPU. To make it work somehow you need to:

1. Flash the latest BIOS
2. Disable turbo core in BIOS setup
3. Enable extreme EPU mode in BIOS setup
4. Disable EPU Power Saving mode in BIOS setup
5. Use a kernel >= 3.5 (rc1 or rc2 for now)
6. Play with IOMMU mode in BIOS setup (try enable, try disable)

P.S. I'm not subscribed, so please Cc me.

Rus

-- 
SfinxSoft
http://sfinxsoft.com

