linux-kernel.vger.kernel.org archive mirror
* Bricked x86 CPU with software?
@ 2018-01-04  0:47 Tim Mouraveiko
  2018-01-04 20:06 ` Pavel Machek
  2018-01-06 15:50 ` Nikolay Borisov
  0 siblings, 2 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-04  0:47 UTC (permalink / raw)
  To: linux-kernel

Hi,

In all my years of extensive experience writing drivers and kernels, I never came across a situation 
where you could brick an x86 CPU. Not until recently, when I was working on debugging a piece of 
code and I bricked an Intel CPU. I am not talking about an experimental motherboard or anything 
exotic or an electrical issue where the CPU got fried, but before the software code execution the CPU 
was fine and then it's dead. There were signs that something was not right, that the code was causing
unusual behavior, which is what I was debugging.

Has anyone else ever experienced a bricked CPU after executing software code? I just wanted to get 
input from the community to see if anyone had had any experience like that, since it seems rather 
unusual to me.

Tim

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Bricked x86 CPU with software?
  2018-01-04  0:47 Bricked x86 CPU with software? Tim Mouraveiko
@ 2018-01-04 20:06 ` Pavel Machek
  2018-01-04 21:00   ` Tim Mouraveiko
  2018-01-06 15:50 ` Nikolay Borisov
  1 sibling, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-04 20:06 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel


Hi!

> In all my years of extensive experience writing drivers and kernels, I never came across a situation 
> where you could brick an x86 CPU. Not until recently, when I was working on debugging a piece of 
> code and I bricked an Intel CPU. I am not talking about an experimental motherboard or anything 
> exotic or an electrical issue where the CPU got fried, but before the software code execution the CPU 
> was fine and then it's dead. There were signs that something was not right, that the code was causing
> unusual behavior, which is what I was debugging.
> 
> Has anyone else ever experienced a bricked CPU after executing software code? I just wanted to get 
> input from the community to see if anyone had had any experience like that, since it seems rather 
> unusual to me.

Never seen that before. Can you try to brick another one? :-).

You may want to remove AC power and battery, wait for half an hour,
then attempt to boot it...

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Bricked x86 CPU with software?
  2018-01-04 20:06 ` Pavel Machek
@ 2018-01-04 21:00   ` Tim Mouraveiko
  2018-01-04 21:04     ` Andy Shevchenko
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-04 21:00 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

Pavel,

As I mentioned before, I repeatedly and fully power-cycled the motherboard, reset the BIOS,
etc. It made no difference. I could see that the processor was not drawing any power. The
software code behaved in a similar fashion on other processors, until I fixed it so that it would 
not kill any more processors.

In case you are curious, there was no overheating, no 100% utilization, no tampering with
hardware (GPIO pins or anything of that sort), no overclocking, etc. No hardware issues
or changes at all.

Tim

> Hi!
> 
> > In all my years of extensive experience writing drivers and kernels, I never came across a situation 
> > where you could brick an x86 CPU. Not until recently, when I was working on debugging a piece of 
> > code and I bricked an Intel CPU. I am not talking about an experimental motherboard or anything 
> > exotic or an electrical issue where the CPU got fried, but before the software code execution the CPU 
> > was fine and then it's dead. There were signs that something was not right, that the code was causing
> > unusual behavior, which is what I was debugging.
> > 
> > Has anyone else ever experienced a bricked CPU after executing software code? I just wanted to get 
> > input from the community to see if anyone had had any experience like that, since it seems rather 
> > unusual to me.
> 
> Never seen that before. Can you try to brick another one? :-).
> 
> You may want to remove AC power and battery, wait for half an hour,
> then attempt to boot it...
> 
> 									Pavel
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
> 


* Re: Bricked x86 CPU with software?
  2018-01-04 21:00   ` Tim Mouraveiko
@ 2018-01-04 21:04     ` Andy Shevchenko
  2018-01-04 21:31       ` Tim Mouraveiko
  2018-01-04 21:23     ` Pavel Machek
  2018-01-05  1:51     ` james harvey
  2 siblings, 1 reply; 22+ messages in thread
From: Andy Shevchenko @ 2018-01-04 21:04 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: Pavel Machek, Linux Kernel Mailing List

On Thu, Jan 4, 2018 at 11:00 PM, Tim Mouraveiko <tim.ml@ipcopper.com> wrote:
> Pavel,
>
> As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> and etc. It made no difference. I can see that the processor was not drawing any power. The
> software code behaved in a similar fashion on other processors, until I fixed it so that it would
> not kill any more processors.
>
> In case you are curious there was no overheating, no 100% utilization, no tampering with
> hardware (GPIO pins or anything of that sort), no overclocking and etc. No hardware issues
> or changes at all.

Please, do not top post.

Just to be sure, have you checked the same CPU on a different motherboard?
It might be that the voltage regulators on it just died.





-- 
With Best Regards,
Andy Shevchenko


* Re: Bricked x86 CPU with software?
  2018-01-04 21:00   ` Tim Mouraveiko
  2018-01-04 21:04     ` Andy Shevchenko
@ 2018-01-04 21:23     ` Pavel Machek
  2018-01-04 22:13       ` Tim Mouraveiko
  2018-01-05  1:51     ` james harvey
  2 siblings, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-04 21:23 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel


Hi!

> As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> and etc. It made no difference. I can see that the processor was not drawing any power. The
> software code behaved in a similar fashion on other processors, until I fixed it so that it would
> not kill any more processors.
> 

So you have code that killed more than one processor? Save it! We want
a copy.

Do you have model numbers of affected CPUs?
									Pavel
									

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Bricked x86 CPU with software?
  2018-01-04 21:04     ` Andy Shevchenko
@ 2018-01-04 21:31       ` Tim Mouraveiko
  0 siblings, 0 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-04 21:31 UTC (permalink / raw)
  To: Andy Shevchenko; +Cc: Pavel Machek, Linux Kernel Mailing List

> On Thu, Jan 4, 2018 at 11:00 PM, Tim Mouraveiko <tim.ml@ipcopper.com> wrote:
> > Pavel,
> >
> > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > not kill any more processors.
> >
> > In case you are curious there was no overheating, no 100% utilization, no tampering with
> > hardware (GPIO pins or anything of that sort), no overclocking and etc. No hardware issues
> > or changes at all.
> 
> Please, do not top post.
> 
> Just to be sure, have you checked the same CPU on a different motherboard?
> It might be that the voltage regulators on it just died.

I did not check the same CPU on a different motherboard, but I did test the code on both the 
same type of CPU and a different type of CPU.



* Re: Bricked x86 CPU with software?
  2018-01-04 21:23     ` Pavel Machek
@ 2018-01-04 22:13       ` Tim Mouraveiko
  2018-01-04 22:40         ` Pavel Machek
  0 siblings, 1 reply; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-04 22:13 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

> > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > not kill any more processors.
> > 
> 
> So you have code that killed more than one processor? Save it! We want
> a copy.
> 
> Do you have model numbers of affected CPUs?


Why would you want a copy? Last time I checked bricked CPUs do not work well, even as 
decorations.

I believe the processors were Intel Xeon series. The code would likely run on others too.


* Re: Bricked x86 CPU with software?
  2018-01-04 22:13       ` Tim Mouraveiko
@ 2018-01-04 22:40         ` Pavel Machek
  2018-01-05  1:21           ` Tim Mouraveiko
  0 siblings, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-04 22:40 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel


On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> > > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > > not kill any more processors.
> > > 
> > 
> > So you have code that killed more than one processor? Save it! We want
> > a copy.
> > 
> > Do you have model numbers of affected CPUs?
> 
> 
> Why would you want a copy? Last time I checked bricked CPUs do not work well, even as 
> decorations.
> 
> I believe the processors were Intel Xeon series. The code would likely run on others too.

Well... Intel's shares are overpriced, and you have code to fix that
:-).

Actually... I don't think your code works. That's why I'm curious. But
if it works, it's rather big news... and I'm sure Intel and cloud
providers are going to be interested.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Bricked x86 CPU with software?
  2018-01-04 22:40         ` Pavel Machek
@ 2018-01-05  1:21           ` Tim Mouraveiko
  2018-01-05  1:29             ` Hector Martin 'marcan'
  2018-01-05  9:28             ` Pavel Machek
  0 siblings, 2 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-05  1:21 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> > > > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > > > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > > > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > > > not kill any more processors.
> > > > 
> > > 
> > > So you have code that killed more than one processor? Save it! We want
> > > a copy.
> > > 
> > > Do you have model numbers of affected CPUs?
> > 
> > 
> > Why would you want a copy? Last time I checked bricked CPUs do not work well, even as 
> > decorations.
> > 
> > I believe the processors were Intel Xeon series. The code would likely run on others too.
> 
> Well... Intel's shares are overpriced, and you have code to fix that
> :-).
> 
> Actually... I don't think your code works. That's why I'm curious. But
> if it works, it's rather big news... and I'm sure Intel and cloud
> providers are going to be interested.
> 

I first discovered this issue over a year ago, quite by accident. I changed the code I was 
working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
of it. They didn't care much, one of their personnel suggesting that they already knew about it
(whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
again. It could be a buggy implementation of a certain x86 functionality, but I left it at that 
because I had better things to do with my time.

Now this news came up about Meltdown and Spectre and I was curious if anyone else had
experienced a CPU killed by software, too. Meltdown and Spectre are undeniably a problem,
but their magnitude and practicality are questionable.

I suspect that what I discovered is either a kill switch, an unintentional flaw that was 
implemented at the time the original feature was built into x86 functionality and kept 
propagating through successive generations of processors, or could well be that I have a 
very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the 
question out there, to see if anyone else had a similar experience. Putting the solar flare idea 
aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
this time by my observations of the CPU behavior.


* Re: Bricked x86 CPU with software?
  2018-01-05  1:21           ` Tim Mouraveiko
@ 2018-01-05  1:29             ` Hector Martin 'marcan'
  2018-01-05 18:54               ` Tim Mouraveiko
  2018-01-05  9:28             ` Pavel Machek
  1 sibling, 1 reply; 22+ messages in thread
From: Hector Martin 'marcan' @ 2018-01-05  1:29 UTC (permalink / raw)
  To: Tim Mouraveiko, Pavel Machek; +Cc: linux-kernel

On 2018-01-05 10:21, Tim Mouraveiko wrote:
>> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
>> Actually... I don't think your code works. That's why I'm curious. But
>> if it works, it's rather big news... and I'm sure Intel and cloud
>> providers are going to be interested.
>>
> 
> I first discovered this issue over a year ago, quite by accident. I changed the code I was 
> working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
> of it. They didn't care much, one of their personnel suggesting that they already knew about it
> (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> again. It could be a buggy implementation of a certain x86 functionality, but I left it at that 
> because I had better things to do with my time.
> 
> Now this news came up about Meltdown and Spectre and I was curious if anyone else had
> experienced a CPU killed by software, too. Meltdown and Spectre are undeniably a problem,
> but their magnitude and practicality are questionable.
> 
> I suspect that what I discovered is either a kill switch, an unintentional flaw that was 
> implemented at the time the original feature was built into x86 functionality and kept 
> propagating through successive generations of processors, or could well be that I have a 
> very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the 
> question out there, to see if anyone else had a similar experience. Putting the solar flare idea 
> aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
> this time by my observations of the CPU behavior.
> 

If you made Intel aware of the issue a year ago, and they weren't
interested, then the responsible thing to do is disclose the problem
publicly. This is a security issue (if trusted code can brick a CPU,
it's an issue for bare metal hosting providers; if untrusted code can
brick a CPU, it's a *huge* issue for every cloud provider and many, many
others who run code in various sandboxes). If the vendor is not
receptive to coordinated disclosure, the only option is public
disclosure to at least make people aware of the problem and allow for
mitigations to be developed, if possible.

Personally, I would be very interested in seeing such code. We've seen
several ways to brick nonvolatile firmware (writable BIOSes, bad CMOS
data, etc.), but bricking a CPU is a first. The only way that can happen
is either blowing a kill fuse, or causing actual hardware damage, since
CPUs have no nonvolatile memory other than fuses. Either way this would
be a very interesting result.

-- 
Hector Martin "marcan" (marcan@marcan.st)
Public Key: https://mrcn.st/pub


* Re: Bricked x86 CPU with software?
  2018-01-04 21:00   ` Tim Mouraveiko
  2018-01-04 21:04     ` Andy Shevchenko
  2018-01-04 21:23     ` Pavel Machek
@ 2018-01-05  1:51     ` james harvey
  2018-01-06  1:00       ` Tim Mouraveiko
  2 siblings, 1 reply; 22+ messages in thread
From: james harvey @ 2018-01-05  1:51 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: Pavel Machek, kernel list

On Thu, Jan 4, 2018 at 4:00 PM, Tim Mouraveiko <tim.ml@ipcopper.com> wrote:
> Pavel,
>
> As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> and etc. It made no difference. I can see that the processor was not drawing any power. The
> software code behaved in a similar fashion on other processors, until I fixed it so that it would
> not kill any more processors.
>
> In case you are curious there was no overheating, no 100% utilization, no tampering with
> hardware (GPIO pins or anything of that sort), no overclocking and etc. No hardware issues
> or changes at all.
>
> Tim

To clarify, by "in a similar fashion on other processors", do you
actually mean you consistently bricked multiple CPUs using the same
code?  Or, was it just this one CPU that bricked, and it was just
acting buggy on other processors?

Unless you consistently bricked multiples, my bet is coincidence.  In
your original post, "There were signs that something was not right,
that the code was causing unusual behavior, which is what I was
debugging." makes me think it was a defective CPU but still
functional, and died as you were debugging/running the buggy code.


* Re: Bricked x86 CPU with software?
  2018-01-05  1:21           ` Tim Mouraveiko
  2018-01-05  1:29             ` Hector Martin 'marcan'
@ 2018-01-05  9:28             ` Pavel Machek
  2018-01-06  1:08               ` Tim Mouraveiko
  1 sibling, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-05  9:28 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel


Hi!

On Thu 2018-01-04 17:21:36, Tim Mouraveiko wrote:
> > On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> > > > > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > > > > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > > > > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > > > > not kill any more processors.
> > > > >
> > > >
> > > > So you have code that killed more than one processor? Save it! We want
> > > > a copy.
> > > >
> > > > Do you have model numbers of affected CPUs?
> > >
> > >
> > > Why would you want a copy? Last time I checked bricked CPUs do not work well, even as
> > > decorations.
> > >
> > > I believe the processors were Intel Xeon series. The code would likely run on others too.
> >
> > Well... Intel's shares are overpriced, and you have code to fix that
> > :-).
> >
> > Actually... I don't think your code works. That's why I'm curious. But
> > if it works, it's rather big news... and I'm sure Intel and cloud
> > providers are going to be interested.
> >
> 
> I first discovered this issue over a year ago, quite by accident. I changed the code I was
> > working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
> > of it. They didn't care much, one of their personnel suggesting that they already knew about it
> > (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> again. It could be a buggy implementation of a certain x86 functionality, but I left it at that
> because I had better things to do with my time.
>

Is the sequence available from ring 3, or does it need ring 0?

Can we get the code? Extraordinary claims and all that...

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: Bricked x86 CPU with software?
  2018-01-05  1:29             ` Hector Martin 'marcan'
@ 2018-01-05 18:54               ` Tim Mouraveiko
  0 siblings, 0 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-05 18:54 UTC (permalink / raw)
  To: Hector Martin 'marcan'; +Cc: linux-kernel

> On 2018-01-05 10:21, Tim Mouraveiko wrote:
> >> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> >> Actually... I don't think your code works. That's why I'm curious. But
> >> if it works, it's rather big news... and I'm sure Intel and cloud
> >> providers are going to be interested.
> >>
> > 
> > I first discovered this issue over a year ago, quite by accident. I changed the code I was 
> > working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
> > of it. They didn't care much, one of their personnel suggesting that they already knew about it
> > (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> > again. It could be a buggy implementation of a certain x86 functionality, but I left it at that 
> > because I had better things to do with my time.
> > 
> > Now this news came up about Meltdown and Spectre and I was curious if anyone else had
> > experienced a CPU killed by software, too. Meltdown and Spectre are undeniably a problem,
> > but their magnitude and practicality are questionable.
> > 
> > I suspect that what I discovered is either a kill switch, an unintentional flaw that was 
> > implemented at the time the original feature was built into x86 functionality and kept 
> > propagating through successive generations of processors, or could well be that I have a 
> > very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the 
> > question out there, to see if anyone else had a similar experience. Putting the solar flare idea 
> > aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
> > this time by my observations of the CPU behavior.
> > 
> 
> If you made Intel aware of the issue a year ago, and they weren't
> interested, then the responsible thing to do is disclose the problem
> publicly. This is a security issue (if trusted code can brick a CPU,
> it's an issue for bare metal hosting providers; if untrusted code can
> brick a CPU, it's a *huge* issue for every cloud provider and many, many
> others who run code in various sandboxes). If the vendor is not
> receptive to coordinated disclosure, the only option is public
> disclosure to at least make people aware of the problem and allow for
> mitigations to be developed, if possible.
> 
> Personally, I would be very interested in seeing such code. We've seen
> several ways to brick nonvolatile firmware (writable BIOSes, bad CMOS
> data, etc.), but bricking a CPU is a first. The only way that can happen
> is either blowing a kill fuse, or causing actual hardware damage, since
> CPUs have no nonvolatile memory other than fuses. Either way this would
> be a very interesting result.

We discovered the issue but chose not to distill the code into a standalone CPU-killing app. 
Once we realized that the CPU had been killed by the software and that the code caused 
other CPUs to behave the same way and once Intel said what they said, I made my pitch to 
pursue it, but the decision was made not to. I wasn't to test the existing code beyond
removing the offending part of it. Granted, I snuck a few tests in while removing it and a few 
times, for a few seconds, I held my breath. A few months later I had to fix it again.

Among the considerations was the question of what the possible purpose of designing such 
an application would be. Is this a kill switch or an unintentional flaw? Particularly in light of Intel's
position. The consequence of a successful execution on a compatible CPU is a loss of
physical property. Intel must have had good reasons to take the position that they did.

This issue would be a non-starter prior to the Pentium FDIV story. Since then Atmel 
popularized storable fuses, and things have gone on from there.

I did consider and investigate the electrical issue as a possible cause. I ruled it out before I 
tested other CPUs and different motherboards. Our OS is not a derivative of Linux/FreeBSD,
neither in concept nor design. In the relevant parts, all mainstream operating systems are the
same design carried over from a long time ago, and I dare say most if not all
non-mainstream ones copied over the relevant part as well (maybe not exactly), as there was/is no
good reason not to. In our case we did not have certain features in the OS as there was no 
good reason to have them, until I needed a way to catch a bug. In the end I did find the bug, 
albeit without using the feature.


* Re: Bricked x86 CPU with software?
  2018-01-05  1:51     ` james harvey
@ 2018-01-06  1:00       ` Tim Mouraveiko
  0 siblings, 0 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-06  1:00 UTC (permalink / raw)
  To: james harvey; +Cc: Pavel Machek, kernel list

> On Thu, Jan 4, 2018 at 4:00 PM, Tim Mouraveiko <tim.ml@ipcopper.com> wrote:
> > Pavel,
> >
> > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > not kill any more processors.
> >
> > In case you are curious there was no overheating, no 100% utilization, no tampering with
> > hardware (GPIO pins or anything of that sort), no overclocking and etc. No hardware issues
> > or changes at all.
> >
> > Tim
> 
> To clarify, by "in a similar fashion on other processors", do you
> actually mean you consistently bricked multiple CPUs using the same
> code?  Or, was it just this one CPU that bricked, and it was just
> acting buggy on other processors?
> 
> Unless you consistently bricked multiples, my bet is coincidence.  In
> your original post, "There were signs that something was not right,
> that the code was causing unusual behavior, which is what I was
> debugging." makes me think it was a defective CPU but still
> functional, and died as you were debugging/running the buggy code.

We live and we die by coincidence.

The processor was functioning fine without the code. It showed no signs of any problems. I 
had run a prior version of the code, then ran it without any of that code and it was fine. As I 
launched the nth version of the code, I thought of something and made another change. As I 
turned around to install it, the screen was showing that it had just executed that nth version of 
the code and then didn't progress any further.

I was actually glad it froze because I was able to gather the results of the execution of the 
code, which I needed for fine-tuning. It was only after hitting the reset button several times 
that it occurred to me that there was something wrong because the screen remained static.

I had added the code in hopes of speeding up the catching of a bug (that I caught later 
without that code). The code made other processors behave the same way.

I did not mean that I consistently bricked processors - I removed the code entirely to avoid 
exactly that.


* Re: Bricked x86 CPU with software?
  2018-01-05  9:28             ` Pavel Machek
@ 2018-01-06  1:08               ` Tim Mouraveiko
  2018-01-06 10:19                 ` Pavel Machek
  0 siblings, 1 reply; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-06  1:08 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

> Hi!
> 
> On Thu 2018-01-04 17:21:36, Tim Mouraveiko wrote:
> > > On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> > > > > > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > > > > > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > > > > > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > > > > > not kill any more processors.
> > > > > >
> > > > >
> > > > > So you have code that killed more than one processor? Save it! We want
> > > > > a copy.
> > > > >
> > > > > Do you have model numbers of affected CPUs?
> > > >
> > > >
> > > > Why would you want a copy? Last time I checked bricked CPUs do not work well, even as
> > > > decorations.
> > > >
> > > > I believe the processors were Intel Xeon series. The code would likely run on others too.
> > >
> > > Well... Intel's shares are overpriced, and you have code to fix that
> > > :-).
> > >
> > > Actually... I don't think your code works. That's why I'm curious. But
> > > if it works, its rather a big news... and I'm sure Intel and cloud
> > > providers are going to be interested.
> > >
> > 
> > I first discovered this issue over a year ago, quite by accident. I changed the code I was
> > working on so as not to kill the CPU (as that is not what I was trying to). We made Intel aware
> > of it. They didn't care much, one of their personnel suggesting that they already knew about it
> > (whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
> > again. It could be a buggy implementation of a certain x86 functionality, but I left it at that
> > because I had better things to do with my time.
> >
> 
> Is the sequence available from ring 3, or does it need ring 0?
> 
> Can we get the code? Extraordinary claims and all that...
> 

I did not test privilege level. Are you suggesting that I put the code out there for everyone to 
see or what?

* Re: Bricked x86 CPU with software?
  2018-01-06  1:08               ` Tim Mouraveiko
@ 2018-01-06 10:19                 ` Pavel Machek
  2018-01-08 15:58                   ` Tim Mouraveiko
  0 siblings, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-06 10:19 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel

Hi!

> > Is the sequence available from ring 3, or does it need ring 0?
> >
> > Can we get the code? Extraordinary claims and all that...
> >
> 
> I did not test privilege level. Are you suggesting that I put the code out there for everyone to
> see or what?

Yes, that's what I'm suggesting.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: Bricked x86 CPU with software?
  2018-01-04  0:47 Bricked x86 CPU with software? Tim Mouraveiko
  2018-01-04 20:06 ` Pavel Machek
@ 2018-01-06 15:50 ` Nikolay Borisov
  1 sibling, 0 replies; 22+ messages in thread
From: Nikolay Borisov @ 2018-01-06 15:50 UTC (permalink / raw)
  To: Tim Mouraveiko, linux-kernel



On  4.01.2018 02:47, Tim Mouraveiko wrote:
> Hi,
> 
> In all my years of extensive experience writing drivers and kernels, I never came across a situation 
> where you could brick an x86 CPU. Not until recently, when I was working on debugging a piece of 
> code and I bricked an Intel CPU. I am not talking about an experimental motherboard or anything 
> exotic or an electrical issue where the CPU got fried, but before the software code execution the CPU 
> was fine and then it's dead. There were signs that something was not right, that the code was causing 
> unusual behavior, which is what I was debugging.
> 
> Has anyone else ever experienced a bricked CPU after executing software code? I just wanted to get 
> input from the community to see if anyone had had any experience like that, since it seems rather 
> unusual to me.
> 
> Tim
> 

"Code talks, bullshit walks". Unless you share the code I don't think
anyone has any reason to believe anything you said.

* Re: Bricked x86 CPU with software?
  2018-01-06 10:19                 ` Pavel Machek
@ 2018-01-08 15:58                   ` Tim Mouraveiko
       [not found]                     ` <201801081920.21922.arekm@maven.pl>
  2018-01-08 23:32                     ` Pavel Machek
  0 siblings, 2 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-08 15:58 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

> Hi!
> 
> > > Is the sequence available from ring 3, or does it need ring 0?
> > >
> > > Can we get the code? Extraordinary claims and all that...
> > >
> > 
> > I did not test privilege level. Are you suggesting that I put the code out there for everyone to
> > see or what?
> 
> Yes, that's what I'm suggesting.
> 

That would be neither prudent nor practical.

Perhaps you did not consider the consequences. What if it is compatible with your 
processor? Would you send me a handwritten thank you card if that processor stops 
processing? Would you be a happy replacement-sale customer of Intel? I think you did not 
put much thought into why we are talking about it a year later or at all.

Unlike the now-oh-so-scary feature that was in existence for decades, that is only so scary 
because of a "clever" idea to "cloud" host different customers on bare metal, without any 
consideration to their security, this could affect real people not just oh-so-clever computer 
farmers.

* Re: Bricked x86 CPU with software?
       [not found]                     ` <201801081920.21922.arekm@maven.pl>
@ 2018-01-08 19:08                       ` Tim Mouraveiko
       [not found]                         ` <1515456557.4423.67.camel@infradead.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-08 19:08 UTC (permalink / raw)
  To: Arkadiusz Miskiewicz; +Cc: linux-kernel

> On Monday 08 of January 2018, Tim Mouraveiko wrote:
> > > Hi!
> > > 
> > > > > Is the sequence available from ring 3, or does it need ring 0?
> > > > > 
> > > > > Can we get the code? Extraordinary claims and all that...
> > > > 
> > > > I did not test privilege level. Are you suggesting that I put the code
> > > > out there for everyone to see or what?
> > > 
> > > Yes, that's what I'm suggesting.
> > 
> > That would be neither prudent nor practical.
> 
> Then send the code to proper people only
> 
> <security@kernel.org>
> 
> https://www.kernel.org/doc/html/v4.10/admin-guide/security-bugs.html

What would the purpose of that be?

I think you missed one of my posts from last week. The code has nothing to do with linux.

* Re: Bricked x86 CPU with software?
  2018-01-08 15:58                   ` Tim Mouraveiko
       [not found]                     ` <201801081920.21922.arekm@maven.pl>
@ 2018-01-08 23:32                     ` Pavel Machek
  2018-01-09  0:35                       ` Tim Mouraveiko
  1 sibling, 1 reply; 22+ messages in thread
From: Pavel Machek @ 2018-01-08 23:32 UTC (permalink / raw)
  To: Tim Mouraveiko; +Cc: linux-kernel

On Mon 2018-01-08 07:58:33, Tim Mouraveiko wrote:
> > Hi!
> >
> > > > Is the sequence available from ring 3, or does it need ring 0?
> > > >
> > > > Can we get the code? Extraordinary claims and all that...
> > > >
> > >
> > > I did not test privilege level. Are you suggesting that I put the code out there for everyone to
> > > see or what?
> >
> > Yes, that's what I'm suggesting.
> >
> 
> That would be neither prudent nor practical.
> 
> Perhaps you did not consider the consequences. What if it is compatible with your
> processor? Would you send me a handwritten thank you card if that processor stops
> processing? Would you be a happy replacement-sale customer of Intel? I think you did not
> put much thought into why we are talking about it a year later or at
> all.

Actually, yes, thank you card. Not handwritten -- plenty of CPUs here :-).

> Unlike the now-oh-so-scary feature that was in existence for decades, that is only so scary
> because of a "clever" idea to "cloud" host different customers on bare metal, without any
> consideration to their security, this could affect real people not just oh-so-clever computer
> farmers.

I don't believe you actually have a way to brick CPUs.

Yes, it is possible to brick some computers -- overwriting BIOS will
do the trick, for example; doable from ring 0. There is more firmware
that can be overwritten... That's old news. Worth mentioning on
bugtraq, so manufacturer can fix it, but...

If you had something that worked directly on CPU, that would be news;
and yes, there are fuses there, but I really doubt they can be
manipulated by software. And I believe it would be news worth more
than price of a CPU... Still not exactly dangerous. Usually data are
worth more than hardware.

Ouch, and if it worked from ring 3... That would be newsworthy. That
would be actually quite dangerous. OTOH... people did try to fuzz CPU
instruction sets, so my bet is someone would have noticed.


Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: Bricked x86 CPU with software?
  2018-01-08 23:32                     ` Pavel Machek
@ 2018-01-09  0:35                       ` Tim Mouraveiko
  0 siblings, 0 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-09  0:35 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

> On Mon 2018-01-08 07:58:33, Tim Mouraveiko wrote:
> > > Hi!
> > >
> > > > > Is the sequence available from ring 3, or does it need ring 0?
> > > > >
> > > > > Can we get the code? Extraordinary claims and all that...
> > > > >
> > > >
> > > > I did not test privilege level. Are you suggesting that I put the code out there for everyone to
> > > > see or what?
> > >
> > > Yes, that's what I'm suggesting.
> > >
> > 
> > That would be neither prudent nor practical.
> > 
> > Perhaps you did not consider the consequences. What if it is compatible with your
> > processor? Would you send me a handwritten thank you card if that processor stops
> > processing? Would you be a happy replacement-sale customer of Intel? I think you did not
> > put much thought into why we are talking about it a year later or at
> > all.
> 
> Actually, yes, thank you card. Not handwritten -- plenty of CPUs here :-).
> 
> > Unlike the now-oh-so-scary feature that was in existence for decades, that is only so scary
> > because of a "clever" idea to "cloud" host different customers on bare metal, without any
> > consideration to their security, this could affect real people not just oh-so-clever computer
> > farmers.
> 
> I don't believe you actually have a way to brick CPUs.
> 
> Yes, it is possible to brick some computers -- overwriting BIOS will
> do the trick, for example; doable from ring 0. There is more firmware
> that can be overwritten... That's old news. Worth mentioning on
> bugtraq, so manufacturer can fix it, but...
> 
> If you had something that worked directly on CPU, that would be news;
> and yes, there are fuses there, but I really doubt they can be
> manipulated by software. And I believe it would be news worth more
> than price of a CPU... Still not exactly dangerous. Usually data are
> worth more than hardware.
> 
> Ouch, and if it worked from ring 3... That would be newsworthy. That
> would be actually quite dangerous. OTOH... people did try to fuzz CPU
> instruction sets, so my bet is someone would have noticed.

You already mentioned the news part previously.

Early versions of the code would require disabling the OS by the delivery system to avoid 
interference.

* Re: Bricked x86 CPU with software?
       [not found]                         ` <1515456557.4423.67.camel@infradead.org>
@ 2018-01-09 21:48                           ` Tim Mouraveiko
  0 siblings, 0 replies; 22+ messages in thread
From: Tim Mouraveiko @ 2018-01-09 21:48 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-kernel

> On Mon, 2018-01-08 at 11:08 -0800, Tim Mouraveiko wrote:
> > 
> > 
> > I think you missed one of my posts from last week. The code has
> > nothing to do with linux.
> 
> Like the 'f00f' bug in the Pentium days, there may well be a way that a
> kernel can *prevent* the code sequence from killing the machine.
> 

Obviously preventing execution of the code or interfering with it could be a possible solution.

F00F bug, good old days, consider it from a historical perspective.

A major fear of all manufacturers is warranty and recalls. Software technology companies 
successfully killed all warranty claims through disclaimers and patches. Chip manufacturers 
had a solution, too - the OEM computer manufacturers to whom they supply just parts and 
then the OEMs interface with the customers. But that limits their profits. Now sell to
mom-and-pop shops and end-users directly - more sales and more profits. The complexity of the
chips is surging. Sooner or later they will have a costly recall. Nothing to fear, the solution is
simple - patch it in the field - engineers are saying it is dangerous. What if other
engineers discover it by accident? Wait a moment, this open source novelty offers an exciting 
opportunity - softly convince everyone to not waste time inventing - just copy it and follow 
our path. It works and marketing is happy too! Now marketing wants more products, but 
manufacturing says too expensive. Engineering to the rescue - all we need to do is enable 
and disable features to have a whole bunch of new part numbers. Problem solved! 

The memory on the processor is low-hanging fruit for increasing profitability.

They could make it harder to access it by having different protocols for different types of 
processors, but that costs more. 

Now this is an interesting question: is this a feature that opens access to just one type of 
processor or a bug that provides access to many?

Thread overview: 22+ messages
2018-01-04  0:47 Bricked x86 CPU with software? Tim Mouraveiko
2018-01-04 20:06 ` Pavel Machek
2018-01-04 21:00   ` Tim Mouraveiko
2018-01-04 21:04     ` Andy Shevchenko
2018-01-04 21:31       ` Tim Mouraveiko
2018-01-04 21:23     ` Pavel Machek
2018-01-04 22:13       ` Tim Mouraveiko
2018-01-04 22:40         ` Pavel Machek
2018-01-05  1:21           ` Tim Mouraveiko
2018-01-05  1:29             ` Hector Martin 'marcan'
2018-01-05 18:54               ` Tim Mouraveiko
2018-01-05  9:28             ` Pavel Machek
2018-01-06  1:08               ` Tim Mouraveiko
2018-01-06 10:19                 ` Pavel Machek
2018-01-08 15:58                   ` Tim Mouraveiko
     [not found]                     ` <201801081920.21922.arekm@maven.pl>
2018-01-08 19:08                       ` Tim Mouraveiko
     [not found]                         ` <1515456557.4423.67.camel@infradead.org>
2018-01-09 21:48                           ` Tim Mouraveiko
2018-01-08 23:32                     ` Pavel Machek
2018-01-09  0:35                       ` Tim Mouraveiko
2018-01-05  1:51     ` james harvey
2018-01-06  1:00       ` Tim Mouraveiko
2018-01-06 15:50 ` Nikolay Borisov
