2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* 2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add_devices bug?
@ 2008-06-18 14:18 Eran Liberty
  2008-07-20 10:31 ` [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus() Eran Liberty
                   ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Eran Liberty @ 2008-06-18 14:18 UTC (permalink / raw)
  To: linux-kernel

Short:
1. Calling pci_remove_bus_device() on some of the devices on a bus will 
remove them. (ok)
2. Calling pci_scan_child_bus() on that bus will scan the bus for 
devices and add them all. even the ones already in the bus device list, 
which will now be duplicated in the bus device list! (no so ok)
3. Calling pci_bus_add_devices() Will try to add the device, assign 
resource, and create procfs files. The devices which were not originally 
removed are were being assigned resource they don't need and procfs 
files already exist and will collide (Bug)

Background:
I have a programmable component  which implements a pci device. Upon 
warm update i need to remove my devices from the bus, upgrade, and add 
them back.
I have bashed my head against pci.h API for the last week trying to make 
this work with no success. Right now I have devised a hideous hack to 
circumvent pci.h and directly remap the pci device registers.

demo code:

reload_my_hardware_design() {
    struct pci_dev *dev = NULL;
    struct pci_bus *bus = NULL;

    while ((dev = pci_get_device(PCI_VENDOR_ID_MY_ID,PCI_ANY_ID,NULL))) {
         pci_remove_bus_device(dev);
    }

    ...
    reload the hardware
    ...

    bus = pci_find_bus(0,0); /* though hard coded in this example this 
is the right bus*/
    pci_scan_child_bus(bus);
    pci_bus_add_devices(bus);
}

Is this a bug or am I doing something wrong?

Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-06-18 14:18 2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add_devices bug? Eran Liberty
@ 2008-07-20 10:31 ` Eran Liberty
  2008-07-20 16:48   ` Eran Liberty
  2008-07-21 19:18 ` [PATCH 2.6.26] " Eran Liberty
  2008-08-18  8:08 ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Eran Liberty
  2 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-20 10:31 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

Dear Penguins,

As  a follow up on my own post ( http://lkml.org/lkml/2008/6/18/195 ), I 
have  solved a small bug/misbehavior within the pci_scan_single_device().

The misbehavior manifest itself upon calling pci_scan_child_bus() with a 
bus which already contain some devices. After scanning is done devices 
that already existed will be present twice in the PCI bus devise list. 
Once with is_added indication and once without. Trying to add this bus 
will cause a resource conflict as the same device is already present and 
initialized.

This patch will simply prevent a device to be added to a bus list if it 
is already there.

Points to consider:
1. I am not a PCI Guru and might have over looked the bigger picture. Is 
it OK to prevent a device of being re-added?
2. I have decided that two devices are, in fact, the same instance if it 
has the same: vendor, device, and devfn. Is there a finer test for a 
device identity?

p.s. As per http://www.kernel.org/pub/linux/docs/lkml/#s1-10 it is 
preferred for Mozilla Firefox clients to attach the patch, hence my 
patch is attached.

Liberty

Signed-off-by: Eran Liberty <liberty@extrictom.org>
---

[-- Attachment #2: refuse_readd_pci_device.patch --]
[-- Type: text/x-patch, Size: 675 bytes --]

Index: drivers/pci/probe.c
===================================================================
--- drivers/pci/probe.c	(revision 102)
+++ drivers/pci/probe.c	(revision 103)
@@ -996,11 +996,21 @@
 struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
 {
 	struct pci_dev *dev;
+	struct list_head *ln;

 	dev = pci_scan_device(bus, devfn);
 	if (!dev)
 		return NULL;

+	/* refuse to re-add a device */
+	list_for_each(ln, &bus->devices) {
+		struct pci_dev *d = pci_dev_b(ln);
+		if ((d->vendor == dev->vendor) &&
+		    (d->device == dev->device) &&
+		    (d->devfn == dev->devfn))
+			return NULL;
+	}
+
 	pci_device_add(dev, bus);

 	return dev;

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-20 10:31 ` [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus() Eran Liberty
@ 2008-07-20 16:48   ` Eran Liberty
  0 siblings, 0 replies; 81+ messages in thread
From: Eran Liberty @ 2008-07-20 16:48 UTC (permalink / raw)
  Cc: linux-kernel

The subject is wrong! it is 2.6.26-rc4.

I will re-email with a corrected subject for automated scripts sake.

Liberty

Eran Liberty wrote:
> Dear Penguins,
>
> As  a follow up on my own post ( http://lkml.org/lkml/2008/6/18/195 ), 
> I have  solved a small bug/misbehavior within the 
> pci_scan_single_device().
>
> The misbehavior manifest itself upon calling pci_scan_child_bus() with 
> a bus which already contain some devices. After scanning is done 
> devices that already existed will be present twice in the PCI bus 
> devise list. Once with is_added indication and once without. Trying to 
> add this bus will cause a resource conflict as the same device is 
> already present and initialized.
>
> This patch will simply prevent a device to be added to a bus list if 
> it is already there.
>
> Points to consider:
> 1. I am not a PCI Guru and might have over looked the bigger picture. 
> Is it OK to prevent a device of being re-added?
> 2. I have decided that two devices are, in fact, the same instance if 
> it has the same: vendor, device, and devfn. Is there a finer test for 
> a device identity?
>
> p.s. As per http://www.kernel.org/pub/linux/docs/lkml/#s1-10 it is 
> preferred for Mozilla Firefox clients to attach the patch, hence my 
> patch is attached.
>
> Liberty
>
> Signed-off-by: Eran Liberty <liberty@extrictom.org>
> ---


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-06-18 14:18 2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add_devices bug? Eran Liberty
  2008-07-20 10:31 ` [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus() Eran Liberty
@ 2008-07-21 19:18 ` Eran Liberty
  2008-07-21 19:49   ` Matthew Wilcox
  2008-08-18  8:08 ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Eran Liberty
  2 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-21 19:18 UTC (permalink / raw)
  To: linux-kernel, linux-pci; +Cc: Jesse Barnes, eran.liberty

[-- Attachment #1: Type: text/plain, Size: 1150 bytes --]

Dear Penguins,

As a follow up on my own post ( http://lkml.org/lkml/2008/6/18/195 ), I
have  solved a small bug/misbehavior within the pci_scan_single_device().

The misbehavior manifest itself upon calling pci_scan_child_bus() with a
bus which already contain some devices. After scanning is done devices
that already existed will be present twice in the PCI bus devise list.
Once with is_added indication and once without. Trying to add this bus
will cause a resource conflict as the same device is already present and
initialized.

This patch will simply prevent a device to be added to a pci bus list if 
it is already there.

Points to consider:
1. I am not a PCI Guru and might have over looked the bigger picture. Is
it OK to prevent a device of being re-added?
2. I have decided that two devices are, in fact, the same instance if it
has the same: vendor, device, and devfn. Is there a finer test for a
device identity?

p.s. As per http://www.kernel.org/pub/linux/docs/lkml/#s1-10 it is
preferred for Mozilla Firefox clients to attach the patch, hence my
patch is attached.

Liberty

Signed-off-by: Eran Liberty <liberty@extricom.org>
---

[-- Attachment #2: refuse_readd_pci_device.patch --]
[-- Type: text/x-patch, Size: 794 bytes --]

 drivers/pci/probe.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b1724cf..e3c3ee0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1002,11 +1002,21 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
 {
 	struct pci_dev *dev;
+	struct list_head *ln;

 	dev = pci_scan_device(bus, devfn);
 	if (!dev)
 		return NULL;

+	/* refuse to re-add a device */
+	list_for_each(ln, &bus->devices) {
+		struct pci_dev *d = pci_dev_b(ln);
+		if ((d->vendor == dev->vendor) &&
+		    (d->device == dev->device) &&
+		    (d->devfn == dev->devfn))
+			return NULL;
+	}
+
 	pci_device_add(dev, bus);

 	return dev;

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-21 19:18 ` [PATCH 2.6.26] " Eran Liberty
@ 2008-07-21 19:49   ` Matthew Wilcox
  2008-07-22  8:21     ` eran liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-21 19:49 UTC (permalink / raw)
  To: Eran Liberty; +Cc: linux-kernel, linux-pci, Jesse Barnes, eran.liberty

On Mon, Jul 21, 2008 at 10:18:19PM +0300, Eran Liberty wrote:
> Dear Penguins,
> 
> As a follow up on my own post ( http://lkml.org/lkml/2008/6/18/195 ), I
> have  solved a small bug/misbehavior within the pci_scan_single_device().

To quote your original post here for ease of replying:

> 1. Calling pci_remove_bus_device() on some of the devices on a bus will 
> remove them. (ok)
> 2. Calling pci_scan_child_bus() on that bus will scan the bus for 
> devices and add them all. even the ones already in the bus device list, 
> which will now be duplicated in the bus device list! (no so ok)

I think this is your real problem, that you're rescanning the entire
bus.  I don't think that's the route we'd recommend taking.

Why don't you call pci_scan_slot() instead?  You won't get the benefit of
pcibios_fixup_bus(), but I'm not convinced that's safe to call on a bus
that's already been scanned.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-21 19:49   ` Matthew Wilcox
@ 2008-07-22  8:21     ` eran liberty
  2008-07-22 11:49       ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: eran liberty @ 2008-07-22  8:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Eran Liberty, linux-kernel, linux-pci, Jesse Barnes

> I think this is your real problem, that you're rescanning the entire
> bus.  I don't think that's the route we'd recommend taking.
>

My stating point was that I have loaded a new design into a
programmable device which sits on the pci device. The new design can
implement numerous pci devices or non at all. I can think of an easy
way (or clean one) to scan only the programmable device. Scanning the
whole bus seemed reasonable.

> Why don't you call pci_scan_slot() instead?  You won't get the benefit of
> pcibios_fixup_bus(), but I'm not convinced that's safe to call on a bus
> that's already been scanned.
>

As said its not exactly a slot its more like a regular pci device that
someone suddenly welded into the pci bus. Its not a hotplug as well,
and I do not want to give up on the pcibios_fixup_bus()

As it is, with my patch applied i successfully go over the bus and
remove my own devices before I reprogram the
programmable device.

while ((dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,PCI_DEVICE_ID_MYDEV,NULL))
!= NULL) {
	pci_remove_bus_device(dev);
	pci_dev_put(dev);
}

Load a new design into it.

Then scan the entire bus and add the newly discovered devices.

bus = null;
while ((bus = pci_find_next_bus(bus)) != NULL) {
	pci_scan_child_bus(bus);
	pci_bus_assign_resources(bus);
	pci_bus_add_devices(bus);
}

As seen here, this sequence of instructions seems very intuitive. It
will fail without the patch upon pci_bus_add_devices().

Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22  8:21     ` eran liberty
@ 2008-07-22 11:49       ` Matthew Wilcox
  2008-07-22 13:08         ` Eran Liberty
  2008-07-22 13:14         ` Eran Liberty
  0 siblings, 2 replies; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-22 11:49 UTC (permalink / raw)
  To: eran liberty; +Cc: Eran Liberty, linux-kernel, linux-pci, Jesse Barnes

On Tue, Jul 22, 2008 at 11:21:06AM +0300, eran liberty wrote:
> > I think this is your real problem, that you're rescanning the entire
> > bus.  I don't think that's the route we'd recommend taking.
> 
> My stating point was that I have loaded a new design into a
> programmable device which sits on the pci device. The new design can
> implement numerous pci devices or non at all. I can think of an easy
> way (or clean one) to scan only the programmable device. Scanning the
> whole bus seemed reasonable.

That's what pci_scan_slot() is for.  It scans the first function at the
device number, then (if the header indicates it's a multifunction
device) scans the other functions associated with that device.  eg you
could call pci_scan_slot(bus, 0x30) and it will create function 06.0
(and potentially 06.1, 06.2, ...)

You presumably already have the devfn for the existing device since
you're able to call pci_remove_bus_device().

> > Why don't you call pci_scan_slot() instead?  You won't get the benefit of
> > pcibios_fixup_bus(), but I'm not convinced that's safe to call on a bus
> > that's already been scanned.
> 
> As said its not exactly a slot its more like a regular pci device that
> someone suddenly welded into the pci bus. Its not a hotplug as well,
> and I do not want to give up on the pcibios_fixup_bus()

Why not?  What architecture are you using?  What does
pcibios_fixup_bus() do for you?

(as a side-note, I'd like to reimplement the pcibios_fixup_*() routines;
I think a lot of what they do can be done more generically these days.
It'll take a while and isn't high on my priority list).

> As it is, with my patch applied i successfully go over the bus and
> remove my own devices before I reprogram the
> programmable device.
> 
> while ((dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,PCI_DEVICE_ID_MYDEV,NULL))
> != NULL) {
> 	pci_remove_bus_device(dev);
> 	pci_dev_put(dev);
> }
> 
> Load a new design into it.
> 
> Then scan the entire bus and add the newly discovered devices.
> 
> bus = null;
> while ((bus = pci_find_next_bus(bus)) != NULL) {
> 	pci_scan_child_bus(bus);
> 	pci_bus_assign_resources(bus);
> 	pci_bus_add_devices(bus);
> }
> 
> As seen here, this sequence of instructions seems very intuitive. It
> will fail without the patch upon pci_bus_add_devices().

Seems utterly unintuitive to me.  You're doing a lot of unnecessary work
here, and if you have two cards in your machine, you'll take away both
of them when you reload either of them.

What you should do is cache the pci_bus and the devfn at startup:

static struct pci_bus *my_bus;
static int my_devfn;

	struct pci_dev *dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,
					PCI_DEVICE_ID_MYDEV, NULL);
	if (!dev)
		return -ENODEV;
	my_bus = dev->bus;
	my_devfn = dev->devfn;
	pci_dev_put(dev);

when you want to remove it:

	for (func = 0; func < 8; func++)
		struct pci_dev *dev = pci_get_slot(my_bus, my_devfn + func);
		if (!dev)
			continue;
		pci_remove_bus_device(dev);
		pci_dev_put(dev);
	}

when you want to rescan it:

	pci_scan_slot(my_bus, my_devfn);

(this only handles one programmable card.  The basic idea could be
extended to handle multiple cards if you need to do that).

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 11:49       ` Matthew Wilcox
@ 2008-07-22 13:08         ` Eran Liberty
  2008-07-22 13:14         ` Eran Liberty
  1 sibling, 0 replies; 81+ messages in thread
From: Eran Liberty @ 2008-07-22 13:08 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

Matthew,

You seem to have a finer grasp of the subject then I do, please correct/educate me on any of the points I raise in the following lines.

Matthew Wilcox wrote:

> On Tue, Jul 22, 2008 at 11:21:06AM +0300, eran liberty wrote:
>>> I think this is your real problem, that you're rescanning the entire
>>> bus.  I don't think that's the route we'd recommend taking.
>> My stating point was that I have loaded a new design into a
>> programmable device which sits on the pci device. The new design can
>> implement numerous pci devices or non at all. I can think of an easy
>> way (or clean one) to scan only the programmable device. Scanning the
>> whole bus seemed reasonable.
> That's what pci_scan_slot() is for.  It scans the first function at the
> device number, then (if the header indicates it's a multifunction
> device) scans the other functions associated with that device.  eg you
> could call pci_scan_slot(bus, 0x30) and it will create function 06.0
> (and potentially 06.1, 06.2, ...)
> You presumably already have the devfn for the existing device since
> you're able to call pci_remove_bus_device().
Each slot represent a single device which can have more then one function.  pci_scan_slot is aimed for scanning these multiple functions. 
I, on the other hand, have programmable device on the pci bus which is, for the sake of this discussion, a complete black box. 
This black box up on loading can implement more then one device, which can have more then one function each. 
So as far as I see it, now I need to scan all slots on the bus.

But to be honest, upon looking a way to make my device work I dismissed the "pci_scan_slot()" option as It did not reach the "fixup_resource <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=fixup_resource>()" part.

>>> Why don't you call pci_scan_slot() instead?  You won't get the benefit of
>>> pcibios_fixup_bus(), but I'm not convinced that's safe to call on a bus
>>> that's already been scanned.
>>>       
>> As said its not exactly a slot its more like a regular pci device that
>> someone suddenly welded into the pci bus. Its not a hotplug as well,
>> and I do not want to give up on the pcibios_fixup_bus()
>>     
>
> Why not?  What architecture are you using?  What does
> pcibios_fixup_bus() do for you?
>   
I work with ARCH=powerpc. pcibios_fixup_bus() will deal with all the resource bars allocation. 
I needed Linux to renegotiate the resources bars on the PCI devices.

> (as a side-note, I'd like to reimplement the pcibios_fixup_*() routines;
> I think a lot of what they do can be done more generically these days.
> It'll take a while and isn't high on my priority list).
>   
If I can lend a hand there, let me know and I will try to squeeze it in somewhere.

>   
>> As it is, with my patch applied i successfully go over the bus and
>> remove my own devices before I reprogram the
>> programmable device.
>>
>> while ((dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,PCI_DEVICE_ID_MYDEV,NULL))
>> != NULL) {
>> 	pci_remove_bus_device(dev);
>> 	pci_dev_put(dev);
>> }
>>
>> Load a new design into it.
>>
>> Then scan the entire bus and add the newly discovered devices.
>>
>> bus = null;
>> while ((bus = pci_find_next_bus(bus)) != NULL) {
>> 	pci_scan_child_bus(bus);
>> 	pci_bus_assign_resources(bus);
>> 	pci_bus_add_devices(bus);
>> }
>>
>> As seen here, this sequence of instructions seems very intuitive. It
>> will fail without the patch upon pci_bus_add_devices().
>>     
>
> Seems utterly unintuitive to me.  You're doing a lot of unnecessary work
> here, and if you have two cards in your machine, you'll take away both
> of them when you reload either of them.
>   
Hmmm, I do want to remove all the devices that are implemented by the programmable unit which is reloaded. 
I have not considered the possibility of having more then one programmable unit. 
I guess that the removing part can be more fine tuned as the need arises. 

> What you should do is cache the pci_bus and the devfn at startup:
>
> static struct pci_bus *my_bus;
> static int my_devfn;
>
> 	struct pci_dev *dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,
> 					PCI_DEVICE_ID_MYDEV, NULL);
> 	if (!dev)
> 		return -ENODEV;
> 	my_bus = dev->bus;
> 	my_devfn = dev->devfn;
> 	pci_dev_put(dev);
>
> when you want to remove it:
>
> 	for (func = 0; func < 8; func++)
> 		struct pci_dev *dev = pci_get_slot(my_bus, my_devfn + func);
> 		if (!dev)
> 			continue;
> 		pci_remove_bus_device(dev);
> 		pci_dev_put(dev);
> 	}
>
> when you want to rescan it:
>
> 	pci_scan_slot(my_bus, my_devfn);
>
> (this only handles one programmable card.  The basic idea could be
> extended to handle multiple cards if you need to do that).
>   
I think there is a hidden assumption in this code, again please correct me if I missed the point. 
This code assumes that the devices which will re-appear after the programmable unit is loaded has the same devfn and bus as the devices which were present before the reload. 
This assumption might be wrong. 

For example, I have a basic programmable image which has no pci devices at all. 
upon unloading I do not remove any device (as non are present) and up on reloading I suddenly have two. What is their bus? their devfn?

Ultimately I would have expected to find a "int pci_scan_bus(struct  <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_scan_bus>pci_bus <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_bus> *bus <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=bus>);" the "pci_scan_child_bus <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_scan_child_bus>()" was the closest to the mark

Liberty


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 11:49       ` Matthew Wilcox
  2008-07-22 13:08         ` Eran Liberty
@ 2008-07-22 13:14         ` Eran Liberty
  2008-07-22 14:13           ` Matthew Wilcox
  1 sibling, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-22 13:14 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

WOW, sorry for the html tags (though I had this Thunderbird under 
control). Here is the same mail without html tags (hopeully)


Matthew,
You seem to have a finer grasp of the subject then I do, please 
correct/educate me on any of the points I raise in the following lines.

Matthew Wilcox wrote:

> On Tue, Jul 22, 2008 at 11:21:06AM +0300, eran liberty wrote:
>>> I think this is your real problem, that you're rescanning the entire
>>> bus.  I don't think that's the route we'd recommend taking.
>> My stating point was that I have loaded a new design into a
>> programmable device which sits on the pci device. The new design can
>> implement numerous pci devices or non at all. I can think of an easy
>> way (or clean one) to scan only the programmable device. Scanning the
>> whole bus seemed reasonable.
> That's what pci_scan_slot() is for.  It scans the first function at the
> device number, then (if the header indicates it's a multifunction
> device) scans the other functions associated with that device.  eg you
> could call pci_scan_slot(bus, 0x30) and it will create function 06.0
> (and potentially 06.1, 06.2, ...)
> You presumably already have the devfn for the existing device since
> you're able to call pci_remove_bus_device().
Each slot represent a single device which can have more then one 
function.  pci_scan_slot is aimed for scanning these multiple functions.
I, on the other hand, have programmable device on the pci bus which is, 
for the sake of this discussion, a complete black box.
This black box up on loading can implement more then one device, which 
can have more then one function each.
So as far as I see it, now I need to scan all slots on the bus.

But to be honest, upon looking a way to make my device work I dismissed 
the "pci_scan_slot()" option as It did not reach the "fixup_resource ()" 
part.

>>> Why don't you call pci_scan_slot() instead?  You won't get the benefit of
>>> pcibios_fixup_bus(), but I'm not convinced that's safe to call on a bus
>>> that's already been scanned.
>>>       
>> As said its not exactly a slot its more like a regular pci device that
>> someone suddenly welded into the pci bus. Its not a hotplug as well,
>> and I do not want to give up on the pcibios_fixup_bus()
>>     
>
> Why not?  What architecture are you using?  What does
> pcibios_fixup_bus() do for you?
>   
I work with ARCH=powerpc. pcibios_fixup_bus() will deal with all the 
resource bars allocation.
I needed Linux to renegotiate the resources bars on the PCI devices.

> (as a side-note, I'd like to reimplement the pcibios_fixup_*() routines;
> I think a lot of what they do can be done more generically these days.
> It'll take a while and isn't high on my priority list).
>   
If I can lend a hand there, let me know and I will try to squeeze it in 
somewhere.

>   
>> As it is, with my patch applied i successfully go over the bus and
>> remove my own devices before I reprogram the
>> programmable device.
>>
>> while ((dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,PCI_DEVICE_ID_MYDEV,NULL))
>> != NULL) {
>> 	pci_remove_bus_device(dev);
>> 	pci_dev_put(dev);
>> }
>>
>> Load a new design into it.
>>
>> Then scan the entire bus and add the newly discovered devices.
>>
>> bus = null;
>> while ((bus = pci_find_next_bus(bus)) != NULL) {
>> 	pci_scan_child_bus(bus);
>> 	pci_bus_assign_resources(bus);
>> 	pci_bus_add_devices(bus);
>> }
>>
>> As seen here, this sequence of instructions seems very intuitive. It
>> will fail without the patch upon pci_bus_add_devices().
>>     
>
> Seems utterly unintuitive to me.  You're doing a lot of unnecessary work
> here, and if you have two cards in your machine, you'll take away both
> of them when you reload either of them.
>   
Hmmm, I do want to remove all the devices that are implemented by the 
programmable unit which is reloaded.
I have not considered the possibility of having more then one 
programmable unit.
I guess that the removing part can be more fine tuned as the need arises.

> What you should do is cache the pci_bus and the devfn at startup:
>
> static struct pci_bus *my_bus;
> static int my_devfn;
>
> 	struct pci_dev *dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,
> 					PCI_DEVICE_ID_MYDEV, NULL);
> 	if (!dev)
> 		return -ENODEV;
> 	my_bus = dev->bus;
> 	my_devfn = dev->devfn;
> 	pci_dev_put(dev);
>
> when you want to remove it:
>
> 	for (func = 0; func < 8; func++)
> 		struct pci_dev *dev = pci_get_slot(my_bus, my_devfn + func);
> 		if (!dev)
> 			continue;
> 		pci_remove_bus_device(dev);
> 		pci_dev_put(dev);
> 	}
>
> when you want to rescan it:
>
> 	pci_scan_slot(my_bus, my_devfn);
>
> (this only handles one programmable card.  The basic idea could be
> extended to handle multiple cards if you need to do that).
>   
I think there is a hidden assumption in this code, again please correct 
me if I missed the point.
This code assumes that the devices which will re-appear after the 
programmable unit is loaded has the same devfn and bus as the devices 
which were present before the reload.
This assumption might be wrong.

For example, I have a basic programmable image which has no pci devices 
at all.
upon unloading I do not remove any device (as non are present) and up on 
reloading I suddenly have two. What is their bus? their devfn?

Ultimately I would have expected to find a "int pci_scan_bus(struct 
pci_bus *bus );" the "pci_scan_child_bus ()" was the closest to the mark

Liberty


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 13:14         ` Eran Liberty
@ 2008-07-22 14:13           ` Matthew Wilcox
  2008-07-22 15:25             ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-22 14:13 UTC (permalink / raw)
  To: Eran Liberty; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

On Tue, Jul 22, 2008 at 04:14:41PM +0300, Eran Liberty wrote:
> Each slot represent a single device which can have more then one 
> function.  pci_scan_slot is aimed for scanning these multiple functions.

Exactly.

> I, on the other hand, have programmable device on the pci bus which is, 
> for the sake of this discussion, a complete black box.
> This black box up on loading can implement more then one device, which 
> can have more then one function each.
> So as far as I see it, now I need to scan all slots on the bus.

That's quite interesting.  A normal PCI device has its IDSEL pin
connected to one of the AD lines, so it's not possible for it to respond
to more then one of the 32 possible devices.  I suppose by decoding
the configuration cycles on the bus, it's possible to make a
programmable device respond to more than one device.  I suppose you have
to be careful to program it to not respond to the same device numbers as
any other device on the bus.  Or are there no other devices on this bus
(in which case I have another solution for you)?

> But to be honest, upon looking a way to make my device work I dismissed 
> the "pci_scan_slot()" option as It did not reach the "fixup_resource ()" 
> part.
[...]
> I work with ARCH=powerpc. pcibios_fixup_bus() will deal with all the 
> resource bars allocation.
> I needed Linux to renegotiate the resources bars on the PCI devices.

OK.  I don't think it's safe to call powerpc's fixup_resource() more
than once for a given resource.

> >(as a side-note, I'd like to reimplement the pcibios_fixup_*() routines;
> >I think a lot of what they do can be done more generically these days.
> >It'll take a while and isn't high on my priority list).
> >  
> If I can lend a hand there, let me know and I will try to squeeze it in 
> somewhere.

What I want to do is get rid of fixup_resource() and its equivalent on
other architectures and call pcibios_bus_to_resource() from
drivers/pci/probe.c.

I've been working in this area recently; here's my current tree:

http://git.kernel.org/?p=linux/kernel/git/willy/misc.git;a=shortlog;h=pci-read-base

I've had a quick hack at doing what I want, but I haven't tested it.

> I think there is a hidden assumption in this code, again please correct 
> me if I missed the point.
> This code assumes that the devices which will re-appear after the 
> programmable unit is loaded has the same devfn and bus as the devices 
> which were present before the reload.
> This assumption might be wrong.

Yes, I did make that assumption ... I'm used to real PCI devices, not
this unusual box ;-)

> For example, I have a basic programmable image which has no pci devices 
> at all.
> upon unloading I do not remove any device (as non are present) and up on 
> reloading I suddenly have two. What is their bus? their devfn?
> 
> Ultimately I would have expected to find a "int pci_scan_bus(struct 
> pci_bus *bus );" the "pci_scan_child_bus ()" was the closest to the mark

If this is all alone on a bus of its own, we can certainly handle that.
Maybe we can make it a module parameter or something.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 14:13           ` Matthew Wilcox
@ 2008-07-22 15:25             ` Eran Liberty
  2008-07-22 16:52               ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-22 15:25 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

Matthew Wilcox wrote:
> If this is all alone on a bus of its own, we can certainly handle that.
> Maybe we can make it a module parameter or something.
>   
If my devices where the only one on the bus I would not have stumbled on 
this bug. I would have removed them all before reloading and would have 
scanned and empty PCI bus list after the reloading.

Alas, the CPU itself is a pci device. Upon scanning the bus it is it who 
is duplicated. Upon trying to "pci_bus_add_devices()" you get resource 
collision.

Do you perceive any down side to my suggested patch?
It seems (to me) really logical that a device can not and should not be 
added twice to a pci bus device list.

Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 15:25             ` Eran Liberty
@ 2008-07-22 16:52               ` Matthew Wilcox
  2008-07-22 17:41                 ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-22 16:52 UTC (permalink / raw)
  To: Eran Liberty; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

On Tue, Jul 22, 2008 at 06:25:40PM +0300, Eran Liberty wrote:
> If my devices where the only one on the bus I would not have stumbled on 
> this bug. I would have removed them all before reloading and would have 
> scanned and empty PCI bus list after the reloading.

That's not true, the code you posted:

bus = null;
while ((bus = pci_find_next_bus(bus)) != NULL) {
        pci_scan_child_bus(bus);
        pci_bus_assign_resources(bus);
        pci_bus_add_devices(bus);
}

will scan all busses again.

> Alas, the CPU itself is a pci device. Upon scanning the bus it is it who 
> is duplicated. Upon trying to "pci_bus_add_devices()" you get resource 
> collision.

Does the CPU have BARs of its own?  If you have no other devices with
BARs, it would explain why you have not noticed the problem with
fixup_resource ;-)

> Do you perceive any down side to my suggested patch?
> It seems (to me) really logical that a device can not and should not be 
> added twice to a pci bus device list.

The problem is the other side-effects of what you propose.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 16:52               ` Matthew Wilcox
@ 2008-07-22 17:41                 ` Eran Liberty
  2008-07-22 18:11                   ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-22 17:41 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

Matthew Wilcox wrote:
> On Tue, Jul 22, 2008 at 06:25:40PM +0300, Eran Liberty wrote:
>   
>> If my devices where the only one on the bus I would not have stumbled on 
>> this bug. I would have removed them all before reloading and would have 
>> scanned and empty PCI bus list after the reloading.
>>     
>
> That's not true, the code you posted:
>
> bus = null;
> while ((bus = pci_find_next_bus(bus)) != NULL) {
>         pci_scan_child_bus(bus);
>         pci_bus_assign_resources(bus);
>         pci_bus_add_devices(bus);
> }
>
> will scan all busses again.
>   
You are correct! I have a single bus :)

>   
>> Alas, the CPU itself is a pci device. Upon scanning the bus it is it who 
>> is duplicated. Upon trying to "pci_bus_add_devices()" you get resource 
>> collision.
>>     
>
> Does the CPU have BARs of its own?  If you have no other devices with
> BARs, it would explain why you have not noticed the problem with
> fixup_resource ;-)
>   
I seriously doubt my CPU has any resources of its own.

>   
>> Do you perceive any down side to my suggested patch?
>> It seems (to me) really logical that a device can not and should not be 
>> added twice to a pci bus device list.
>>     
>
> The problem is the other side-effects of what you propose.
>   
Could you elaborate what ominous side-effects you predict?

Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 17:41                 ` Eran Liberty
@ 2008-07-22 18:11                   ` Matthew Wilcox
  2008-07-23 18:31                     ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-22 18:11 UTC (permalink / raw)
  To: Eran Liberty; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

On Tue, Jul 22, 2008 at 08:41:00PM +0300, Eran Liberty wrote:
> >>Do you perceive any down side to my suggested patch?
> >>It seems (to me) really logical that a device can not and should not be 
> >>added twice to a pci bus device list.
> >>    
> >
> >The problem is the other side-effects of what you propose.
> >  
> Could you elaborate what ominous side-effects you predict?

Look at the consequences of calling fixup_resource() twice on the same
resource:

        res->start = (res->start + offset) & mask;
        res->end = (res->end + offset) & mask;

You'll add 'offset' to the other resources twice.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-22 18:11                   ` Matthew Wilcox
@ 2008-07-23 18:31                     ` Eran Liberty
  2008-07-27 11:01                       ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-23 18:31 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

Matthew Wilcox wrote:

> Look at the consequences of calling fixup_resource() twice on the same
> resource:
>
>         res->start = (res->start + offset) & mask;
>         res->end = (res->end + offset) & mask;
>
> You'll add 'offset' to the other resources twice.
>   
Tring to find heads and tails in the code, I dug in. This is what I understand from the overall flow... and my points of interests:

1. pci_scan_child_bus() starts with iterating over all the devfn and scanning for a device with that devfn

drivers/pci/probe.c
1056 unsigned int __devinit pci_scan_child_bus(struct pci_bus *bus)
1057 {
1063         /* Go find them, Rover! */
1064         for (devfn = 0; devfn < 0x100; devfn += 8)
1065                 pci_scan_slot(bus, devfn);

2. pci_scan_slot() will continue to scan all the functions that devfn might have to over
drivers/pci/probe.c
1019 int pci_scan_slot(struct pci_bus *bus, int devfn)
1020 {
1026         for (func = 0; func < 8; func++, devfn++) {
1029                 dev = pci_scan_single_device(bus, devfn);

3. pci_scan_single_device() will scan / test if this dev is valid.
drivers/pci/probe.c
996 struct pci_dev *__ref pci_scan_single_device(struct pci_bus *bus, int devfn)
997 {
998         struct pci_dev *dev;
999 
1000         dev = pci_scan_device(bus, devfn);

and add it. REGARDLESS if it is already on the pci bus list or not.
1001         if (!dev)
1002                 return NULL;
1003 
1004         pci_device_add(dev, bus);
1005 
1006         return dev;
1007 }

This is the first thing that needs to be fixed IMHO.

4. pci_scan_child_bus() will go on to fixup the bus whether it needs fixing or not.
drivers/pci/probe.c
1072         pcibios_fixup_bus(bus);

This might be the second thing that needs amending.

5.  pcibios_fixup_bus(bus) calls _pcibios_fixup_bus which will call fixup_resource() once per a bus resource (in contrast of per device on that bus) and fixes the resources of that bus.
arch/powerpc/kernel/pci-common.c	
784 static void __devinit __pcibios_fixup_bus(struct pci_bus *bus)
785 {
787         struct pci_dev *dev = bus->self;
798                 for (i = 0; i < PCI_BUS_NUM_RESOURCES; ++i) {
833                         fixup_resource(res, dev);
834                 }
835         }
850 }

As you have correctly observed without other changes this would cause trouble on a bus that has resources.
Since i am not pulling the plug on the whole bus just on selected devices I can defensively skip this part

6. Now I should be able to call pci_bus_assign_resources(bus) which will go over all the newly added device and assign resource to them. 

7. Last step I should be able to pci_bus_add_devices(bus) the, now, resource fixed devices.

Steps 6 and 7 are the one I miss most over which my device wont work after being re-born.

Soooo ...

If I really wanted to make it, working my way around the current code, I would have done something like this.

bus = null;
while ((bus = pci_find_next_bus(bus)) != NULL) {
	for (devfn = 0; devfn < 0x100; devfn += 8) {
		for (func = 0; func < 8; func++) {
			struct pci_dev *dev =  <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_get_slot>pci_get_slot(struct pci_bus *bus, unsigned int devfn);
			if (dev) 
				continue;
			dev = pci_scan_single_device(bus, devfn);
			if (!dev)
				continue;
			pci_device_add(dev, bus);
		}
	}
        pci_bus_assign_resources(bus);
        pci_bus_add_devices(bus);
}

WOW.... first time for me to code in Mozilla Thunderbird.

Had I was this smart to begin with I would have done exactly that and go home to sleep like a decent person. 
But since we are here, I feel there should be a either correction in the existing code to allow the:
pci_scan_child_bus(bus) -> pci_bus_assign_resources(bus) -> pci_bus_add_devices(bus) sequence.
Or a new helper function to the PCI that does more or less what I have just described.

If you want I can code it, patch it, and rip the glory :)

That was a long one :)

Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-23 18:31                     ` Eran Liberty
@ 2008-07-27 11:01                       ` Eran Liberty
  2008-07-27 15:08                         ` Matthew Wilcox
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-07-27 11:01 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Matthew Wilcox, eran liberty, linux-kernel, linux-pci, Jesse Barnes

Eran Liberty wrote:

> Soooo ...
>   
> If I really wanted to make it, working my way around the current code, I would have done something like this.
>   
> bus = null;
>   
> while ((bus = pci_find_next_bus(bus)) != NULL) {
>   
>     for (devfn = 0; devfn < 0x100; devfn += 8) {
>   
>         for (func = 0; func < 8; func++) {
>   
>             struct pci_dev *dev =  <http://liberty/lxr/ident?v=e500-linux-2.6.26-rc4;i=pci_get_slot>pci_get_slot(struct pci_bus *bus, unsigned int devfn);
>   
>             if (dev) 
>                 continue;
>   
>             dev = pci_scan_single_device(bus, devfn);
>   
>             if (!dev)
>   
>                 continue;
>   
>             pci_device_add(dev, bus);
>   
>         }
>   
>     }
>   
>        pci_bus_assign_resources(bus);
>   
>        pci_bus_add_devices(bus);
>   
> }
>   
Found some time to test my own idea... it seems to work. :)

Its final version looks like this:

Before reloading the programmable unit, remove all the device implemented by it

	struct pci_dev *dev;
	while ((dev = pci_get_device(PCI_VENDOR_ID_MYCOMP,PCI_DEVICE_ID_MYCOMP_MYDEV,NULL)) != NULL) {
		pci_remove_bus_device(dev);
		pci_dev_put(dev);
	}

After the programmable unit is loaded, scan and add only newly added device!

 	struct pci_bus *bus = NULL;
	while ((bus = pci_find_next_bus(bus)) != NULL) {
		int devfn;
		for (devfn = 0; devfn < 0x100; devfn++) {
			struct pci_dev *dev=pci_get_slot(bus,devfn);
			if (dev) 
				continue;  // <---- do not scan devices already present on the bus. Missing magic line :)
			(void)pci_scan_single_device(bus, devfn);
		}
		pci_bus_assign_resources(bus);
		pci_bus_add_devices(bus);
	}

So... no need for patch after all.
Unless... Hey whats this... behind it... on the side.... naaa, no need of patch... :(

Liberty


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus()
  2008-07-27 11:01                       ` Eran Liberty
@ 2008-07-27 15:08                         ` Matthew Wilcox
  0 siblings, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2008-07-27 15:08 UTC (permalink / raw)
  To: Eran Liberty; +Cc: eran liberty, linux-kernel, linux-pci, Jesse Barnes

On Sun, Jul 27, 2008 at 02:01:04PM +0300, Eran Liberty wrote:
> Found some time to test my own idea... it seems to work. :)

Is there a reason you haven't tested the tree I posted?

: I've been working in this area recently; here's my current tree:

: http://git.kernel.org/?p=linux/kernel/git/willy/misc.git;a=shortlog;h=pci-read-base

: I've had a quick hack at doing what I want, but I haven't tested it.


-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

^ permalink raw reply	[flat|nested] 81+ messages in thread

* ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-06-18 14:18 2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add_devices bug? Eran Liberty
  2008-07-20 10:31 ` [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus() Eran Liberty
  2008-07-21 19:18 ` [PATCH 2.6.26] " Eran Liberty
@ 2008-08-18  8:08 ` Eran Liberty
  2008-08-18 15:07   ` Steven Rostedt
  2 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-18  8:08 UTC (permalink / raw)
  To: srostedt; +Cc: linux-kernel

After compiling a kernel with ftrace I started to experience all sorts 
of crashes.
I have a powerpc env which closly resemble mpc8548cds_defconfig.

I have produced the crash in seconds by issuing:
 > while [ 1 ] ; do find / > /dev/null ; echo .  ; done

Liberty

here are some of the kernel crashes:
---
~ # find /proc/ -name kft
find: /proc/150/net/softnet_stat: No such file or direOops: Exception in 
kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in:
NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
REGS: ddfc3d50 TRAP: 0700   Not tainted  (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 20002284  XER: 20000000
TASK = ddcccde0[1699] 'find' THREAD: ddfc2000
GPR00: 00000000 ddfc3e00 ddcccde0 dd895580 ddfc3e30 c00b5da0 ddfc3e90 
00000003
GPR08: c00ece54 00000370 00145505 00000003 80002282 100ad874 10083ca0 
10083cd8
GPR16: 10083cd0 00000008 00000000 c00b5da0 ddfc3f18 c00ece54 00000000 
df465720
GPR24: d7431900 ddfc3e30 dd895580 c0380000 dd895580 ddfc3e30 00000000 
ddfc3e00
NIP [c00bd6fc] d_lookup+0x40/0x90
LR [c00bd6fc] d_lookup+0x40/0x90
Call Trace:
[ddfc3e00] [c006ade4] rcu_process_callbacks+0x48/0x60 (unreliable)
[ddfc3e20] [c00eb514] proc_fill_cache+0x94/0x1b0
[ddfc3e80] [c00ef720] proc_task_readdir+0x294/0x3c4
[ddfc3ee0] [c00b60fc] vfs_readdir+0xb8/0xd8
[ddfc3f10] [c00b619c] sys_getdents64+0x80/0x104
[ddfc3f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
ctory
---[ end trace 6b059a7f5ba5a55a ]---

---

Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in: gplonly_api
NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
REGS: df52dc40 TRAP: 0700   Not tainted  (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 20082422  XER: 20000000
TASK = ddc2e060[2294] 'eeprom' THREAD: df52c000
GPR00: 00000000 df52dcf0 ddc2e060 dd82c700 df52dd58 df52dd48 dd82c75e 
00000000
GPR08: c0800000 000311f0 0000ffff df52dd10 20002488 1001e75c 100b0000 
1008e90c
GPR16: bfd15a50 df52de6c df52dd58 fffffff4 c0380000 df52dd50 df52dd48 
dd9303fc
GPR24: dc3a6800 dd930390 df52dd58 c0380000 dd82c700 df52dd58 00000006 
df52dcf0
NIP [c00bd6fc] d_lookup+0x40/0x90
LR [c00bd6fc] d_lookup+0x40/0x90
Call Trace:
[df52dcf0] [df52dd48] 0xdf52dd48 (unreliable)
[df52dd10] [c00b07a0] do_lookup+0xe8/0x220
[df52dd40] [c00b222c] __link_path_walk+0x174/0xd54
[df52ddb0] [c00b2e64] path_walk+0x58/0xe0
[df52dde0] [c00b2fd4] do_path_lookup+0x78/0x13c
[df52de10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
[df52de40] [c00b3fd8] path_lookup_open+0x28/0x40
[df52de50] [c00b4234] do_filp_open+0xa0/0x798
[df52df00] [c00a3b80] do_sys_open+0x70/0x114
[df52df30] [c00a3c98] sys_open+0x38/0x50
[df52df40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
om_version(Canno---[ end trace 0cef3c8cb49d09de ]---

---

Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
shm_freescale(P) ath_remote_regs_freescale(P) exsw 
load_local_fpga_freescale 8021q poe_fi
NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
REGS: dd555c50 TRAP: 0700   Tainted: P           (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
TASK = df4669a0[4976] 'find' THREAD: dd554000
GPR00: 00000000 dd555d00 df4669a0 dd82ae00 dd555d68 dd555d58 dd82ae5b 
100234ec
GPR08: c0800000 0002dd0c 0000ffff dd555d20 24000288 100ad874 100936f8 
1008a1d0
GPR16: 10083f80 dd555e2c dd555d68 fffffff4 c0380000 dd555d60 dd555d58 
dd9565d4
GPR24: dc3db480 dd956568 dd555d68 c0380000 dd82ae00 dd555d68 00000000 
dd555d00
NIP [c00bd6fc] d_lookup+0x40/0x90
LR [c00bd6fc] d_lookup+0x40/0x90
Call Trace:
[dd555d00] [dd555d58] 0xdd555d58 (unreliable)
[dd555d20] [c00b07a0] do_lookup+0xe8/0x220
[dd555d50] [c00b265c] __link_path_walk+0x5a4/0xd54
[dd555dc0] [c00b2e64] path_walk+0x58/0xe0
[dd555df0] [c00b2fd4] do_path_lookup+0x78/0x13c
[dd555e20] [c00b3cd0] user_path_at+0x64/0xac
[dd555e90] [c00aac04] vfs_lstat_fd+0x34/0x74
[dd555ec0] [c00aacd8] vfs_lstat+0x30/0x48
[dd555ed0] [c00aad20] sys_lstat64+0x30/0x5c
[dd555f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
76/auxv
/proc/4---[ end trace 0deef0827ce9df4b ]---

---
Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
shm_freescale(P) ath_remote_regs_freescale(P) exsw 
load_local_fpga_freescale 8021q poe_fi
NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
REGS: d57e7c40 TRAP: 0700   Tainted: P           (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 24482482  XER: 20000000
TASK = db3cc940[2054] 'eapd' THREAD: d57e6000
GPR00: 00000000 d57e7cf0 db3cc940 dd82a500 d57e7d58 d57e7d48 dd82a55a 
00000000
GPR08: c0800000 00016c64 0000ffff d57e7d10 24422488 1008f700 4802cd38 
100763e8
GPR16: 4801d818 d57e7e6c d57e7d58 fffffff4 c0380000 d57e7d50 d57e7d48 
dd956c74
GPR24: dc3db680 dd956c08 d57e7d58 c0380000 dd82a500 d57e7d58 00000000 
d57e7cf0
NIP [c00bd6fc] d_lookup+0x40/0x90
LR [c00bd6fc] d_lookup+0x40/0x90
Call Trace:
[d57e7cf0] [d57e7d48] 0xd57e7d48 (unreliable)
[d57e7d10] [c00b07a0] do_lookup+0xe8/0x220
[d57e7d40] [c00b265c] __link_path_walk+0x5a4/0xd54
[d57e7db0] [c00b2e64] path_walk+0x58/0xe0
[d57e7de0] [c00b2fd4] do_path_lookup+0x78/0x13c
[d57e7e10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
[d57e7e40] [c00b3fd8] path_lookup_open+0x28/0x40
[d57e7e50] [c00b4234] do_filp_open+0xa0/0x798
[d57e7f00] [c00a3b80] do_sys_open+0x70/0x114
[d57e7f30] [c00a3c98] sys_open+0x38/0x50
[d57e7f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
---[ end trace ffb4a6e37d78370a ]---

---
Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
shm_freescale(P) ath_remote_regs_freescale(P) exsw 
load_local_fpga_freescale 8021q poe_fi
NIP: c00bd6fc LR: c00bd6fc CTR: c00bd6c8
REGS: decd5cf0 TRAP: 0700   Tainted: P           (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 22228882  XER: 20000000
TASK = d9f5a9a0[6897] 'sh' THREAD: decd4000
GPR00: 00000000 decd5da0 d9f5a9a0 dd801180 decd5de8 00000000 00000004 
fffffff9
GPR08: fffffffa 00000000 00d5ca28 decd5d90 2ccc1686 100ad874 1fffb200 
00000000
GPR16: 007fff00 decd5ec4 00000008 c02fe298 00200200 00000000 c031aa34 
decd5de8
GPR24: decd5df4 00001af2 dd507d00 c0380000 dd801180 decd5de8 00000000 
decd5da0
NIP [c00bd6fc] d_lookup+0x40/0x90
LR [c00bd6fc] d_lookup+0x40/0x90
Call Trace:
[decd5dc0] [c00bd7d8] d_hash_and_lookup+0x8c/0xc4
[decd5de0] [c00ed728] proc_flush_task+0xb4/0x268
[decd5e40] [c0032894] release_task+0x6c/0x39c
[decd5e70] [c0033238] wait_consider_task+0x674/0x920
[decd5eb0] [c0033610] do_wait+0x12c/0x3d4
[decd5f20] [c0033954] sys_wait4+0x9c/0xe8
[decd5f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
---[ end trace 00106ff4ec3b9c22 ]---





^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18  8:08 ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Eran Liberty
@ 2008-08-18 15:07   ` Steven Rostedt
  2008-08-18 15:47     ` Mathieu Desnoyers
                       ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 15:07 UTC (permalink / raw)
  To: Eran Liberty
  Cc: linux-kernel, Mathieu Desnoyers, Paul E. McKenney, linuxppc-dev, rostedt

Eran Liberty wrote:
> After compiling a kernel with ftrace I started to experience all sorts 
> of crashes.

Just to make sure...

ftrace enables markers too, and RCU has tracing with the markers. This 
may not be the problem, but I just want to eliminate as many variables 
as possible.
Could you disable ftrace, but keep the markers on too.  Also, could you 
enable ftrace again and turn on the FTRACE_STARTUP_TEST.

Thanks,

-- Steve

> I have a powerpc env which closly resemble mpc8548cds_defconfig.
>
> I have produced the crash in seconds by issuing:
> > while [ 1 ] ; do find / > /dev/null ; echo .  ; done
>
> Liberty
>
> here are some of the kernel crashes:
> ---
> ~ # find /proc/ -name kft
> find: /proc/150/net/softnet_stat: No such file or direOops: Exception 
> in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in:
> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
> REGS: ddfc3d50 TRAP: 0700   Not tainted  (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 20002284  XER: 20000000
> TASK = ddcccde0[1699] 'find' THREAD: ddfc2000
> GPR00: 00000000 ddfc3e00 ddcccde0 dd895580 ddfc3e30 c00b5da0 ddfc3e90 
> 00000003
> GPR08: c00ece54 00000370 00145505 00000003 80002282 100ad874 10083ca0 
> 10083cd8
> GPR16: 10083cd0 00000008 00000000 c00b5da0 ddfc3f18 c00ece54 00000000 
> df465720
> GPR24: d7431900 ddfc3e30 dd895580 c0380000 dd895580 ddfc3e30 00000000 
> ddfc3e00
> NIP [c00bd6fc] d_lookup+0x40/0x90
> LR [c00bd6fc] d_lookup+0x40/0x90
> Call Trace:
> [ddfc3e00] [c006ade4] rcu_process_callbacks+0x48/0x60 (unreliable)
> [ddfc3e20] [c00eb514] proc_fill_cache+0x94/0x1b0
> [ddfc3e80] [c00ef720] proc_task_readdir+0x294/0x3c4
> [ddfc3ee0] [c00b60fc] vfs_readdir+0xb8/0xd8
> [ddfc3f10] [c00b619c] sys_getdents64+0x80/0x104
> [ddfc3f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
> ctory
> ---[ end trace 6b059a7f5ba5a55a ]---
>
> ---
>
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in: gplonly_api
> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
> REGS: df52dc40 TRAP: 0700   Not tainted  (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 20082422  XER: 20000000
> TASK = ddc2e060[2294] 'eeprom' THREAD: df52c000
> GPR00: 00000000 df52dcf0 ddc2e060 dd82c700 df52dd58 df52dd48 dd82c75e 
> 00000000
> GPR08: c0800000 000311f0 0000ffff df52dd10 20002488 1001e75c 100b0000 
> 1008e90c
> GPR16: bfd15a50 df52de6c df52dd58 fffffff4 c0380000 df52dd50 df52dd48 
> dd9303fc
> GPR24: dc3a6800 dd930390 df52dd58 c0380000 dd82c700 df52dd58 00000006 
> df52dcf0
> NIP [c00bd6fc] d_lookup+0x40/0x90
> LR [c00bd6fc] d_lookup+0x40/0x90
> Call Trace:
> [df52dcf0] [df52dd48] 0xdf52dd48 (unreliable)
> [df52dd10] [c00b07a0] do_lookup+0xe8/0x220
> [df52dd40] [c00b222c] __link_path_walk+0x174/0xd54
> [df52ddb0] [c00b2e64] path_walk+0x58/0xe0
> [df52dde0] [c00b2fd4] do_path_lookup+0x78/0x13c
> [df52de10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
> [df52de40] [c00b3fd8] path_lookup_open+0x28/0x40
> [df52de50] [c00b4234] do_filp_open+0xa0/0x798
> [df52df00] [c00a3b80] do_sys_open+0x70/0x114
> [df52df30] [c00a3c98] sys_open+0x38/0x50
> [df52df40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
> om_version(Canno---[ end trace 0cef3c8cb49d09de ]---
>
> ---
>
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
> load_local_fpga_freescale 8021q poe_fi
> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
> REGS: dd555c50 TRAP: 0700   Tainted: P           (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
> TASK = df4669a0[4976] 'find' THREAD: dd554000
> GPR00: 00000000 dd555d00 df4669a0 dd82ae00 dd555d68 dd555d58 dd82ae5b 
> 100234ec
> GPR08: c0800000 0002dd0c 0000ffff dd555d20 24000288 100ad874 100936f8 
> 1008a1d0
> GPR16: 10083f80 dd555e2c dd555d68 fffffff4 c0380000 dd555d60 dd555d58 
> dd9565d4
> GPR24: dc3db480 dd956568 dd555d68 c0380000 dd82ae00 dd555d68 00000000 
> dd555d00
> NIP [c00bd6fc] d_lookup+0x40/0x90
> LR [c00bd6fc] d_lookup+0x40/0x90
> Call Trace:
> [dd555d00] [dd555d58] 0xdd555d58 (unreliable)
> [dd555d20] [c00b07a0] do_lookup+0xe8/0x220
> [dd555d50] [c00b265c] __link_path_walk+0x5a4/0xd54
> [dd555dc0] [c00b2e64] path_walk+0x58/0xe0
> [dd555df0] [c00b2fd4] do_path_lookup+0x78/0x13c
> [dd555e20] [c00b3cd0] user_path_at+0x64/0xac
> [dd555e90] [c00aac04] vfs_lstat_fd+0x34/0x74
> [dd555ec0] [c00aacd8] vfs_lstat+0x30/0x48
> [dd555ed0] [c00aad20] sys_lstat64+0x30/0x5c
> [dd555f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
> 76/auxv
> /proc/4---[ end trace 0deef0827ce9df4b ]---
>
> ---
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
> load_local_fpga_freescale 8021q poe_fi
> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
> REGS: d57e7c40 TRAP: 0700   Tainted: P           (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 24482482  XER: 20000000
> TASK = db3cc940[2054] 'eapd' THREAD: d57e6000
> GPR00: 00000000 d57e7cf0 db3cc940 dd82a500 d57e7d58 d57e7d48 dd82a55a 
> 00000000
> GPR08: c0800000 00016c64 0000ffff d57e7d10 24422488 1008f700 4802cd38 
> 100763e8
> GPR16: 4801d818 d57e7e6c d57e7d58 fffffff4 c0380000 d57e7d50 d57e7d48 
> dd956c74
> GPR24: dc3db680 dd956c08 d57e7d58 c0380000 dd82a500 d57e7d58 00000000 
> d57e7cf0
> NIP [c00bd6fc] d_lookup+0x40/0x90
> LR [c00bd6fc] d_lookup+0x40/0x90
> Call Trace:
> [d57e7cf0] [d57e7d48] 0xd57e7d48 (unreliable)
> [d57e7d10] [c00b07a0] do_lookup+0xe8/0x220
> [d57e7d40] [c00b265c] __link_path_walk+0x5a4/0xd54
> [d57e7db0] [c00b2e64] path_walk+0x58/0xe0
> [d57e7de0] [c00b2fd4] do_path_lookup+0x78/0x13c
> [d57e7e10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
> [d57e7e40] [c00b3fd8] path_lookup_open+0x28/0x40
> [d57e7e50] [c00b4234] do_filp_open+0xa0/0x798
> [d57e7f00] [c00a3b80] do_sys_open+0x70/0x114
> [d57e7f30] [c00a3c98] sys_open+0x38/0x50
> [d57e7f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
> ---[ end trace ffb4a6e37d78370a ]---
>
> ---
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
> load_local_fpga_freescale 8021q poe_fi
> NIP: c00bd6fc LR: c00bd6fc CTR: c00bd6c8
> REGS: decd5cf0 TRAP: 0700   Tainted: P           (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 22228882  XER: 20000000
> TASK = d9f5a9a0[6897] 'sh' THREAD: decd4000
> GPR00: 00000000 decd5da0 d9f5a9a0 dd801180 decd5de8 00000000 00000004 
> fffffff9
> GPR08: fffffffa 00000000 00d5ca28 decd5d90 2ccc1686 100ad874 1fffb200 
> 00000000
> GPR16: 007fff00 decd5ec4 00000008 c02fe298 00200200 00000000 c031aa34 
> decd5de8
> GPR24: decd5df4 00001af2 dd507d00 c0380000 dd801180 decd5de8 00000000 
> decd5da0
> NIP [c00bd6fc] d_lookup+0x40/0x90
> LR [c00bd6fc] d_lookup+0x40/0x90
> Call Trace:
> [decd5dc0] [c00bd7d8] d_hash_and_lookup+0x8c/0xc4
> [decd5de0] [c00ed728] proc_flush_task+0xb4/0x268
> [decd5e40] [c0032894] release_task+0x6c/0x39c
> [decd5e70] [c0033238] wait_consider_task+0x674/0x920
> [decd5eb0] [c0033610] do_wait+0x12c/0x3d4
> [decd5f20] [c0033954] sys_wait4+0x9c/0xe8
> [decd5f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
> ---[ end trace 00106ff4ec3b9c22 ]---
>
>
>
>


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 15:07   ` Steven Rostedt
@ 2008-08-18 15:47     ` Mathieu Desnoyers
  2008-08-18 16:12       ` Steven Rostedt
  2008-08-18 17:21       ` Scott Wood
  2008-08-18 18:25     ` Eran Liberty
  2008-08-19  1:51     ` Benjamin Herrenschmidt
  2 siblings, 2 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-18 15:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eran Liberty, linux-kernel, Paul E. McKenney, linuxppc-dev, rostedt

* Steven Rostedt (srostedt@redhat.com) wrote:
> Eran Liberty wrote:
>> After compiling a kernel with ftrace I started to experience all sorts of 
>> crashes.
>
> Just to make sure...
>
> ftrace enables markers too, and RCU has tracing with the markers. This may 
> not be the problem, but I just want to eliminate as many variables as 
> possible.
> Could you disable ftrace, but keep the markers on too.  Also, could you 
> enable ftrace again and turn on the FTRACE_STARTUP_TEST.
>

Steve ? I just noticed this :


ftrace_modify_code(unsigned long ip, unsigned char *old_code,
                   unsigned char *new_code)
{
        unsigned replaced;
        unsigned old = *(unsigned *)old_code;
        unsigned new = *(unsigned *)new_code;
        int faulted = 0;

        /*
         * Note: Due to modules and __init, code can
         *  disappear and change, we need to protect against faulting
         *  as well as code changing.
         *
         * No real locking needed, this code is run through
         * kstop_machine.
         */
        asm volatile (
                "1: lwz         %1, 0(%2)\n"
                "   cmpw        %1, %5\n"
                "   bne         2f\n"
                "   stwu        %3, 0(%2)\n"
                "2:\n"
                ".section .fixup, \"ax\"\n"
                "3:     li %0, 1\n"
                "       b 2b\n"
                ".previous\n"
                ".section __ex_table,\"a\"\n"
                _ASM_ALIGN "\n"
                _ASM_PTR "1b, 3b\n"
                ".previous"
                : "=r"(faulted), "=r"(replaced)
                : "r"(ip), "r"(new),
                  "0"(faulted), "r"(old)
                : "memory");

        if (replaced != old && replaced != new)
                faulted = 2;

        if (!faulted)
                flush_icache_range(ip, ip + 8);

        return faulted;
}

What happens if you are really unlucky and get the following execution
sequence ?


Load module A at address 0xddfc3e00
Populate ftrace function table while the system runs
Unload module A
Load module B at address 0xddfc3e00
Activate ftrace
-> At that point, since there is another module loaded in the same
address space, it won't fault. If there happens to be an instruction
which looks exactly like the instruction we are replacing at the same
location, we are introducing a bug. True both on x86 ans powerpc...

Mathieu




> Thanks,
>
> -- Steve
>
>> I have a powerpc env which closly resemble mpc8548cds_defconfig.
>>
>> I have produced the crash in seconds by issuing:
>> > while [ 1 ] ; do find / > /dev/null ; echo .  ; done
>>
>> Liberty
>>
>> here are some of the kernel crashes:
>> ---
>> ~ # find /proc/ -name kft
>> find: /proc/150/net/softnet_stat: No such file or direOops: Exception in 
>> kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in:
>> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
>> REGS: ddfc3d50 TRAP: 0700   Not tainted  (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 20002284  XER: 20000000
>> TASK = ddcccde0[1699] 'find' THREAD: ddfc2000
>> GPR00: 00000000 ddfc3e00 ddcccde0 dd895580 ddfc3e30 c00b5da0 ddfc3e90 
>> 00000003
>> GPR08: c00ece54 00000370 00145505 00000003 80002282 100ad874 10083ca0 
>> 10083cd8
>> GPR16: 10083cd0 00000008 00000000 c00b5da0 ddfc3f18 c00ece54 00000000 
>> df465720
>> GPR24: d7431900 ddfc3e30 dd895580 c0380000 dd895580 ddfc3e30 00000000 
>> ddfc3e00
>> NIP [c00bd6fc] d_lookup+0x40/0x90
>> LR [c00bd6fc] d_lookup+0x40/0x90
>> Call Trace:
>> [ddfc3e00] [c006ade4] rcu_process_callbacks+0x48/0x60 (unreliable)
>> [ddfc3e20] [c00eb514] proc_fill_cache+0x94/0x1b0
>> [ddfc3e80] [c00ef720] proc_task_readdir+0x294/0x3c4
>> [ddfc3ee0] [c00b60fc] vfs_readdir+0xb8/0xd8
>> [ddfc3f10] [c00b619c] sys_getdents64+0x80/0x104
>> [ddfc3f40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
>> ctory
>> ---[ end trace 6b059a7f5ba5a55a ]---
>>
>> ---
>>
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in: gplonly_api
>> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
>> REGS: df52dc40 TRAP: 0700   Not tainted  (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 20082422  XER: 20000000
>> TASK = ddc2e060[2294] 'eeprom' THREAD: df52c000
>> GPR00: 00000000 df52dcf0 ddc2e060 dd82c700 df52dd58 df52dd48 dd82c75e 
>> 00000000
>> GPR08: c0800000 000311f0 0000ffff df52dd10 20002488 1001e75c 100b0000 
>> 1008e90c
>> GPR16: bfd15a50 df52de6c df52dd58 fffffff4 c0380000 df52dd50 df52dd48 
>> dd9303fc
>> GPR24: dc3a6800 dd930390 df52dd58 c0380000 dd82c700 df52dd58 00000006 
>> df52dcf0
>> NIP [c00bd6fc] d_lookup+0x40/0x90
>> LR [c00bd6fc] d_lookup+0x40/0x90
>> Call Trace:
>> [df52dcf0] [df52dd48] 0xdf52dd48 (unreliable)
>> [df52dd10] [c00b07a0] do_lookup+0xe8/0x220
>> [df52dd40] [c00b222c] __link_path_walk+0x174/0xd54
>> [df52ddb0] [c00b2e64] path_walk+0x58/0xe0
>> [df52dde0] [c00b2fd4] do_path_lookup+0x78/0x13c
>> [df52de10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
>> [df52de40] [c00b3fd8] path_lookup_open+0x28/0x40
>> [df52de50] [c00b4234] do_filp_open+0xa0/0x798
>> [df52df00] [c00a3b80] do_sys_open+0x70/0x114
>> [df52df30] [c00a3c98] sys_open+0x38/0x50
>> [df52df40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
>> om_version(Canno---[ end trace 0cef3c8cb49d09de ]---
>>
>> ---
>>
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
>> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
>> load_local_fpga_freescale 8021q poe_fi
>> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
>> REGS: dd555c50 TRAP: 0700   Tainted: P           (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
>> TASK = df4669a0[4976] 'find' THREAD: dd554000
>> GPR00: 00000000 dd555d00 df4669a0 dd82ae00 dd555d68 dd555d58 dd82ae5b 
>> 100234ec
>> GPR08: c0800000 0002dd0c 0000ffff dd555d20 24000288 100ad874 100936f8 
>> 1008a1d0
>> GPR16: 10083f80 dd555e2c dd555d68 fffffff4 c0380000 dd555d60 dd555d58 
>> dd9565d4
>> GPR24: dc3db480 dd956568 dd555d68 c0380000 dd82ae00 dd555d68 00000000 
>> dd555d00
>> NIP [c00bd6fc] d_lookup+0x40/0x90
>> LR [c00bd6fc] d_lookup+0x40/0x90
>> Call Trace:
>> [dd555d00] [dd555d58] 0xdd555d58 (unreliable)
>> [dd555d20] [c00b07a0] do_lookup+0xe8/0x220
>> [dd555d50] [c00b265c] __link_path_walk+0x5a4/0xd54
>> [dd555dc0] [c00b2e64] path_walk+0x58/0xe0
>> [dd555df0] [c00b2fd4] do_path_lookup+0x78/0x13c
>> [dd555e20] [c00b3cd0] user_path_at+0x64/0xac
>> [dd555e90] [c00aac04] vfs_lstat_fd+0x34/0x74
>> [dd555ec0] [c00aacd8] vfs_lstat+0x30/0x48
>> [dd555ed0] [c00aad20] sys_lstat64+0x30/0x5c
>> [dd555f40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
>> 76/auxv
>> /proc/4---[ end trace 0deef0827ce9df4b ]---
>>
>> ---
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
>> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
>> load_local_fpga_freescale 8021q poe_fi
>> NIP: c00bd6fc LR: c00bd6fc CTR: 00000000
>> REGS: d57e7c40 TRAP: 0700   Tainted: P           (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 24482482  XER: 20000000
>> TASK = db3cc940[2054] 'eapd' THREAD: d57e6000
>> GPR00: 00000000 d57e7cf0 db3cc940 dd82a500 d57e7d58 d57e7d48 dd82a55a 
>> 00000000
>> GPR08: c0800000 00016c64 0000ffff d57e7d10 24422488 1008f700 4802cd38 
>> 100763e8
>> GPR16: 4801d818 d57e7e6c d57e7d58 fffffff4 c0380000 d57e7d50 d57e7d48 
>> dd956c74
>> GPR24: dc3db680 dd956c08 d57e7d58 c0380000 dd82a500 d57e7d58 00000000 
>> d57e7cf0
>> NIP [c00bd6fc] d_lookup+0x40/0x90
>> LR [c00bd6fc] d_lookup+0x40/0x90
>> Call Trace:
>> [d57e7cf0] [d57e7d48] 0xd57e7d48 (unreliable)
>> [d57e7d10] [c00b07a0] do_lookup+0xe8/0x220
>> [d57e7d40] [c00b265c] __link_path_walk+0x5a4/0xd54
>> [d57e7db0] [c00b2e64] path_walk+0x58/0xe0
>> [d57e7de0] [c00b2fd4] do_path_lookup+0x78/0x13c
>> [d57e7e10] [c00b3f3c] __path_lookup_intent_open+0x68/0xdc
>> [d57e7e40] [c00b3fd8] path_lookup_open+0x28/0x40
>> [d57e7e50] [c00b4234] do_filp_open+0xa0/0x798
>> [d57e7f00] [c00a3b80] do_sys_open+0x70/0x114
>> [d57e7f30] [c00a3c98] sys_open+0x38/0x50
>> [d57e7f40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
>> ---[ end trace ffb4a6e37d78370a ]---
>>
>> ---
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in: bridge stp llc tun beacon_freescale(P) 80211(P) 
>> shm_freescale(P) ath_remote_regs_freescale(P) exsw 
>> load_local_fpga_freescale 8021q poe_fi
>> NIP: c00bd6fc LR: c00bd6fc CTR: c00bd6c8
>> REGS: decd5cf0 TRAP: 0700   Tainted: P           (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 22228882  XER: 20000000
>> TASK = d9f5a9a0[6897] 'sh' THREAD: decd4000
>> GPR00: 00000000 decd5da0 d9f5a9a0 dd801180 decd5de8 00000000 00000004 
>> fffffff9
>> GPR08: fffffffa 00000000 00d5ca28 decd5d90 2ccc1686 100ad874 1fffb200 
>> 00000000
>> GPR16: 007fff00 decd5ec4 00000008 c02fe298 00200200 00000000 c031aa34 
>> decd5de8
>> GPR24: decd5df4 00001af2 dd507d00 c0380000 dd801180 decd5de8 00000000 
>> decd5da0
>> NIP [c00bd6fc] d_lookup+0x40/0x90
>> LR [c00bd6fc] d_lookup+0x40/0x90
>> Call Trace:
>> [decd5dc0] [c00bd7d8] d_hash_and_lookup+0x8c/0xc4
>> [decd5de0] [c00ed728] proc_flush_task+0xb4/0x268
>> [decd5e40] [c0032894] release_task+0x6c/0x39c
>> [decd5e70] [c0033238] wait_consider_task+0x674/0x920
>> [decd5eb0] [c0033610] do_wait+0x12c/0x3d4
>> [decd5f20] [c0033954] sys_wait4+0x9c/0xe8
>> [decd5f40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db52a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b52a0
>> ---[ end trace 00106ff4ec3b9c22 ]---
>>
>>
>>
>>
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 15:47     ` Mathieu Desnoyers
@ 2008-08-18 16:12       ` Steven Rostedt
  2008-08-18 17:04         ` Mathieu Desnoyers
  2008-08-18 17:21       ` Scott Wood
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 16:12 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Eran Liberty, linux-kernel, Paul E. McKenney, linuxppc-dev, rostedt

Mathieu Desnoyers wrote:
>
> Steve ? I just noticed this :
>
>
> ftrace_modify_code(unsigned long ip, unsigned char *old_code,
>                    unsigned char *new_code)
> {
>         unsigned replaced;
>         unsigned old = *(unsigned *)old_code;
>         unsigned new = *(unsigned *)new_code;
>         int faulted = 0;
>
>         /*
>          * Note: Due to modules and __init, code can
>          *  disappear and change, we need to protect against faulting
>          *  as well as code changing.
>          *
>          * No real locking needed, this code is run through
>          * kstop_machine.
>          */
>         asm volatile (
>                 "1: lwz         %1, 0(%2)\n"
>                 "   cmpw        %1, %5\n"
>                 "   bne         2f\n"
>                 "   stwu        %3, 0(%2)\n"
>                 "2:\n"
>                 ".section .fixup, \"ax\"\n"
>                 "3:     li %0, 1\n"
>                 "       b 2b\n"
>                 ".previous\n"
>                 ".section __ex_table,\"a\"\n"
>                 _ASM_ALIGN "\n"
>                 _ASM_PTR "1b, 3b\n"
>                 ".previous"
>                 : "=r"(faulted), "=r"(replaced)
>                 : "r"(ip), "r"(new),
>                   "0"(faulted), "r"(old)
>                 : "memory");
>
>         if (replaced != old && replaced != new)
>                 faulted = 2;
>
>         if (!faulted)
>                 flush_icache_range(ip, ip + 8);
>
>         return faulted;
> }
>
> What happens if you are really unlucky and get the following execution
> sequence ?
>
>
> Load module A at address 0xddfc3e00
> Populate ftrace function table while the system runs
> Unload module A
> Load module B at address 0xddfc3e00
> Activate ftrace
> -> At that point, since there is another module loaded in the same
> address space, it won't fault. If there happens to be an instruction
> which looks exactly like the instruction we are replacing at the same
> location, we are introducing a bug. True both on x86 ans powerpc...
>
>   

Hi Mathieu,

Yep I know of this issue, and it is very unlikely it would happen.  But 
that's not good enough, so this is why I did this:

http://marc.info/?l=linux-kernel&m=121876928124384&w=2
http://marc.info/?l=linux-kernel&m=121876938524523&w=2

;-)

On powerpc it would be less likely an issue since the code segments are 
all 4 bytes aligned. And a call being replaced would be a call to mcount 
(relative jump). A call to mcount would be the same for both. Then 
again, we could be replacing a nop, but most likely that wouldn't hurt 
either, since nops are seldom used, and if we did call mcount, it would 
probably not hurt. But it would be an issue if Module B happen to put a 
data section that matched the code to jmp to mcount or a nop.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 16:12       ` Steven Rostedt
@ 2008-08-18 17:04         ` Mathieu Desnoyers
  0 siblings, 0 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-18 17:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eran Liberty, linux-kernel, Paul E. McKenney, linuxppc-dev, rostedt

* Steven Rostedt (srostedt@redhat.com) wrote:
> Mathieu Desnoyers wrote:
>>
>> Steve ? I just noticed this :
>>
>>
>> ftrace_modify_code(unsigned long ip, unsigned char *old_code,
>>                    unsigned char *new_code)
>> {
>>         unsigned replaced;
>>         unsigned old = *(unsigned *)old_code;
>>         unsigned new = *(unsigned *)new_code;
>>         int faulted = 0;
>>
>>         /*
>>          * Note: Due to modules and __init, code can
>>          *  disappear and change, we need to protect against faulting
>>          *  as well as code changing.
>>          *
>>          * No real locking needed, this code is run through
>>          * kstop_machine.
>>          */
>>         asm volatile (
>>                 "1: lwz         %1, 0(%2)\n"
>>                 "   cmpw        %1, %5\n"
>>                 "   bne         2f\n"
>>                 "   stwu        %3, 0(%2)\n"
>>                 "2:\n"
>>                 ".section .fixup, \"ax\"\n"
>>                 "3:     li %0, 1\n"
>>                 "       b 2b\n"
>>                 ".previous\n"
>>                 ".section __ex_table,\"a\"\n"
>>                 _ASM_ALIGN "\n"
>>                 _ASM_PTR "1b, 3b\n"
>>                 ".previous"
>>                 : "=r"(faulted), "=r"(replaced)
>>                 : "r"(ip), "r"(new),
>>                   "0"(faulted), "r"(old)
>>                 : "memory");
>>
>>         if (replaced != old && replaced != new)
>>                 faulted = 2;
>>
>>         if (!faulted)
>>                 flush_icache_range(ip, ip + 8);
>>
>>         return faulted;
>> }
>>
>> What happens if you are really unlucky and get the following execution
>> sequence ?
>>
>>
>> Load module A at address 0xddfc3e00
>> Populate ftrace function table while the system runs
>> Unload module A
>> Load module B at address 0xddfc3e00
>> Activate ftrace
>> -> At that point, since there is another module loaded in the same
>> address space, it won't fault. If there happens to be an instruction
>> which looks exactly like the instruction we are replacing at the same
>> location, we are introducing a bug. True both on x86 ans powerpc...
>>
>>   
>
> Hi Mathieu,
>
> Yep I know of this issue, and it is very unlikely it would happen.  But 
> that's not good enough, so this is why I did this:
>
> http://marc.info/?l=linux-kernel&m=121876928124384&w=2
> http://marc.info/?l=linux-kernel&m=121876938524523&w=2
>
> ;-)
>
> On powerpc it would be less likely an issue since the code segments are all 
> 4 bytes aligned. And a call being replaced would be a call to mcount 
> (relative jump). A call to mcount would be the same for both. Then again, 
> we could be replacing a nop, but most likely that wouldn't hurt either, 
> since nops are seldom used, and if we did call mcount, it would probably 
> not hurt. But it would be an issue if Module B happen to put a data section 
> that matched the code to jmp to mcount or a nop.
>

Ah, I did not see this one passing by :) That's the right solution
indeed. I guess it's not in 2.6.27-rc2/rc3 though ? I wonder if the
bug can be repeated with a module-free kernel ?

Mathieu

> -- Steve
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 15:47     ` Mathieu Desnoyers
  2008-08-18 16:12       ` Steven Rostedt
@ 2008-08-18 17:21       ` Scott Wood
  2008-08-18 18:27         ` Steven Rostedt
  2008-08-18 18:47         ` Steven Rostedt
  1 sibling, 2 replies; 81+ messages in thread
From: Scott Wood @ 2008-08-18 17:21 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Paul E. McKenney, Eran Liberty, linux-kernel,
	rostedt, linuxppc-dev

Mathieu Desnoyers wrote:
>         asm volatile (
>                 "1: lwz         %1, 0(%2)\n"
>                 "   cmpw        %1, %5\n"
>                 "   bne         2f\n"
>                 "   stwu        %3, 0(%2)\n"
>                 "2:\n"
>                 ".section .fixup, \"ax\"\n"
>                 "3:     li %0, 1\n"
>                 "       b 2b\n"
>                 ".previous\n"
>                 ".section __ex_table,\"a\"\n"
>                 _ASM_ALIGN "\n"
>                 _ASM_PTR "1b, 3b\n"
>                 ".previous"
>                 : "=r"(faulted), "=r"(replaced)
>                 : "r"(ip), "r"(new),
>                   "0"(faulted), "r"(old)
>                 : "memory");

Some (most likely unrelated) nits in the above inline asm:

Should use a "b" constraint for %2, or you could get r0.  Or, use an "m" 
constraint with %U2%X2 after the lwz/stw.  Why stwu with an offset of 
zero, BTW?

%1 also needs to be an early clobber.

-Scott

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 15:07   ` Steven Rostedt
  2008-08-18 15:47     ` Mathieu Desnoyers
@ 2008-08-18 18:25     ` Eran Liberty
  2008-08-18 18:41       ` Mathieu Desnoyers
  2008-08-18 18:50       ` Steven Rostedt
  2008-08-19  1:51     ` Benjamin Herrenschmidt
  2 siblings, 2 replies; 81+ messages in thread
From: Eran Liberty @ 2008-08-18 18:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Mathieu Desnoyers, Paul E. McKenney, linuxppc-dev, rostedt

Steven Rostedt wrote:
> Eran Liberty wrote:
>> After compiling a kernel with ftrace I started to experience all 
>> sorts of crashes.
>
> Just to make sure...
>
> ftrace enables markers too, and RCU has tracing with the markers. This 
> may not be the problem, but I just want to eliminate as many variables 
> as possible.
> Could you disable ftrace, but keep the markers on too.  Also, could 
> you enable ftrace again and turn on the FTRACE_STARTUP_TEST.

for the fun of it I took out all my propriety modules; so now its a non 
tainted kernel.

Here is the matrix:

!FTRACE x !MARKERS => stable
!FTRACE x MARKERS => stable
FTRACE x !MARKERS => n/a (FTRACE forces MARKERS)
FTRACE x MARKERS => unstable
FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed

Testing tracer sched_switch: PASSED
Testing tracer ftrace: PASSED
Testing dynamic ftrace: PASSED

Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in:
NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
TASK = ddcce060[1707] 'find' THREAD: dd5b0000
GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 
100234ec
GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 
1008a1d0
GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 
dd802084
GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 
dd5b1d00
NIP [c00bbb20] d_lookup+0x40/0x90
LR [c00bbb20] d_lookup+0x40/0x90
Call Trace:
[dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)
[dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220
[dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54
[dd5b1dc0] [c00b1288] path_walk+0x58/0xe0
[dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c
[dd5b1e20] [c00b20f4] user_path_at+0x64/0xac
[dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74
[dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48
[dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c
[dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
---[ end trace 1eb8fd7adac2bb65 ]---

Liberty



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 17:21       ` Scott Wood
@ 2008-08-18 18:27         ` Steven Rostedt
  2008-08-18 18:29           ` Scott Wood
  2008-08-19  1:53           ` Benjamin Herrenschmidt
  2008-08-18 18:47         ` Steven Rostedt
  1 sibling, 2 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 18:27 UTC (permalink / raw)
  To: Scott Wood
  Cc: Mathieu Desnoyers, Steven Rostedt, Paul E. McKenney,
	Eran Liberty, linux-kernel, linuxppc-dev



On Mon, 18 Aug 2008, Scott Wood wrote:

> Mathieu Desnoyers wrote:
> >         asm volatile (
> >                 "1: lwz         %1, 0(%2)\n"
> >                 "   cmpw        %1, %5\n"
> >                 "   bne         2f\n"
> >                 "   stwu        %3, 0(%2)\n"
> >                 "2:\n"
> >                 ".section .fixup, \"ax\"\n"
> >                 "3:     li %0, 1\n"
> >                 "       b 2b\n"
> >                 ".previous\n"
> >                 ".section __ex_table,\"a\"\n"
> >                 _ASM_ALIGN "\n"
> >                 _ASM_PTR "1b, 3b\n"
> >                 ".previous"
> >                 : "=r"(faulted), "=r"(replaced)
> >                 : "r"(ip), "r"(new),
> >                   "0"(faulted), "r"(old)
> >                 : "memory");
> 
> Some (most likely unrelated) nits in the above inline asm:

Thanks,

> 
> Should use a "b" constraint for %2, or you could get r0. 

I will make an updated patch.

> Or, use an "m"
> constraint with %U2%X2 after the lwz/stw. 

The 'b' seems easier ;-)

> Why stwu with an offset of zero,
> BTW?

Because it worked :-)  Really, it has been a while since I did any PPC 
assembly, and I didn't break out the old reference manuals to do this.
I simply looked at other asm code, and followed suit. I compiled and 
tested it, and it worked.  I did make a disclaimer about my rusty PPC
knowledge when I posted the code.

> 
> %1 also needs to be an early clobber.

Not exactly sure what you mean by the above.

-- Steve

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:27         ` Steven Rostedt
@ 2008-08-18 18:29           ` Scott Wood
  2008-08-19  1:53           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 81+ messages in thread
From: Scott Wood @ 2008-08-18 18:29 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, Steven Rostedt, Paul E. McKenney,
	Eran Liberty, linux-kernel, linuxppc-dev

Steven Rostedt wrote:
>> Should use a "b" constraint for %2, or you could get r0. 
> 
> I will make an updated patch.
> 
>> Or, use an "m"
>> constraint with %U2%X2 after the lwz/stw. 
> 
> The 'b' seems easier ;-)

The advantage of the latter is that it allows GCC to choose indexed or 
update instructions -- but that's merely an optimization.  Switching to 
"b" is enough to avoid the potential bug.

>> %1 also needs to be an early clobber.
> 
> Not exactly sure what you mean by the above.

%1 is written to before some inputs are consumed, so you need to use 
"=&r" rather than "=r" so that GCC won't use the same register for both.

-Scott

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:25     ` Eran Liberty
@ 2008-08-18 18:41       ` Mathieu Desnoyers
  2008-08-19  1:54         ` Benjamin Herrenschmidt
  2008-08-19  9:56         ` Eran Liberty
  2008-08-18 18:50       ` Steven Rostedt
  1 sibling, 2 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-18 18:41 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, linuxppc-dev, rostedt

* Eran Liberty (liberty@extricom.com) wrote:
> Steven Rostedt wrote:
>> Eran Liberty wrote:
>>> After compiling a kernel with ftrace I started to experience all sorts of 
>>> crashes.
>>
>> Just to make sure...
>>
>> ftrace enables markers too, and RCU has tracing with the markers. This may 
>> not be the problem, but I just want to eliminate as many variables as 
>> possible.
>> Could you disable ftrace, but keep the markers on too.  Also, could you 
>> enable ftrace again and turn on the FTRACE_STARTUP_TEST.
>
> for the fun of it I took out all my propriety modules; so now its a non 
> tainted kernel.
>
> Here is the matrix:
>
> !FTRACE x !MARKERS => stable
> !FTRACE x MARKERS => stable
> FTRACE x !MARKERS => n/a (FTRACE forces MARKERS)
> FTRACE x MARKERS => unstable
> FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed
>
> Testing tracer sched_switch: PASSED
> Testing tracer ftrace: PASSED
> Testing dynamic ftrace: PASSED
>
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in:
> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
> REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
> TASK = ddcce060[1707] 'find' THREAD: dd5b0000
> GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 
> 100234ec
> GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 
> 1008a1d0
> GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 
> dd802084
> GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 
> dd5b1d00
> NIP [c00bbb20] d_lookup+0x40/0x90
> LR [c00bbb20] d_lookup+0x40/0x90
> Call Trace:
> [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)

Can you check if, at some point during the system execution (starting
from boot), 0xdd5b1d58 is an address where a module is loaded ? (the
module can be later unloaded, what I wonder is if this address would
appear to have had a loaded+unloaded module).

Actually, could you try to compile your kernel without "MODULE_UNLOAD" ?

Mathieu

> [dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220
> [dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54
> [dd5b1dc0] [c00b1288] path_walk+0x58/0xe0
> [dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c
> [dd5b1e20] [c00b20f4] user_path_at+0x64/0xac
> [dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74
> [dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48
> [dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c
> [dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
> ---[ end trace 1eb8fd7adac2bb65 ]---
>
> Liberty
>
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 17:21       ` Scott Wood
  2008-08-18 18:27         ` Steven Rostedt
@ 2008-08-18 18:47         ` Steven Rostedt
  2008-08-18 18:56           ` Scott Wood
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 18:47 UTC (permalink / raw)
  To: Scott Wood
  Cc: Mathieu Desnoyers, Steven Rostedt, Paul E. McKenney,
	Eran Liberty, linux-kernel, linuxppc-dev


On Mon, 18 Aug 2008, Scott Wood wrote:

> Mathieu Desnoyers wrote:
> >         asm volatile (
> >                 "1: lwz         %1, 0(%2)\n"
> >                 "   cmpw        %1, %5\n"
> >                 "   bne         2f\n"
> >                 "   stwu        %3, 0(%2)\n"
> >                 "2:\n"
> >                 ".section .fixup, \"ax\"\n"
> >                 "3:     li %0, 1\n"
> >                 "       b 2b\n"
> >                 ".previous\n"
> >                 ".section __ex_table,\"a\"\n"
> >                 _ASM_ALIGN "\n"
> >                 _ASM_PTR "1b, 3b\n"
> >                 ".previous"
> >                 : "=r"(faulted), "=r"(replaced)
> >                 : "r"(ip), "r"(new),
> >                   "0"(faulted), "r"(old)
> >                 : "memory");
> 
> Some (most likely unrelated) nits in the above inline asm:
> 
> Should use a "b" constraint for %2, or you could get r0.  Or, use an "m"
> constraint with %U2%X2 after the lwz/stw. 

What syntax to do that with?

  lwz %1,0(%U2)
  stu %3, 0(%X2)

I'm new to those. (and the above does not compile)

> Why stwu with an offset of zero,

How else to do it?  stwu %3, (%2) does not compile for me.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:25     ` Eran Liberty
  2008-08-18 18:41       ` Mathieu Desnoyers
@ 2008-08-18 18:50       ` Steven Rostedt
  2008-08-19 12:09         ` Eran Liberty
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 18:50 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, linux-kernel, Mathieu Desnoyers,
	Paul E. McKenney, linuxppc-dev


On Mon, 18 Aug 2008, Eran Liberty wrote:

> Steven Rostedt wrote:
> > Eran Liberty wrote:
> > > After compiling a kernel with ftrace I started to experience all sorts of
> > > crashes.
> > 
> > Just to make sure...
> > 
> > ftrace enables markers too, and RCU has tracing with the markers. This may
> > not be the problem, but I just want to eliminate as many variables as
> > possible.
> > Could you disable ftrace, but keep the markers on too.  Also, could you
> > enable ftrace again and turn on the FTRACE_STARTUP_TEST.
> 
> for the fun of it I took out all my propriety modules; so now its a non
> tainted kernel.
> 
> Here is the matrix:
> 
> !FTRACE x !MARKERS => stable
> !FTRACE x MARKERS => stable
> FTRACE x !MARKERS => n/a (FTRACE forces MARKERS)
> FTRACE x MARKERS => unstable
> FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed

Thanks

> 
> Testing tracer sched_switch: PASSED
> Testing tracer ftrace: PASSED
> Testing dynamic ftrace: PASSED
> 
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in:
> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000

Could you load this into gdb for me and show me the output of:

gdb> li *0xc00bbb20

(Assuming you compiled with debuginfo on)

and...

gdb> disass 0xc00bbb20

> REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
> TASK = ddcce060[1707] 'find' THREAD: dd5b0000
> GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 100234ec
> GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 1008a1d0
> GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 dd802084
> GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 dd5b1d00
> NIP [c00bbb20] d_lookup+0x40/0x90
> LR [c00bbb20] d_lookup+0x40/0x90
> Call Trace:
> [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)
> [dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220
> [dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54
> [dd5b1dc0] [c00b1288] path_walk+0x58/0xe0
> [dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c
> [dd5b1e20] [c00b20f4] user_path_at+0x64/0xac
> [dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74
> [dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48
> [dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c
> [dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0

Ouch! we have a 00000000 instruction. I'm almost done with the new mcount 
record for PPC (I have it done for ppc64, I'm just porting it to 32). I'm 
thinking this new code may solve your issues too. I hate the daemon.

-- Steve

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:47         ` Steven Rostedt
@ 2008-08-18 18:56           ` Scott Wood
  2008-08-18 19:28             ` Steven Rostedt
  0 siblings, 1 reply; 81+ messages in thread
From: Scott Wood @ 2008-08-18 18:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Paul E. McKenney

Steven Rostedt wrote:
> What syntax to do that with?
> 
>   lwz %1,0(%U2)
>   stu %3, 0(%X2)
> 
> I'm new to those. (and the above does not compile)

lwz%U2%X2 %1, %2
stw%U2%X2 %3, %2

>> Why stwu with an offset of zero,
> 
> How else to do it?  stwu %3, (%2) does not compile for me.

stw %3, 0(%2)

The "u" tells it to write the effective address back to %2 -- but with 
an offset of zero, the effective address is unchanged.

-Scott

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:56           ` Scott Wood
@ 2008-08-18 19:28             ` Steven Rostedt
  0 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-18 19:28 UTC (permalink / raw)
  To: Scott Wood
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Paul E. McKenney



On Mon, 18 Aug 2008, Scott Wood wrote:

> Steven Rostedt wrote:
> > What syntax to do that with?
> > 
> >   lwz %1,0(%U2)
> >   stu %3, 0(%X2)
> > 
> > I'm new to those. (and the above does not compile)
> 
> lwz%U2%X2 %1, %2
> stw%U2%X2 %3, %2

Thanks, that's new to me.

> 
> > > Why stwu with an offset of zero,
> > 
> > How else to do it?  stwu %3, (%2) does not compile for me.
> 
> stw %3, 0(%2)
> 
> The "u" tells it to write the effective address back to %2 -- but with an
> offset of zero, the effective address is unchanged.

Ah! Right!

/me should open up his PowerPC ref books again :-p

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 15:07   ` Steven Rostedt
  2008-08-18 15:47     ` Mathieu Desnoyers
  2008-08-18 18:25     ` Eran Liberty
@ 2008-08-19  1:51     ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  1:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eran Liberty, linuxppc-dev, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, rostedt

On Mon, 2008-08-18 at 11:07 -0400, Steven Rostedt wrote:
> Eran Liberty wrote:
> > After compiling a kernel with ftrace I started to experience all sorts 
> > of crashes.
> 
> Just to make sure...
> 
> ftrace enables markers too, and RCU has tracing with the markers. This 
> may not be the problem, but I just want to eliminate as many variables 
> as possible.
> Could you disable ftrace, but keep the markers on too.  Also, could you 
> enable ftrace again and turn on the FTRACE_STARTUP_TEST.

I spent some time last week tracking one of those crashes and it appears
that we are getting corruption of some of the non-volatile registers.

So far, I found out that it -seems- to be coming from stack frame
corruption during a timer interrupt.

I haven't had a chance to dig further yet.

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:27         ` Steven Rostedt
  2008-08-18 18:29           ` Scott Wood
@ 2008-08-19  1:53           ` Benjamin Herrenschmidt
  2008-08-19  2:28             ` Steven Rostedt
  1 sibling, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  1:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

On Mon, 2008-08-18 at 14:27 -0400, Steven Rostedt wrote:
> 
> On Mon, 18 Aug 2008, Scott Wood wrote:
> 
> > Mathieu Desnoyers wrote:
> > >         asm volatile (
> > >                 "1: lwz         %1, 0(%2)\n"
> > >                 "   cmpw        %1, %5\n"
> > >                 "   bne         2f\n"
> > >                 "   stwu        %3, 0(%2)\n"
> > >                 "2:\n"
> > >                 ".section .fixup, \"ax\"\n"
> > >                 "3:     li %0, 1\n"
> > >                 "       b 2b\n"
> > >                 ".previous\n"
> > >                 ".section __ex_table,\"a\"\n"
> > >                 _ASM_ALIGN "\n"
> > >                 _ASM_PTR "1b, 3b\n"
> > >                 ".previous"
> > >                 : "=r"(faulted), "=r"(replaced)
> > >                 : "r"(ip), "r"(new),
> > >                   "0"(faulted), "r"(old)
> > >                 : "memory");
> > 
> > Some (most likely unrelated) nits in the above inline asm:


Why not use __get_user/__put_user ?

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:41       ` Mathieu Desnoyers
@ 2008-08-19  1:54         ` Benjamin Herrenschmidt
  2008-08-19  9:56         ` Eran Liberty
  1 sibling, 0 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  1:54 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Eran Liberty, linuxppc-dev, Steven Rostedt, Paul E. McKenney,
	linux-kernel, rostedt


> > Oops: Exception in kernel mode, sig: 11 [#1]
> > Exsw1600
> > Modules linked in:
> > NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
> > REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
> > MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
> > TASK = ddcce060[1707] 'find' THREAD: dd5b0000
> > GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 
> > 100234ec
> > GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 
> > 1008a1d0
> > GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 
> > dd802084
> > GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 
> > dd5b1d00
> > NIP [c00bbb20] d_lookup+0x40/0x90
> > LR [c00bbb20] d_lookup+0x40/0x90
> > Call Trace:
> > [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)
> 
> Can you check if, at some point during the system execution (starting
> from boot), 0xdd5b1d58 is an address where a module is loaded ? (the
> module can be later unloaded, what I wonder is if this address would
> appear to have had a loaded+unloaded module).
> 
> Actually, could you try to compile your kernel without "MODULE_UNLOAD" ?

I've had similar crashes (with similar backtraces in the VFS) on
machines with no modules, netbooted zImages.

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  1:53           ` Benjamin Herrenschmidt
@ 2008-08-19  2:28             ` Steven Rostedt
  2008-08-19  2:39               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19  2:28 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:

> On Mon, 2008-08-18 at 14:27 -0400, Steven Rostedt wrote:
> > 
> > On Mon, 18 Aug 2008, Scott Wood wrote:
> > 
> > > Mathieu Desnoyers wrote:
> > > >         asm volatile (
> > > >                 "1: lwz         %1, 0(%2)\n"
> > > >                 "   cmpw        %1, %5\n"
> > > >                 "   bne         2f\n"
> > > >                 "   stwu        %3, 0(%2)\n"
> > > >                 "2:\n"
> > > >                 ".section .fixup, \"ax\"\n"
> > > >                 "3:     li %0, 1\n"
> > > >                 "       b 2b\n"
> > > >                 ".previous\n"
> > > >                 ".section __ex_table,\"a\"\n"
> > > >                 _ASM_ALIGN "\n"
> > > >                 _ASM_PTR "1b, 3b\n"
> > > >                 ".previous"
> > > >                 : "=r"(faulted), "=r"(replaced)
> > > >                 : "r"(ip), "r"(new),
> > > >                   "0"(faulted), "r"(old)
> > > >                 : "memory");
> > > 
> > > Some (most likely unrelated) nits in the above inline asm:
> 
> 
> Why not use __get_user/__put_user ?

Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
is probably not needed since all of this is done in kstop_machine. Also, 
only the "get" is needed. If we don't fault there, we wont fault on the 
put (unless we have permissions wrong, and that would be a bug).

So are you recommending something like

        int cmd;

	if (__get_user(cmd, ip))
		goto fault;

	if (cmd != old)
		goto not_same;

	WARN_ON_ONCE(__put_user(cmd, ip));

If we did this, we could probably put this into the generic code:


	if (copy_from_user(cmd, ip, ARCH_CALL_SIZE))
		goto fault;

	if (memcmp(cmd, old, ARCH_CALL_SIZE) != 0)
		goto not_same;

	WARN_ON_ONCE(copy_to_user(cmd, ip, ARCH_CALL_SIZE));


-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:28             ` Steven Rostedt
@ 2008-08-19  2:39               ` Benjamin Herrenschmidt
  2008-08-19  2:41                 ` Steven Rostedt
  2008-08-19  2:56                 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  2:39 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


> Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> is probably not needed since all of this is done in kstop_machine. Also, 
> only the "get" is needed. If we don't fault there, we wont fault on the 
> put (unless we have permissions wrong, and that would be a bug).

Would it ? How do we make sure the kernel text is mapped writeable ?

> So are you recommending something like
> 
>         int cmd;
> 
> 	if (__get_user(cmd, ip))
> 		goto fault;
> 
> 	if (cmd != old)
> 		goto not_same;
> 
> 	WARN_ON_ONCE(__put_user(cmd, ip));
> 
> If we did this, we could probably put this into the generic code:

That would work I suppose, I'll give it a try.

> 	if (copy_from_user(cmd, ip, ARCH_CALL_SIZE))
> 		goto fault;
> 
> 	if (memcmp(cmd, old, ARCH_CALL_SIZE) != 0)
> 		goto not_same;
> 
> 	WARN_ON_ONCE(copy_to_user(cmd, ip, ARCH_CALL_SIZE));

You need the __ variants or the access_ok() checks will bite
you bad.

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:39               ` Benjamin Herrenschmidt
@ 2008-08-19  2:41                 ` Steven Rostedt
  2008-08-19  2:47                   ` Mathieu Desnoyers
  2008-08-19  2:56                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19  2:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:

> 
> > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > is probably not needed since all of this is done in kstop_machine. Also, 
> > only the "get" is needed. If we don't fault there, we wont fault on the 
> > put (unless we have permissions wrong, and that would be a bug).
> 
> Would it ? How do we make sure the kernel text is mapped writeable ?

We map it writeable if FTRACE is enabled.

> 
> > So are you recommending something like
> > 
> >         int cmd;
> > 
> > 	if (__get_user(cmd, ip))
> > 		goto fault;
> > 
> > 	if (cmd != old)
> > 		goto not_same;
> > 
> > 	WARN_ON_ONCE(__put_user(cmd, ip));
> > 
> > If we did this, we could probably put this into the generic code:
> 
> That would work I suppose, I'll give it a try.
> 
> > 	if (copy_from_user(cmd, ip, ARCH_CALL_SIZE))
> > 		goto fault;
> > 
> > 	if (memcmp(cmd, old, ARCH_CALL_SIZE) != 0)
> > 		goto not_same;
> > 
> > 	WARN_ON_ONCE(copy_to_user(cmd, ip, ARCH_CALL_SIZE));
> 
> You need the __ variants or the access_ok() checks will bite
> you bad.

Ah, good point ;-)

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:41                 ` Steven Rostedt
@ 2008-08-19  2:47                   ` Mathieu Desnoyers
  2008-08-19  3:32                     ` Steven Rostedt
  2008-08-19 16:47                     ` Steven Rostedt
  0 siblings, 2 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19  2:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > 
> > > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > > is probably not needed since all of this is done in kstop_machine. Also, 
> > > only the "get" is needed. If we don't fault there, we wont fault on the 
> > > put (unless we have permissions wrong, and that would be a bug).
> > 
> > Would it ? How do we make sure the kernel text is mapped writeable ?
> 
> We map it writeable if FTRACE is enabled.
> 

Argh. See text_poke(). It's there exactly for this purpose on x86.

Mathieu

> > 
> > > So are you recommending something like
> > > 
> > >         int cmd;
> > > 
> > > 	if (__get_user(cmd, ip))
> > > 		goto fault;
> > > 
> > > 	if (cmd != old)
> > > 		goto not_same;
> > > 
> > > 	WARN_ON_ONCE(__put_user(cmd, ip));
> > > 
> > > If we did this, we could probably put this into the generic code:
> > 
> > That would work I suppose, I'll give it a try.
> > 
> > > 	if (copy_from_user(cmd, ip, ARCH_CALL_SIZE))
> > > 		goto fault;
> > > 
> > > 	if (memcmp(cmd, old, ARCH_CALL_SIZE) != 0)
> > > 		goto not_same;
> > > 
> > > 	WARN_ON_ONCE(copy_to_user(cmd, ip, ARCH_CALL_SIZE));
> > 
> > You need the __ variants or the access_ok() checks will bite
> > you bad.
> 
> Ah, good point ;-)
> 
> -- Steve
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:39               ` Benjamin Herrenschmidt
  2008-08-19  2:41                 ` Steven Rostedt
@ 2008-08-19  2:56                 ` Benjamin Herrenschmidt
  2008-08-19  3:12                   ` Steven Rostedt
  1 sibling, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  2:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

Ok, so i did a patch, but it doesn't fix the problem. So
there's something else whacking on the stack frames.

Still, here it is:

powerpc/ftrace: Fix broken assembly for code replacement

Instead, uses __get_user() and __put_user().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

Index: linux-work/arch/powerpc/kernel/ftrace.c
===================================================================
--- linux-work.orig/arch/powerpc/kernel/ftrace.c	2008-08-19 12:45:49.000000000 +1000
+++ linux-work/arch/powerpc/kernel/ftrace.c	2008-08-19 12:52:51.000000000 +1000
@@ -16,7 +16,7 @@
 
 #include <asm/cacheflush.h>
 #include <asm/ftrace.h>
-
+#include <asm/uaccess.h>
 
 static unsigned int ftrace_nop = 0x60000000;
 
@@ -72,10 +72,9 @@ notrace int
 ftrace_modify_code(unsigned long ip, unsigned char *old_code,
 		   unsigned char *new_code)
 {
-	unsigned replaced;
-	unsigned old = *(unsigned *)old_code;
-	unsigned new = *(unsigned *)new_code;
-	int faulted = 0;
+	unsigned int old = *(unsigned int *)old_code;
+	unsigned int new = *(unsigned int *)new_code;
+	unsigned int instr;
 
 	/*
 	 * Note: Due to modules and __init, code can
@@ -85,32 +84,13 @@ ftrace_modify_code(unsigned long ip, uns
 	 * No real locking needed, this code is run through
 	 * kstop_machine.
 	 */
-	asm volatile (
-		"1: lwz		%1, 0(%2)\n"
-		"   cmpw	%1, %5\n"
-		"   bne		2f\n"
-		"   stwu	%3, 0(%2)\n"
-		"2:\n"
-		".section .fixup, \"ax\"\n"
-		"3:	li %0, 1\n"
-		"	b 2b\n"
-		".previous\n"
-		".section __ex_table,\"a\"\n"
-		_ASM_ALIGN "\n"
-		_ASM_PTR "1b, 3b\n"
-		".previous"
-		: "=r"(faulted), "=r"(replaced)
-		: "r"(ip), "r"(new),
-		  "0"(faulted), "r"(old)
-		: "memory");
-
-	if (replaced != old && replaced != new)
-		faulted = 2;
-
-	if (!faulted)
-		flush_icache_range(ip, ip + 8);
-
-	return faulted;
+	if (__get_user(instr, (unsigned int __user *)ip))
+		return 1;
+	if (instr != old && instr != new)
+		return 2;
+	WARN_ON_ONCE(__put_user(new, (unsigned int __user *)ip));
+	flush_icache_range(ip, ip + 8);
+	return 0;
 }
 
 notrace int ftrace_update_ftrace_func(ftrace_func_t func)



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:56                 ` Benjamin Herrenschmidt
@ 2008-08-19  3:12                   ` Steven Rostedt
  2008-08-19  4:17                     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19  3:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:

> Ok, so i did a patch, but it doesn't fix the problem. So
> there's something else whacking on the stack frames.

Is it 32 bit or 64?  I've tested the 64 bit quite a bit, but not so much 
the 32 (my powerbook is not usually that stable).

You might want to look at the entry.S mcount code too.

> 
> Still, here it is:

Thanks,

> 
> powerpc/ftrace: Fix broken assembly for code replacement
> 
> Instead, uses __get_user() and __put_user().
> 

Acked-by: Steven Rostedt <srostedt@redhat.com>

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:47                   ` Mathieu Desnoyers
@ 2008-08-19  3:32                     ` Steven Rostedt
  2008-08-19  3:36                       ` Mathieu Desnoyers
  2008-08-19 16:47                     ` Steven Rostedt
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19  3:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Mon, 18 Aug 2008, Mathieu Desnoyers wrote:

> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > 
> > On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> > 
> > > 
> > > > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > > > is probably not needed since all of this is done in kstop_machine. Also, 
> > > > only the "get" is needed. If we don't fault there, we wont fault on the 
> > > > put (unless we have permissions wrong, and that would be a bug).
> > > 
> > > Would it ? How do we make sure the kernel text is mapped writeable ?
> > 
> > We map it writeable if FTRACE is enabled.
> > 
> 
> Argh. See text_poke(). It's there exactly for this purpose on x86.

Ouch, I just did. text_poke is quite heavy. It would be interesting to see 
that performed on 20,000 locations at one time. I could play with it, but 
I'm a bit nervous.

-- Steve

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  3:32                     ` Steven Rostedt
@ 2008-08-19  3:36                       ` Mathieu Desnoyers
  2008-08-19  4:00                         ` Steven Rostedt
  0 siblings, 1 reply; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19  3:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> On Mon, 18 Aug 2008, Mathieu Desnoyers wrote:
> 
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > 
> > > On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> > > 
> > > > 
> > > > > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > > > > is probably not needed since all of this is done in kstop_machine. Also, 
> > > > > only the "get" is needed. If we don't fault there, we wont fault on the 
> > > > > put (unless we have permissions wrong, and that would be a bug).
> > > > 
> > > > Would it ? How do we make sure the kernel text is mapped writeable ?
> > > 
> > > We map it writeable if FTRACE is enabled.
> > > 
> > 
> > Argh. See text_poke(). It's there exactly for this purpose on x86.
> 
> Ouch, I just did. text_poke is quite heavy. It would be interesting to see 
> that performed on 20,000 locations at one time. I could play with it, but 
> I'm a bit nervous.
> 

It's alread used to modify the LOCK prefixes in alternative.c and did
not seem to be too slow for that.. it should therefore be ok.

Mathieu

> -- Steve

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  3:36                       ` Mathieu Desnoyers
@ 2008-08-19  4:00                         ` Steven Rostedt
  0 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19  4:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Mon, 18 Aug 2008, Mathieu Desnoyers wrote:
> > > > 
> > > 
> > > Argh. See text_poke(). It's there exactly for this purpose on x86.
> > 
> > Ouch, I just did. text_poke is quite heavy. It would be interesting to see 
> > that performed on 20,000 locations at one time. I could play with it, but 
> > I'm a bit nervous.
> > 
> 
> It's alread used to modify the LOCK prefixes in alternative.c and did
> not seem to be too slow for that.. it should therefore be ok.

There's a lot more functions than locks ;-)

Anyway, this is only needed after system boot up. But then we do it ever 
time we start or stop tracing. I'll try it out and see if it is noticable.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  3:12                   ` Steven Rostedt
@ 2008-08-19  4:17                     ` Benjamin Herrenschmidt
  2008-08-20  7:18                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19  4:17 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Scott Wood, Eran Liberty, Mathieu Desnoyers, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

On Mon, 2008-08-18 at 23:12 -0400, Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > Ok, so i did a patch, but it doesn't fix the problem. So
> > there's something else whacking on the stack frames.
> 
> Is it 32 bit or 64?  I've tested the 64 bit quite a bit, but not so much 
> the 32 (my powerbook is not usually that stable).

32-bit. Reproduced on powerbook by a user, 440, 750, ...

> You might want to look at the entry.S mcount code too.

Yeah, I had a look, nothing obviously bad there, but I may just
have missed it :-)

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:41       ` Mathieu Desnoyers
  2008-08-19  1:54         ` Benjamin Herrenschmidt
@ 2008-08-19  9:56         ` Eran Liberty
  2008-08-19 13:02           ` Mathieu Desnoyers
  1 sibling, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-19  9:56 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, linuxppc-dev, rostedt

Mathieu Desnoyers wrote:
> Can you check if, at some point during the system execution (starting
> from boot), 0xdd5b1d58 is an address where a module is loaded ? (the
> module can be later unloaded, what I wonder is if this address would
> appear to have had a loaded+unloaded module).
>
> Actually, could you try to compile your kernel without "MODULE_UNLOAD" ?
>
> Mathieu
>   
No modules...

~ # uname -a
Linux 2.6.27-rc2 #6 Tue Aug 19 12:33:34 IDT 2008 ppc unknown
~ # gunzip /proc/config.gz -c | grep UNLOAD
# CONFIG_MODULE_UNLOAD is not set
~ # gunzip /proc/config.gz -c | grep FTRACE
CONFIG_HAVE_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
~ # gunzip /proc/config.gz -c | grep MARKERS
CONFIG_MARKERS=y
~ # lsmod
Module                  Size  Used by
~ # while [ 1 ] ; do find / > /dev/null ; echo .  ; done
.
Oops: Exception in kernel mode, sig: 11 [#1]
Exsw1600
Modules linked in:
NIP: c00bb724 LR: c00bb724 CTR: 00000000
REGS: d8b59c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
TASK = dbc68de0[1712] 'find' THREAD: d8b58000
GPR00: 00000000 d8b59d00 dbc68de0 dd801180 d8b59d68 d8b59d58 dd8019db 
100234ec
GPR08: c0800000 00019330 0000ffff d8b59d20 24000288 100ad874 100936f8 
1008a1d0
GPR16: 10083f80 d8b59e2c d8b59d68 fffffff4 c0380000 d8b59d60 d8b59d58 
dd802084
GPR24: ddc69500 dd802018 d8b59d68 c0380000 dd801180 d8b59d68 00000000 
d8b59d00
NIP [c00bb724] d_lookup+0x40/0x90
LR [c00bb724] d_lookup+0x40/0x90
Call Trace:
[d8b59d00] [d8b59d58] 0xd8b59d58 (unreliable)
[d8b59d20] [c00ae7c8] do_lookup+0xe8/0x220
[d8b59d50] [c00b0684] __link_path_walk+0x5a4/0xd54
[d8b59dc0] [c00b0e8c] path_walk+0x58/0xe0
[d8b59df0] [c00b0ffc] do_path_lookup+0x78/0x13c
[d8b59e20] [c00b1cf8] user_path_at+0x64/0xac
[d8b59e90] [c00a8c98] vfs_lstat_fd+0x34/0x74
[d8b59ec0] [c00a8d6c] vfs_lstat+0x30/0x48
[d8b59ed0] [c00a8db4] sys_lstat64+0x30/0x5c
[d8b59f40] [c0010554] ret_from_syscall+0x0/0x3c
Instruction dump:
7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
---[ end trace 7766edd310cd3442 ]---


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-18 18:50       ` Steven Rostedt
@ 2008-08-19 12:09         ` Eran Liberty
  2008-08-19 13:05           ` Mathieu Desnoyers
  2008-08-19 20:15           ` Steven Rostedt
  0 siblings, 2 replies; 81+ messages in thread
From: Eran Liberty @ 2008-08-19 12:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Steven Rostedt, linux-kernel, Mathieu Desnoyers,
	Paul E. McKenney, linuxppc-dev

Steven Rostedt wrote:
> On Mon, 18 Aug 2008, Eran Liberty wrote:
>
>   
>> Steven Rostedt wrote:
>>     
>>> Eran Liberty wrote:
>>>       
>>>> After compiling a kernel with ftrace I started to experience all sorts of
>>>> crashes.
>>>>         
>>> Just to make sure...
>>>
>>> ftrace enables markers too, and RCU has tracing with the markers. This may
>>> not be the problem, but I just want to eliminate as many variables as
>>> possible.
>>> Could you disable ftrace, but keep the markers on too.  Also, could you
>>> enable ftrace again and turn on the FTRACE_STARTUP_TEST.
>>>       
>> for the fun of it I took out all my propriety modules; so now its a non
>> tainted kernel.
>>
>> Here is the matrix:
>>
>> !FTRACE x !MARKERS => stable
>> !FTRACE x MARKERS => stable
>> FTRACE x !MARKERS => n/a (FTRACE forces MARKERS)
>> FTRACE x MARKERS => unstable
>> FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed
>>     
>
> Thanks
>
>   
>> Testing tracer sched_switch: PASSED
>> Testing tracer ftrace: PASSED
>> Testing dynamic ftrace: PASSED
>>
>> Oops: Exception in kernel mode, sig: 11 [#1]
>> Exsw1600
>> Modules linked in:
>> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
>>     
>
> Could you load this into gdb for me and show me the output of:
>
> gdb> li *0xc00bbb20
>
> (Assuming you compiled with debuginfo on)
>
> and...
>
> gdb> disass 0xc00bbb20
>
>   
>> REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
>> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
>> TASK = ddcce060[1707] 'find' THREAD: dd5b0000
>> GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 100234ec
>> GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 1008a1d0
>> GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 dd802084
>> GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 dd5b1d00
>> NIP [c00bbb20] d_lookup+0x40/0x90
>> LR [c00bbb20] d_lookup+0x40/0x90
>> Call Trace:
>> [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)
>> [dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220
>> [dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54
>> [dd5b1dc0] [c00b1288] path_walk+0x58/0xe0
>> [dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c
>> [dd5b1e20] [c00b20f4] user_path_at+0x64/0xac
>> [dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74
>> [dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48
>> [dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c
>> [dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c
>> Instruction dump:
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
>>     
>
> Ouch! we have a 00000000 instruction. I'm almost done with the new mcount 
> record for PPC (I have it done for ppc64, I'm just porting it to 32). I'm 
> thinking this new code may solve your issues too. I hate the daemon.
>
> -- Steve
>
>
>   

I have attached ddd. Up on crashing it points on this line

struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
{
        struct dentry * dentry = NULL;
        unsigned long seq;
   
        do {
                seq = read_seqbegin(&rename_lock);
                dentry = __d_lookup(parent, name);     <==== ddd marker
                if (dentry)
                        break;
        } while (read_seqretry(&rename_lock, seq));
        return dentry;
}

(gdb) bt
#0  d_lookup (parent=0xdd801180, name=0xd99b3d68) at fs/dcache.c:1361
#1  0xc00ae7c8 in real_lookup (nd=<value optimized out>, name=<value 
optimized out>, parent=0xdd801180) at fs/namei.c:507
#2  do_lookup (nd=0xd99b3e28, name=0xd99b3d68, path=0xd99b3d58) at 
fs/namei.c:825
#3  0xc00b0684 in __link_path_walk (name=0xddc23009 "", nd=0xd99b3e28) 
at fs/namei.c:982
#4  0xc00b0e8c in link_path_walk (nd=<value optimized out>, name=<value 
optimized out>) at fs/namei.c:570
#5  path_walk (name=0xddc23000 "/proc/mtd", nd=0xd99b3e28) at 
fs/namei.c:1041
#6  0xc00b0ffc in do_path_lookup (dfd=<value optimized out>, 
name=0xddc23000 "/proc/mtd", flags=<value optimized out>, nd=0xd99b3e28) 
at fs/namei.c:1091
#7  0xc00b1cf8 in user_path_at (dfd=-100, name=<value optimized out>, 
flags=0, path=0xd99b3e98) at fs/namei.c:1334
#8  0xc00a8c98 in vfs_lstat_fd (dfd=-578809472, name=0xd99b3d68 "", 
stat=0xd99b3ed8) at fs/stat.c:83
#9  0xc00a8d6c in vfs_lstat (name=0xd99b3d68 "", stat=0xd99b3d58) at 
fs/stat.c:93
#10 0xc00a8db4 in sys_lstat64 (filename=0xdd801180 "", 
statbuf=0xbfe285d8) at fs/stat.c:381
#11 0xc0010554 in syscall_dotrace_cont ()

both cp & lr points to 0xC00BB724
(gdb) info registers
r0             0x0    0
r1             0xd99b3d00    3650829568
r2             0xddccf2e0    3721196256
r3             0xdd801180    3716157824
r4             0xd99b3d68    3650829672
r5             0xd99b3d58    3650829656
r6             0xdd822a5b    3716295259
r7             0x100234ec    268580076
r8             0xc0800000    3229614080
r9             0x19330    103216
r10            0xffff    65535
r11            0xd99b3d20    3650829600
r12            0x24000288    603980424
r13            0x100ad874    269146228
r14            0x100936f8    269039352
r15            0x1008a1d0    269001168
r16            0x10083f80    268976000
r17            0xd99b3e2c    3650829868
r18            0xd99b3d68    3650829672
r19            0xfffffff4    4294967284
r20            0xc0380000    3224895488
r21            0xd99b3d60    3650829664
r22            0xd99b3d58    3650829656
r23            0xdd802084    3716161668
r24            0xdc3fb280    3695161984
r25            0xdd802018    3716161560
r26            0xd99b3d68    3650829672
r27            0xc0380000    3224895488
r28            0xdd801180    3716157824
r29            0xd99b3d68    3650829672
r30            0x0    0
r31            0xd99b3d00    3650829568
pc             0xc00bb724    3221993252
cr             0x24082282    604512898
lr             0xc00bb724    3221993252
ctr            0x0    0
xer            0x20000000    536870912

(gdb) li *0xC00BB724
Line 1361 of "fs/dcache.c" starts at address 0xc00bb724 <d_lookup+64> 
and ends at 0xc00bb728 <d_lookup+68>.

(gdb) disassemble 0xC00BB724
Dump of assembler code for function d_lookup:
0xc00bb6e4 <d_lookup+0>:    mflr    r0
0xc00bb6e8 <d_lookup+4>:    stw     r0,4(r1)
0xc00bb6ec <d_lookup+8>:    nop
0xc00bb6f0 <d_lookup+12>:    stwu    r1,-32(r1)
0xc00bb6f4 <d_lookup+16>:    mflr    r0
0xc00bb6f8 <d_lookup+20>:    stmw    r27,12(r1)
0xc00bb6fc <d_lookup+24>:    lis     r27,-16328
0xc00bb700 <d_lookup+28>:    mr      r31,r1
0xc00bb704 <d_lookup+32>:    stw     r0,36(r1)
0xc00bb708 <d_lookup+36>:    mr      r28,r3
0xc00bb70c <d_lookup+40>:    mr      r29,r4
0xc00bb710 <d_lookup+44>:    lwz     r30,12960(r27)
0xc00bb714 <d_lookup+48>:    andi.   r0,r30,1
0xc00bb718 <d_lookup+52>:    mr      r3,r28
0xc00bb71c <d_lookup+56>:    mr      r4,r29
0xc00bb720 <d_lookup+60>:    bnela-  0x2c
0xc00bb724 <d_lookup+64>:    .long 0x0
0xc00bb728 <d_lookup+68>:    cmpwi   cr7,r3,0
0xc00bb72c <d_lookup+72>:    bne-    cr7,0xc00bb75c <d_lookup+120>
0xc00bb730 <d_lookup+76>:    lwz     r0,12960(r27)
0xc00bb734 <d_lookup+80>:    cmpw    cr7,r0,r30
0xc00bb738 <d_lookup+84>:    beq-    cr7,0xc00bb75c <d_lookup+120>
0xc00bb73c <d_lookup+88>:    mr      r30,r0
0xc00bb740 <d_lookup+92>:    andi.   r0,r30,1
0xc00bb744 <d_lookup+96>:    mr      r3,r28
0xc00bb748 <d_lookup+100>:    mr      r4,r29
0xc00bb74c <d_lookup+104>:    beq+    0xc00bb724 <d_lookup+64>
0xc00bb750 <d_lookup+108>:    lwz     r0,12960(r27)
0xc00bb754 <d_lookup+112>:    mr      r30,r0
0xc00bb758 <d_lookup+116>:    b       0xc00bb740 <d_lookup+92>
0xc00bb75c <d_lookup+120>:    lwz     r11,0(r1)
0xc00bb760 <d_lookup+124>:    lwz     r0,4(r11)
0xc00bb764 <d_lookup+128>:    lmw     r27,-20(r11)
0xc00bb768 <d_lookup+132>:    mtlr    r0
0xc00bb76c <d_lookup+136>:    mr      r1,r11
0xc00bb770 <d_lookup+140>:    blr
End of assembler dump.
(gdb)

Hope this is helpfull

Liberty




^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  9:56         ` Eran Liberty
@ 2008-08-19 13:02           ` Mathieu Desnoyers
  2008-08-19 21:46             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19 13:02 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, linux-kernel, Paul E. McKenney, linuxppc-dev,
	rostedt, Benjamin Herrenschmidt

* Eran Liberty (liberty@extricom.com) wrote:
> Mathieu Desnoyers wrote:
>> Can you check if, at some point during the system execution (starting
>> from boot), 0xdd5b1d58 is an address where a module is loaded ? (the
>> module can be later unloaded, what I wonder is if this address would
>> appear to have had a loaded+unloaded module).
>>
>> Actually, could you try to compile your kernel without "MODULE_UNLOAD" ?
>>
>> Mathieu
>>   
> No modules...
>
> ~ # uname -a
> Linux 2.6.27-rc2 #6 Tue Aug 19 12:33:34 IDT 2008 ppc unknown
> ~ # gunzip /proc/config.gz -c | grep UNLOAD
> # CONFIG_MODULE_UNLOAD is not set
> ~ # gunzip /proc/config.gz -c | grep FTRACE
> CONFIG_HAVE_FTRACE=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_FTRACE=y
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_SELFTEST=y
> CONFIG_FTRACE_STARTUP_TEST=y
> ~ # gunzip /proc/config.gz -c | grep MARKERS
> CONFIG_MARKERS=y
> ~ # lsmod
> Module                  Size  Used by
> ~ # while [ 1 ] ; do find / > /dev/null ; echo .  ; done
> .
> Oops: Exception in kernel mode, sig: 11 [#1]
> Exsw1600
> Modules linked in:
> NIP: c00bb724 LR: c00bb724 CTR: 00000000
> REGS: d8b59c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
> TASK = dbc68de0[1712] 'find' THREAD: d8b58000
> GPR00: 00000000 d8b59d00 dbc68de0 dd801180 d8b59d68 d8b59d58 dd8019db 
> 100234ec
> GPR08: c0800000 00019330 0000ffff d8b59d20 24000288 100ad874 100936f8 
> 1008a1d0
> GPR16: 10083f80 d8b59e2c d8b59d68 fffffff4 c0380000 d8b59d60 d8b59d58 
> dd802084

Register GPR30 is always zero... (also true for the other oops)

> GPR24: ddc69500 dd802018 d8b59d68 c0380000 dd801180 d8b59d68 00000000 
> d8b59d00
> NIP [c00bb724] d_lookup+0x40/0x90
> LR [c00bb724] d_lookup+0x40/0x90
> Call Trace:
> [d8b59d00] [d8b59d58] 0xd8b59d58 (unreliable)
> [d8b59d20] [c00ae7c8] do_lookup+0xe8/0x220
> [d8b59d50] [c00b0684] __link_path_walk+0x5a4/0xd54
> [d8b59dc0] [c00b0e8c] path_walk+0x58/0xe0
> [d8b59df0] [c00b0ffc] do_path_lookup+0x78/0x13c
> [d8b59e20] [c00b1cf8] user_path_at+0x64/0xac
> [d8b59e90] [c00a8c98] vfs_lstat_fd+0x34/0x74
> [d8b59ec0] [c00a8d6c] vfs_lstat+0x30/0x48
> [d8b59ed0] [c00a8db4] sys_lstat64+0x30/0x5c
> [d8b59f40] [c0010554] ret_from_syscall+0x0/0x3c
> Instruction dump:
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0

Just like this instruction... maybe related ?

Mathieu

> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
> ---[ end trace 7766edd310cd3442 ]---
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 12:09         ` Eran Liberty
@ 2008-08-19 13:05           ` Mathieu Desnoyers
  2008-08-19 14:21             ` Eran Liberty
  2008-08-19 20:15           ` Steven Rostedt
  1 sibling, 1 reply; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19 13:05 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, Steven Rostedt, linux-kernel, Paul E. McKenney,
	linuxppc-dev

* Eran Liberty (liberty@extricom.com) wrote:
> Steven Rostedt wrote:
>> On Mon, 18 Aug 2008, Eran Liberty wrote:
>>
>>   
>>> Steven Rostedt wrote:
>>>     
>>>> Eran Liberty wrote:
>>>>       
>>>>> After compiling a kernel with ftrace I started to experience all sorts 
>>>>> of
>>>>> crashes.
>>>>>         
>>>> Just to make sure...
>>>>
>>>> ftrace enables markers too, and RCU has tracing with the markers. This 
>>>> may
>>>> not be the problem, but I just want to eliminate as many variables as
>>>> possible.
>>>> Could you disable ftrace, but keep the markers on too.  Also, could you
>>>> enable ftrace again and turn on the FTRACE_STARTUP_TEST.
>>>>       
>>> for the fun of it I took out all my propriety modules; so now its a non
>>> tainted kernel.
>>>
>>> Here is the matrix:
>>>
>>> !FTRACE x !MARKERS => stable
>>> !FTRACE x MARKERS => stable
>>> FTRACE x !MARKERS => n/a (FTRACE forces MARKERS)
>>> FTRACE x MARKERS => unstable
>>> FTRACE x FTRACE_STARTUP_TEST x MARKERS => unstable + tests passed
>>>     
>>
>> Thanks
>>
>>   
>>> Testing tracer sched_switch: PASSED
>>> Testing tracer ftrace: PASSED
>>> Testing dynamic ftrace: PASSED
>>>
>>> Oops: Exception in kernel mode, sig: 11 [#1]
>>> Exsw1600
>>> Modules linked in:
>>> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
>>>     
>>
>> Could you load this into gdb for me and show me the output of:
>>
>> gdb> li *0xc00bbb20
>>
>> (Assuming you compiled with debuginfo on)
>>
>> and...
>>
>> gdb> disass 0xc00bbb20
>>
>>   
>>> REGS: dd5b1c50 TRAP: 0700   Not tainted  (2.6.27-rc2)
>>> MSR: 00029000 <EE,ME>  CR: 24082282  XER: 20000000
>>> TASK = ddcce060[1707] 'find' THREAD: dd5b0000
>>> GPR00: 00000000 dd5b1d00 ddcce060 dd801180 dd5b1d68 dd5b1d58 dd80125b 
>>> 100234ec
>>> GPR08: c0800000 00019330 0000ffff dd5b1d20 24000288 100ad874 100936f8 
>>> 1008a1d0
>>> GPR16: 10083f80 dd5b1e2c dd5b1d68 fffffff4 c0380000 dd5b1d60 dd5b1d58 
>>> dd802084
>>> GPR24: dc3d7700 dd802018 dd5b1d68 c0380000 dd801180 dd5b1d68 00000000 
>>> dd5b1d00
>>> NIP [c00bbb20] d_lookup+0x40/0x90
>>> LR [c00bbb20] d_lookup+0x40/0x90
>>> Call Trace:
>>> [dd5b1d00] [dd5b1d58] 0xdd5b1d58 (unreliable)
>>> [dd5b1d20] [c00aebc4] do_lookup+0xe8/0x220
>>> [dd5b1d50] [c00b0a80] __link_path_walk+0x5a4/0xd54
>>> [dd5b1dc0] [c00b1288] path_walk+0x58/0xe0
>>> [dd5b1df0] [c00b13f8] do_path_lookup+0x78/0x13c
>>> [dd5b1e20] [c00b20f4] user_path_at+0x64/0xac
>>> [dd5b1e90] [c00a9028] vfs_lstat_fd+0x34/0x74
>>> [dd5b1ec0] [c00a90fc] vfs_lstat+0x30/0x48
>>> [dd5b1ed0] [c00a9144] sys_lstat64+0x30/0x5c
>>> [dd5b1f40] [c0010554] ret_from_syscall+0x0/0x3c
>>> Instruction dump:
>>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
>>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
>>>     
>>
>> Ouch! we have a 00000000 instruction. I'm almost done with the new mcount 
>> record for PPC (I have it done for ppc64, I'm just porting it to 32). I'm 
>> thinking this new code may solve your issues too. I hate the daemon.
>>
>> -- Steve
>>
>>
>>   
>
> I have attached ddd. Up on crashing it points on this line
>
> struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
> {
>        struct dentry * dentry = NULL;
>        unsigned long seq;
>          do {
>                seq = read_seqbegin(&rename_lock);
>                dentry = __d_lookup(parent, name);     <==== ddd marker
>                if (dentry)
>                        break;
>        } while (read_seqretry(&rename_lock, seq));
>        return dentry;
> }
>
> (gdb) bt
> #0  d_lookup (parent=0xdd801180, name=0xd99b3d68) at fs/dcache.c:1361
> #1  0xc00ae7c8 in real_lookup (nd=<value optimized out>, name=<value 
> optimized out>, parent=0xdd801180) at fs/namei.c:507
> #2  do_lookup (nd=0xd99b3e28, name=0xd99b3d68, path=0xd99b3d58) at 
> fs/namei.c:825
> #3  0xc00b0684 in __link_path_walk (name=0xddc23009 "", nd=0xd99b3e28) at 
> fs/namei.c:982
> #4  0xc00b0e8c in link_path_walk (nd=<value optimized out>, name=<value 
> optimized out>) at fs/namei.c:570
> #5  path_walk (name=0xddc23000 "/proc/mtd", nd=0xd99b3e28) at 
> fs/namei.c:1041
> #6  0xc00b0ffc in do_path_lookup (dfd=<value optimized out>, 
> name=0xddc23000 "/proc/mtd", flags=<value optimized out>, nd=0xd99b3e28) at 
> fs/namei.c:1091
> #7  0xc00b1cf8 in user_path_at (dfd=-100, name=<value optimized out>, 
> flags=0, path=0xd99b3e98) at fs/namei.c:1334
> #8  0xc00a8c98 in vfs_lstat_fd (dfd=-578809472, name=0xd99b3d68 "", 
> stat=0xd99b3ed8) at fs/stat.c:83
> #9  0xc00a8d6c in vfs_lstat (name=0xd99b3d68 "", stat=0xd99b3d58) at 
> fs/stat.c:93
> #10 0xc00a8db4 in sys_lstat64 (filename=0xdd801180 "", statbuf=0xbfe285d8) 
> at fs/stat.c:381
> #11 0xc0010554 in syscall_dotrace_cont ()
>
> both cp & lr points to 0xC00BB724
> (gdb) info registers
> r0             0x0    0
> r1             0xd99b3d00    3650829568
> r2             0xddccf2e0    3721196256
> r3             0xdd801180    3716157824
> r4             0xd99b3d68    3650829672
> r5             0xd99b3d58    3650829656
> r6             0xdd822a5b    3716295259
> r7             0x100234ec    268580076
> r8             0xc0800000    3229614080
> r9             0x19330    103216
> r10            0xffff    65535
> r11            0xd99b3d20    3650829600
> r12            0x24000288    603980424
> r13            0x100ad874    269146228
> r14            0x100936f8    269039352
> r15            0x1008a1d0    269001168
> r16            0x10083f80    268976000
> r17            0xd99b3e2c    3650829868
> r18            0xd99b3d68    3650829672
> r19            0xfffffff4    4294967284
> r20            0xc0380000    3224895488
> r21            0xd99b3d60    3650829664
> r22            0xd99b3d58    3650829656
> r23            0xdd802084    3716161668
> r24            0xdc3fb280    3695161984
> r25            0xdd802018    3716161560
> r26            0xd99b3d68    3650829672
> r27            0xc0380000    3224895488
> r28            0xdd801180    3716157824
> r29            0xd99b3d68    3650829672
> r30            0x0    0
> r31            0xd99b3d00    3650829568
> pc             0xc00bb724    3221993252
> cr             0x24082282    604512898
> lr             0xc00bb724    3221993252
> ctr            0x0    0
> xer            0x20000000    536870912
>
> (gdb) li *0xC00BB724
> Line 1361 of "fs/dcache.c" starts at address 0xc00bb724 <d_lookup+64> and 
> ends at 0xc00bb728 <d_lookup+68>.
>
> (gdb) disassemble 0xC00BB724
> Dump of assembler code for function d_lookup:
> 0xc00bb6e4 <d_lookup+0>:    mflr    r0
> 0xc00bb6e8 <d_lookup+4>:    stw     r0,4(r1)
> 0xc00bb6ec <d_lookup+8>:    nop
> 0xc00bb6f0 <d_lookup+12>:    stwu    r1,-32(r1)
> 0xc00bb6f4 <d_lookup+16>:    mflr    r0
> 0xc00bb6f8 <d_lookup+20>:    stmw    r27,12(r1)
> 0xc00bb6fc <d_lookup+24>:    lis     r27,-16328
> 0xc00bb700 <d_lookup+28>:    mr      r31,r1
> 0xc00bb704 <d_lookup+32>:    stw     r0,36(r1)
> 0xc00bb708 <d_lookup+36>:    mr      r28,r3
> 0xc00bb70c <d_lookup+40>:    mr      r29,r4
> 0xc00bb710 <d_lookup+44>:    lwz     r30,12960(r27)
> 0xc00bb714 <d_lookup+48>:    andi.   r0,r30,1
> 0xc00bb718 <d_lookup+52>:    mr      r3,r28
> 0xc00bb71c <d_lookup+56>:    mr      r4,r29
> 0xc00bb720 <d_lookup+60>:    bnela-  0x2c
> 0xc00bb724 <d_lookup+64>:    .long 0x0
> 0xc00bb728 <d_lookup+68>:    cmpwi   cr7,r3,0
> 0xc00bb72c <d_lookup+72>:    bne-    cr7,0xc00bb75c <d_lookup+120>
> 0xc00bb730 <d_lookup+76>:    lwz     r0,12960(r27)
> 0xc00bb734 <d_lookup+80>:    cmpw    cr7,r0,r30
> 0xc00bb738 <d_lookup+84>:    beq-    cr7,0xc00bb75c <d_lookup+120>
> 0xc00bb73c <d_lookup+88>:    mr      r30,r0
> 0xc00bb740 <d_lookup+92>:    andi.   r0,r30,1
> 0xc00bb744 <d_lookup+96>:    mr      r3,r28
> 0xc00bb748 <d_lookup+100>:    mr      r4,r29
> 0xc00bb74c <d_lookup+104>:    beq+    0xc00bb724 <d_lookup+64>
> 0xc00bb750 <d_lookup+108>:    lwz     r0,12960(r27)
> 0xc00bb754 <d_lookup+112>:    mr      r30,r0
> 0xc00bb758 <d_lookup+116>:    b       0xc00bb740 <d_lookup+92>
> 0xc00bb75c <d_lookup+120>:    lwz     r11,0(r1)
> 0xc00bb760 <d_lookup+124>:    lwz     r0,4(r11)
> 0xc00bb764 <d_lookup+128>:    lmw     r27,-20(r11)
> 0xc00bb768 <d_lookup+132>:    mtlr    r0
> 0xc00bb76c <d_lookup+136>:    mr      r1,r11
> 0xc00bb770 <d_lookup+140>:    blr
> End of assembler dump.
> (gdb)
>
> Hope this is helpfull
>

Can you also give us

objdump -S --start-address=0xC00BB724 vmlinux | head 20

?

Then we could compare the result with the OOPS instruction dump :

7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0

Mathieu


> Liberty
>
>
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 13:05           ` Mathieu Desnoyers
@ 2008-08-19 14:21             ` Eran Liberty
  2008-08-19 14:42               ` Mathieu Desnoyers
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-19 14:21 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Steven Rostedt, linux-kernel, Paul E. McKenney,
	linuxppc-dev

Mathieu Desnoyers wrote:
> Can you also give us
>
> objdump -S --start-address=0xC00BB724 vmlinux | head 20
>
> ?
>
> Then we could compare the result with the OOPS instruction dump :
>
> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
>
> Mathieu
>
>   

to give you more context I have run:
 > powerpc-linux-gnu-objdump -S --start-address=0xC00BB700 vmlinux | 
head -n 60

the discrepancy starts at address:
c00bb720: 40 82 00 30 <=> 40 82 00 2f
c00bb724: 4b ff fe 61 <=> 00 00 00 00

vmlinux:     file format elf32-powerpc

Disassembly of section .text:

c00bb700 <d_lookup+0x1c>:
 * d_lookup() is protected against the concurrent renames in some unrelated
 * directory using the seqlockt_t rename_lock.
 */

struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
{
c00bb700:       7c 3f 0b 78     mr      r31,r1
c00bb704:       90 01 00 24     stw     r0,36(r1)
c00bb708:       7c 7c 1b 78     mr      r28,r3
c00bb70c:       7c 9d 23 78     mr      r29,r4
c00bb710:       83 db 32 a0     lwz     r30,12960(r27)
{
        unsigned ret;

repeat:
        ret = sl->sequence;
        smp_rmb();
c00bb714:       73 c0 00 01     andi.   r0,r30,1
        struct dentry * dentry = NULL;
        unsigned long seq;

        do {
                seq = read_seqbegin(&rename_lock);
                dentry = __d_lookup(parent, name);
c00bb718:       7f 83 e3 78     mr      r3,r28
c00bb71c:       7f a4 eb 78     mr      r4,r29
        if (unlikely(ret & 1)) {
c00bb720:       40 82 00 30     bne-    c00bb750 <d_lookup+0x6c>
c00bb724:       4b ff fe 61     bl      c00bb584 <__d_lookup>
                if (dentry)
c00bb728:       2f 83 00 00     cmpwi   cr7,r3,0
c00bb72c:       40 9e 00 30     bne-    cr7,c00bb75c <d_lookup+0x78>
 *
 * If sequence value changed then writer changed data while in section.
 */
static __always_inline int read_seqretry(const seqlock_t *sl, unsigned 
start)
{
        smp_rmb();
c00bb730:       80 1b 32 a0     lwz     r0,12960(r27)
                        break;
        } while (read_seqretry(&rename_lock, seq));
c00bb734:       7f 80 f0 00     cmpw    cr7,r0,r30
c00bb738:       41 9e 00 24     beq-    cr7,c00bb75c <d_lookup+0x78>
/* Start of read calculation -- fetch last complete writer token */
static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
{
        unsigned ret;

repeat:
c00bb73c:       7c 1e 03 78     mr      r30,r0
        ret = sl->sequence;
        smp_rmb();
c00bb740:       73 c0 00 01     andi.   r0,r30,1
        struct dentry * dentry = NULL;


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 14:21             ` Eran Liberty
@ 2008-08-19 14:42               ` Mathieu Desnoyers
  0 siblings, 0 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19 14:42 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, Steven Rostedt, linux-kernel, Paul E. McKenney,
	linuxppc-dev

* Eran Liberty (liberty@extricom.com) wrote:
> Mathieu Desnoyers wrote:
>> Can you also give us
>>
>> objdump -S --start-address=0xC00BB724 vmlinux | head 20
>>
>> ?
>>
>> Then we could compare the result with the OOPS instruction dump :
>>
>> 7c0802a6 bf61000c 3f60c038 7c3f0b78 90010024 7c7c1b78 7c9d2378 83db32a0
>> 73c00001 7f83e378 7fa4eb78 4082002f <00000000> 2f830000 409e0030 801b32a0
>>
>> Mathieu
>>
>>   
>
> to give you more context I have run:
> > powerpc-linux-gnu-objdump -S --start-address=0xC00BB700 vmlinux | head -n 
> 60
>
> the discrepancy starts at address:
> c00bb720: 40 82 00 30 <=> 40 82 00 2f
> c00bb724: 4b ff fe 61 <=> 00 00 00 00
>

Ah !

I think I see what could be wrong :

First we have :

static unsigned int ftrace_nop = 0x60000000;

We probably replace the original function call by this nop.

Then we do :

notrace unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
{
        static unsigned int op;

        /*
         * It would be nice to just use create_function_call, but that will
         * update the code itself. Here we need to just return the
         * instruction that is going to be modified, without modifying the
         * code.
         */
        addr = GET_ADDR(addr);

        /* Set to "bl addr" */
        op = 0x48000001 | (ftrace_calc_offset(ip, addr) & 0x03fffffc);

        /*
         * No locking needed, this must be called via kstop_machine
         * which in essence is like running on a uniprocessor machine.
         */
        return (unsigned char *)&op;
}

And I guess we must be doing a 0x48000001 | 0x0; or something ?

Also, we have to consider that POWERPC 64 functions are :

/* PowerPC64's functions are data that points to the functions */

And this does not seem to hold for ppc32. Therefore, it is strange to me
that the same code is used for the update.. are we updating the correct
site ?

Mathieu

> vmlinux:     file format elf32-powerpc
>
> Disassembly of section .text:
>
> c00bb700 <d_lookup+0x1c>:
> * d_lookup() is protected against the concurrent renames in some unrelated
> * directory using the seqlockt_t rename_lock.
> */
>
> struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
> {
> c00bb700:       7c 3f 0b 78     mr      r31,r1
> c00bb704:       90 01 00 24     stw     r0,36(r1)
> c00bb708:       7c 7c 1b 78     mr      r28,r3
> c00bb70c:       7c 9d 23 78     mr      r29,r4
> c00bb710:       83 db 32 a0     lwz     r30,12960(r27)
> {
>        unsigned ret;
>
> repeat:
>        ret = sl->sequence;
>        smp_rmb();
> c00bb714:       73 c0 00 01     andi.   r0,r30,1
>        struct dentry * dentry = NULL;
>        unsigned long seq;
>
>        do {
>                seq = read_seqbegin(&rename_lock);
>                dentry = __d_lookup(parent, name);
> c00bb718:       7f 83 e3 78     mr      r3,r28
> c00bb71c:       7f a4 eb 78     mr      r4,r29
>        if (unlikely(ret & 1)) {
> c00bb720:       40 82 00 30     bne-    c00bb750 <d_lookup+0x6c>
> c00bb724:       4b ff fe 61     bl      c00bb584 <__d_lookup>
>                if (dentry)
> c00bb728:       2f 83 00 00     cmpwi   cr7,r3,0
> c00bb72c:       40 9e 00 30     bne-    cr7,c00bb75c <d_lookup+0x78>
> *
> * If sequence value changed then writer changed data while in section.
> */
> static __always_inline int read_seqretry(const seqlock_t *sl, unsigned 
> start)
> {
>        smp_rmb();
> c00bb730:       80 1b 32 a0     lwz     r0,12960(r27)
>                        break;
>        } while (read_seqretry(&rename_lock, seq));
> c00bb734:       7f 80 f0 00     cmpw    cr7,r0,r30
> c00bb738:       41 9e 00 24     beq-    cr7,c00bb75c <d_lookup+0x78>
> /* Start of read calculation -- fetch last complete writer token */
> static __always_inline unsigned read_seqbegin(const seqlock_t *sl)
> {
>        unsigned ret;
>
> repeat:
> c00bb73c:       7c 1e 03 78     mr      r30,r0
>        ret = sl->sequence;
>        smp_rmb();
> c00bb740:       73 c0 00 01     andi.   r0,r30,1
>        struct dentry * dentry = NULL;
>

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  2:47                   ` Mathieu Desnoyers
  2008-08-19  3:32                     ` Steven Rostedt
@ 2008-08-19 16:47                     ` Steven Rostedt
  2008-08-19 17:34                       ` Mathieu Desnoyers
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19 16:47 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney



On Mon, 18 Aug 2008, Mathieu Desnoyers wrote:

> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > 
> > On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> > 
> > > 
> > > > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > > > is probably not needed since all of this is done in kstop_machine. Also, 
> > > > only the "get" is needed. If we don't fault there, we wont fault on the 
> > > > put (unless we have permissions wrong, and that would be a bug).
> > > 
> > > Would it ? How do we make sure the kernel text is mapped writeable ?
> > 
> > We map it writeable if FTRACE is enabled.
> > 
> 
> Argh. See text_poke(). It's there exactly for this purpose on x86.
> 

OK, I just tried text_poke and it unfortunately fails. The problem is that 
it requires that the text you are changing is aligned and fits on one 
page. We have no control over that.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 16:47                     ` Steven Rostedt
@ 2008-08-19 17:34                       ` Mathieu Desnoyers
  2008-08-19 21:08                         ` Steven Rostedt
  2008-08-19 21:47                         ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 81+ messages in thread
From: Mathieu Desnoyers @ 2008-08-19 17:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney

* Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> 
> On Mon, 18 Aug 2008, Mathieu Desnoyers wrote:
> 
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > > 
> > > On Tue, 19 Aug 2008, Benjamin Herrenschmidt wrote:
> > > 
> > > > 
> > > > > Hmm, this was originally copied from x86, where we did a cmpxchg, but that 
> > > > > is probably not needed since all of this is done in kstop_machine. Also, 
> > > > > only the "get" is needed. If we don't fault there, we wont fault on the 
> > > > > put (unless we have permissions wrong, and that would be a bug).
> > > > 
> > > > Would it ? How do we make sure the kernel text is mapped writeable ?
> > > 
> > > We map it writeable if FTRACE is enabled.
> > > 
> > 
> > Argh. See text_poke(). It's there exactly for this purpose on x86.
> > 
> 
> OK, I just tried text_poke and it unfortunately fails. The problem is that 
> it requires that the text you are changing is aligned and fits on one 
> page. We have no control over that.
> 
> -- Steve
> 

Ok, there are two cases where it's ok :

1 - in stop_machine, considering we are not touching code executed in
NMI handlers.
2 - when using my replace_instruction_safe() which uses a temporary
breakpoint when doing the instruction replacement.

In those cases you could use text_poke_early().

See
http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=blob;f=arch/x86/kernel/immediate.c;h=7789e2c75bf03e645f15759d5dff0c1698493f92;hb=HEAD

For a use example. Basically it looks like :


360                 pages[0] = virt_to_page((void *)bypass_eip);
361                 vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
362                 BUG_ON(!vaddr);
363                 text_poke_early(&vaddr[bypass_eip & ~PAGE_MASK],
364                         (void *)addr, size);
365                 /*
366                  * Fill the rest with nops.
367                  */
368                 len = NR_NOPS - size;
369                 add_nops((void *)
370                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
371                         len);
372                 print_dbg_bytes("inserted nops",
373                         &vaddr[(bypass_eip & ~PAGE_MASK) + size], len);
374                 vunmap(vaddr);

Mathieu


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 12:09         ` Eran Liberty
  2008-08-19 13:05           ` Mathieu Desnoyers
@ 2008-08-19 20:15           ` Steven Rostedt
  2008-08-20 11:18             ` Eran Liberty
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19 20:15 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, linux-kernel, Mathieu Desnoyers,
	Paul E. McKenney, linuxppc-dev


On Tue, 19 Aug 2008, Eran Liberty wrote:

> Steven Rostedt wrote:
> >   
> > > Testing tracer sched_switch: PASSED
> > > Testing tracer ftrace: PASSED
> > > Testing dynamic ftrace: PASSED

Do you have PREEMPT_TRACER enabled, or any other tracer for that matter?

-- Steve

> > > 
> > > Oops: Exception in kernel mode, sig: 11 [#1]
> > > Exsw1600
> > > Modules linked in:
> > > NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
> > >     
> > 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 17:34                       ` Mathieu Desnoyers
@ 2008-08-19 21:08                         ` Steven Rostedt
  2008-08-20  9:40                           ` Nick Piggin
  2008-08-19 21:47                         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-19 21:08 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Benjamin Herrenschmidt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


On Tue, 19 Aug 2008, Mathieu Desnoyers wrote:
> 
> Ok, there are two cases where it's ok :
> 
> 1 - in stop_machine, considering we are not touching code executed in
> NMI handlers.
> 2 - when using my replace_instruction_safe() which uses a temporary
> breakpoint when doing the instruction replacement.
> 
> In those cases you could use text_poke_early().
> 
> See
> http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=blob;f=arch/x86/kernel/immediate.c;h=7789e2c75bf03e645f15759d5dff0c1698493f92;hb=HEAD
> 
> For a use example. Basically it looks like :
> 
> 
> 360                 pages[0] = virt_to_page((void *)bypass_eip);
> 361                 vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
> 362                 BUG_ON(!vaddr);
> 363                 text_poke_early(&vaddr[bypass_eip & ~PAGE_MASK],
> 364                         (void *)addr, size);
> 365                 /*
> 366                  * Fill the rest with nops.
> 367                  */
> 368                 len = NR_NOPS - size;
> 369                 add_nops((void *)
> 370                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
> 371                         len);
> 372                 print_dbg_bytes("inserted nops",
> 373                         &vaddr[(bypass_eip & ~PAGE_MASK) + size], len);
> 374                 vunmap(vaddr);

vunmap can not be called with interrupts disabled, and this is exactly 
what my code does.

-- Steve

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 13:02           ` Mathieu Desnoyers
@ 2008-08-19 21:46             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19 21:46 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Eran Liberty, Steven Rostedt, linux-kernel, Paul E. McKenney,
	linuxppc-dev, rostedt


> Register GPR30 is always zero... (also true for the other oops)

>From my experiments, -a- non volatile register gets corrupted. Not
always the same and not always with the same value.

The corruption -seems- to happen due to corruption of the location on
the stack where it was saved to / restored from in a previous function
call or interrupt.

I haven't yet tracked down what actually whacks the stack.

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 17:34                       ` Mathieu Desnoyers
  2008-08-19 21:08                         ` Steven Rostedt
@ 2008-08-19 21:47                         ` Benjamin Herrenschmidt
  2008-08-19 23:58                           ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-19 21:47 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Scott Wood, Eran Liberty, linux-kernel,
	linuxppc-dev, Steven Rostedt, Paul E. McKenney


> 
> Ok, there are two cases where it's ok :
> 
> 1 - in stop_machine, considering we are not touching code executed in
> NMI handlers.
> 2 - when using my replace_instruction_safe() which uses a temporary
> breakpoint when doing the instruction replacement.
> 
> In those cases you could use text_poke_early().

Note that vmap/vunmap will be very slow.

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 21:47                         ` Benjamin Herrenschmidt
@ 2008-08-19 23:58                           ` Jeremy Fitzhardinge
  2008-08-20  1:17                             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 81+ messages in thread
From: Jeremy Fitzhardinge @ 2008-08-19 23:58 UTC (permalink / raw)
  To: benh
  Cc: Mathieu Desnoyers, Steven Rostedt, Scott Wood, Eran Liberty,
	linux-kernel, linuxppc-dev, Steven Rostedt, Paul E. McKenney,
	Nick Piggin

Benjamin Herrenschmidt wrote:
>> Ok, there are two cases where it's ok :
>>
>> 1 - in stop_machine, considering we are not touching code executed in
>> NMI handlers.
>> 2 - when using my replace_instruction_safe() which uses a temporary
>> breakpoint when doing the instruction replacement.
>>
>> In those cases you could use text_poke_early().
>>     
>
> Note that vmap/vunmap will be very slow.
>   

Don't we have Nick's speedups now?

    J

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 23:58                           ` Jeremy Fitzhardinge
@ 2008-08-20  1:17                             ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-20  1:17 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Mathieu Desnoyers, Steven Rostedt, Scott Wood, Eran Liberty,
	linux-kernel, linuxppc-dev, Steven Rostedt, Paul E. McKenney,
	Nick Piggin

On Tue, 2008-08-19 at 16:58 -0700, Jeremy Fitzhardinge wrote:
> Benjamin Herrenschmidt wrote:
> >> Ok, there are two cases where it's ok :
> >>
> >> 1 - in stop_machine, considering we are not touching code executed in
> >> NMI handlers.
> >> 2 - when using my replace_instruction_safe() which uses a temporary
> >> breakpoint when doing the instruction replacement.
> >>
> >> In those cases you could use text_poke_early().
> >>     
> >
> > Note that vmap/vunmap will be very slow.
> >   
> 
> Don't we have Nick's speedups now?

Not sure what speedups this is, but it's stlil not something you want to
do a lot ... setting up and tearing down MMU mappings isn't something
I'd like to see happening on a per-mcount basis.

Ben


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19  4:17                     ` Benjamin Herrenschmidt
@ 2008-08-20  7:18                       ` Benjamin Herrenschmidt
  2008-08-20 13:14                         ` Steven Rostedt
  0 siblings, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-20  7:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty, Alan Modra,
	Segher Boessenkool

Found the problem (or at least -a- problem), it's a gcc bug.

Well, first I must say the code generated by -pg is just plain
horrible :-)

Appart from that, look at the exit of, for example, __d_lookup, as
generated by gcc when ftrace is enabled:

c00c0498:       38 60 00 00     li      r3,0
c00c049c:       81 61 00 00     lwz     r11,0(r1)
c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
c00c04a4:       7d 61 5b 78     mr      r1,r11
c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
c00c04ac:       7c 08 03 a6     mtlr    r0
c00c04b0:       4e 80 00 20     blr

As you can see, it restores r1 -before- it pops r24..r31 off
the stack ! I let you imagine what happens if an interrupt happens
just in between those two instructions (mr and lmw). We don't do
redzones on our ABI, so basically, the registers end up corrupted
by the interrupt.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 21:08                         ` Steven Rostedt
@ 2008-08-20  9:40                           ` Nick Piggin
  0 siblings, 0 replies; 81+ messages in thread
From: Nick Piggin @ 2008-08-20  9:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, Benjamin Herrenschmidt, Scott Wood,
	Eran Liberty, linux-kernel, linuxppc-dev, Steven Rostedt,
	Paul E. McKenney

On Wednesday 20 August 2008 07:08, Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Mathieu Desnoyers wrote:
> > Ok, there are two cases where it's ok :
> >
> > 1 - in stop_machine, considering we are not touching code executed in
> > NMI handlers.
> > 2 - when using my replace_instruction_safe() which uses a temporary
> > breakpoint when doing the instruction replacement.
> >
> > In those cases you could use text_poke_early().
> >
> > See
> > http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=b
> >lob;f=arch/x86/kernel/immediate.c;h=7789e2c75bf03e645f15759d5dff0c1698493f
> >92;hb=HEAD
> >
> > For a use example. Basically it looks like :
> >
> >
> > 360                 pages[0] = virt_to_page((void *)bypass_eip);
> > 361                 vaddr = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
> > 362                 BUG_ON(!vaddr);
> > 363                 text_poke_early(&vaddr[bypass_eip & ~PAGE_MASK],
> > 364                         (void *)addr, size);
> > 365                 /*
> > 366                  * Fill the rest with nops.
> > 367                  */
> > 368                 len = NR_NOPS - size;
> > 369                 add_nops((void *)
> > 370                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
> > 371                         len);
> > 372                 print_dbg_bytes("inserted nops",
> > 373                         &vaddr[(bypass_eip & ~PAGE_MASK) + size],
> > len); 374                 vunmap(vaddr);
>
> vunmap can not be called with interrupts disabled, and this is exactly
> what my code does.

It could be after the vmap rewrite (with a few other small tweaks).
But a) it would be less robust when called from interrupt context
and this code looks broken as it is WRT error handling; and b) it
still costs several thousand cycles to vmap+touch+vunmap...

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-19 20:15           ` Steven Rostedt
@ 2008-08-20 11:18             ` Eran Liberty
  2008-08-20 13:12               ` Steven Rostedt
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-20 11:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Steven Rostedt, linux-kernel, Mathieu Desnoyers,
	Paul E. McKenney, linuxppc-dev

Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Eran Liberty wrote:
>
>   
>> Steven Rostedt wrote:
>>     
>>>   
>>>       
>>>> Testing tracer sched_switch: PASSED
>>>> Testing tracer ftrace: PASSED
>>>> Testing dynamic ftrace: PASSED
>>>>         
>
> Do you have PREEMPT_TRACER enabled, or any other tracer for that matter?
>
> -- Steve
>   
I can see stack trace & context trace, but they are derived from other 
choices (I can not un select them)

cat .config | grep TRACE
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_STACKTRACE=y
# CONFIG_BACKTRACE_SELF_TEST is not set
CONFIG_HAVE_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_FTRACE=y
# CONFIG_SCHED_TRACER is not set
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_SELFTEST=y
CONFIG_FTRACE_STARTUP_TEST=y
>   
>>>> Oops: Exception in kernel mode, sig: 11 [#1]
>>>> Exsw1600
>>>> Modules linked in:
>>>> NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
>>>>     
>>>>         
>
>
>   


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 11:18             ` Eran Liberty
@ 2008-08-20 13:12               ` Steven Rostedt
  0 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 13:12 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, linux-kernel, Mathieu Desnoyers,
	Paul E. McKenney, linuxppc-dev


On Wed, 20 Aug 2008, Eran Liberty wrote:

> Steven Rostedt wrote:
> > On Tue, 19 Aug 2008, Eran Liberty wrote:
> > 
> >   
> > > Steven Rostedt wrote:
> > >     
> > > >         
> > > > > Testing tracer sched_switch: PASSED
> > > > > Testing tracer ftrace: PASSED
> > > > > Testing dynamic ftrace: PASSED
> > > > >         
> > 
> > Do you have PREEMPT_TRACER enabled, or any other tracer for that matter?
> > 
> > -- Steve
> >   
> I can see stack trace & context trace, but they are derived from other choices
> (I can not un select them)

Yeah, those are not bad.

> 
> cat .config | grep TRACE
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> # CONFIG_BLK_DEV_IO_TRACE is not set
> CONFIG_STACKTRACE=y
> # CONFIG_BACKTRACE_SELF_TEST is not set
> CONFIG_HAVE_FTRACE=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_FTRACE=y
> # CONFIG_SCHED_TRACER is not set
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_SELFTEST=y
> CONFIG_FTRACE_STARTUP_TEST=y

You must not have PREEMPT on, so the PREEMPT_TRACER will not show up.

The reason I asked, is that my PowerBook runs fine without PREEMPT_TRACER 
but is very unstable when I have PREEMPT_TRACER enabled.

-- Steve


> >   
> > > > > Oops: Exception in kernel mode, sig: 11 [#1]
> > > > > Exsw1600
> > > > > Modules linked in:
> > > > > NIP: c00bbb20 LR: c00bbb20 CTR: 00000000
> > > > >             
> > 
> > 
> >   
> 
> 

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20  7:18                       ` Benjamin Herrenschmidt
@ 2008-08-20 13:14                         ` Steven Rostedt
  2008-08-20 13:19                           ` Steven Rostedt
                                             ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 13:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty, Alan Modra,
	Segher Boessenkool


On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:

> Found the problem (or at least -a- problem), it's a gcc bug.
> 
> Well, first I must say the code generated by -pg is just plain
> horrible :-)
> 
> Appart from that, look at the exit of, for example, __d_lookup, as
> generated by gcc when ftrace is enabled:
> 
> c00c0498:       38 60 00 00     li      r3,0
> c00c049c:       81 61 00 00     lwz     r11,0(r1)
> c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> c00c04a4:       7d 61 5b 78     mr      r1,r11
> c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> c00c04ac:       7c 08 03 a6     mtlr    r0
> c00c04b0:       4e 80 00 20     blr
> 
> As you can see, it restores r1 -before- it pops r24..r31 off
> the stack ! I let you imagine what happens if an interrupt happens
> just in between those two instructions (mr and lmw). We don't do
> redzones on our ABI, so basically, the registers end up corrupted
> by the interrupt.

Ouch!  You've disassembled this without -pg too, and it does not have this 
bug? What version of gcc do you have?

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:14                         ` Steven Rostedt
@ 2008-08-20 13:19                           ` Steven Rostedt
  2008-08-20 13:36                             ` Eran Liberty
  2008-08-20 14:16                           ` Josh Boyer
  2008-08-20 21:36                           ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Benjamin Herrenschmidt
  2 siblings, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 13:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty, Alan Modra,
	Segher Boessenkool


On Wed, 20 Aug 2008, Steven Rostedt wrote:

> 
> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > Found the problem (or at least -a- problem), it's a gcc bug.
> > 
> > Well, first I must say the code generated by -pg is just plain
> > horrible :-)
> > 
> > Appart from that, look at the exit of, for example, __d_lookup, as
> > generated by gcc when ftrace is enabled:
> > 
> > c00c0498:       38 60 00 00     li      r3,0
> > c00c049c:       81 61 00 00     lwz     r11,0(r1)
> > c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> > c00c04a4:       7d 61 5b 78     mr      r1,r11
> > c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> > c00c04ac:       7c 08 03 a6     mtlr    r0
> > c00c04b0:       4e 80 00 20     blr
> > 
> > As you can see, it restores r1 -before- it pops r24..r31 off
> > the stack ! I let you imagine what happens if an interrupt happens
> > just in between those two instructions (mr and lmw). We don't do
> > redzones on our ABI, so basically, the registers end up corrupted
> > by the interrupt.
> 
> Ouch!  You've disassembled this without -pg too, and it does not have this 
> bug? What version of gcc do you have?
> 

I have:
 gcc (Debian 4.3.1-2) 4.3.1

c00c64c8:       81 61 00 00     lwz     r11,0(r1)
c00c64cc:       7f 83 e3 78     mr      r3,r28
c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
c00c64d8:       7d 61 5b 78     mr      r1,r11
c00c64dc:       7c 08 03 a6     mtlr    r0
c00c64e0:       4e 80 00 20     blr


My version looks fine.  I'm thinking that this is a separate issue than 
what Eran is seeing.

Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and 
print out the end of the function dump.

Thanks,

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:19                           ` Steven Rostedt
@ 2008-08-20 13:36                             ` Eran Liberty
  2008-08-20 13:43                               ` Steven Rostedt
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-20 13:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Scott Wood,
	Alan Modra, Segher Boessenkool

Steven Rostedt wrote:
> On Wed, 20 Aug 2008, Steven Rostedt wrote:
>
>   
>> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
>>
>>     
>>> Found the problem (or at least -a- problem), it's a gcc bug.
>>>
>>> Well, first I must say the code generated by -pg is just plain
>>> horrible :-)
>>>
>>> Appart from that, look at the exit of, for example, __d_lookup, as
>>> generated by gcc when ftrace is enabled:
>>>
>>> c00c0498:       38 60 00 00     li      r3,0
>>> c00c049c:       81 61 00 00     lwz     r11,0(r1)
>>> c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
>>> c00c04a4:       7d 61 5b 78     mr      r1,r11
>>> c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
>>> c00c04ac:       7c 08 03 a6     mtlr    r0
>>> c00c04b0:       4e 80 00 20     blr
>>>
>>> As you can see, it restores r1 -before- it pops r24..r31 off
>>> the stack ! I let you imagine what happens if an interrupt happens
>>> just in between those two instructions (mr and lmw). We don't do
>>> redzones on our ABI, so basically, the registers end up corrupted
>>> by the interrupt.
>>>       
>> Ouch!  You've disassembled this without -pg too, and it does not have this 
>> bug? What version of gcc do you have?
>>
>>     
>
> I have:
>  gcc (Debian 4.3.1-2) 4.3.1
>
> c00c64c8:       81 61 00 00     lwz     r11,0(r1)
> c00c64cc:       7f 83 e3 78     mr      r3,r28
> c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
> c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
> c00c64d8:       7d 61 5b 78     mr      r1,r11
> c00c64dc:       7c 08 03 a6     mtlr    r0
> c00c64e0:       4e 80 00 20     blr
>
>
> My version looks fine.  I'm thinking that this is a separate issue than 
> what Eran is seeing.
>
> Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and 
> print out the end of the function dump.
>
> Thanks,
>
> -- Steve
>
>
>
>   
powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head 
-n 100

vmlinux:     file format elf32-powerpc

Disassembly of section .text:

c00bb584 <__d_lookup>:
c00bb584:       7c 08 02 a6     mflr    r0
c00bb588:       90 01 00 04     stw     r0,4(r1)
c00bb58c:       4b f5 5c 51     bl      c00111dc <_mcount>
c00bb590:       94 21 ff d0     stwu    r1,-48(r1)
c00bb594:       7c 08 02 a6     mflr    r0
c00bb598:       3d 20 9e 37     lis     r9,-25033
c00bb59c:       bf 01 00 10     stmw    r24,16(r1)
c00bb5a0:       61 29 00 01     ori     r9,r9,1
c00bb5a4:       3d 60 c0 38     lis     r11,-16328
c00bb5a8:       90 01 00 34     stw     r0,52(r1)
c00bb5ac:       7c 60 4a 78     xor     r0,r3,r9
c00bb5b0:       54 00 d9 7e     rlwinm  r0,r0,27,5,31
c00bb5b4:       83 84 00 00     lwz     r28,0(r4)
c00bb5b8:       7c 3f 0b 78     mr      r31,r1
c00bb5bc:       81 0b 1a 2c     lwz     r8,6700(r11)
c00bb5c0:       39 6b 1a 2c     addi    r11,r11,6700
c00bb5c4:       7c 00 e2 14     add     r0,r0,r28
c00bb5c8:       81 4b 00 04     lwz     r10,4(r11)
c00bb5cc:       7c 09 4a 78     xor     r9,r0,r9
c00bb5d0:       83 24 00 04     lwz     r25,4(r4)
c00bb5d4:       7d 29 44 30     srw     r9,r9,r8
c00bb5d8:       81 0b 00 08     lwz     r8,8(r11)
c00bb5dc:       7d 29 02 78     xor     r9,r9,r0
c00bb5e0:       83 04 00 08     lwz     r24,8(r4)
c00bb5e4:       7d 29 50 38     and     r9,r9,r10
c00bb5e8:       55 29 10 3a     rlwinm  r9,r9,2,0,29
c00bb5ec:       7c 09 40 2e     lwzx    r0,r9,r8
c00bb5f0:       7c 9a 23 78     mr      r26,r4
c00bb5f4:       7c 7b 1b 78     mr      r27,r3
c00bb5f8:       2f 80 00 00     cmpwi   cr7,r0,0
c00bb5fc:       7c 1e 03 78     mr      r30,r0
c00bb600:       40 be 00 14     bne+    cr7,c00bb614 <__d_lookup+0x90>
c00bb604:       48 00 00 7c     b       c00bb680 <__d_lookup+0xfc>
c00bb608:       83 de 00 00     lwz     r30,0(r30)
c00bb60c:       2f 9e 00 00     cmpwi   cr7,r30,0
c00bb610:       41 9e 00 70     beq-    cr7,c00bb680 <__d_lookup+0xfc>
c00bb614:       80 1e 00 00     lwz     r0,0(r30)
c00bb618:       2f 80 00 00     cmpwi   cr7,r0,0
c00bb61c:       41 9e 00 08     beq-    cr7,c00bb624 <__d_lookup+0xa0>
c00bb620:       7c 00 02 2c     dcbt    r0,r0
c00bb624:       3b be ff f4     addi    r29,r30,-12
c00bb628:       80 1d 00 18     lwz     r0,24(r29)
c00bb62c:       7f 80 e0 00     cmpw    cr7,r0,r28
c00bb630:       40 9e ff d8     bne+    cr7,c00bb608 <__d_lookup+0x84>
c00bb634:       80 1d 00 14     lwz     r0,20(r29)
c00bb638:       7f 80 d8 00     cmpw    cr7,r0,r27
c00bb63c:       40 9e ff cc     bne+    cr7,c00bb608 <__d_lookup+0x84>
c00bb640:       81 3b 00 48     lwz     r9,72(r27)
c00bb644:       38 9d 00 18     addi    r4,r29,24
c00bb648:       2f 89 00 00     cmpwi   cr7,r9,0
c00bb64c:       41 9e 00 50     beq-    cr7,c00bb69c <__d_lookup+0x118>
c00bb650:       80 09 00 08     lwz     r0,8(r9)
c00bb654:       2f 80 00 00     cmpwi   cr7,r0,0
c00bb658:       41 9e 00 44     beq-    cr7,c00bb69c <__d_lookup+0x118>
c00bb65c:       7f 63 db 78     mr      r3,r27
c00bb660:       7c 09 03 a6     mtctr   r0
c00bb664:       7f 45 d3 78     mr      r5,r26
c00bb668:       4e 80 04 21     bctrl
c00bb66c:       2f 83 00 00     cmpwi   cr7,r3,0
c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
c00bb674:       83 de 00 00     lwz     r30,0(r30)
c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
c00bb680:       38 60 00 00     li      r3,0
c00bb684:       81 61 00 00     lwz     r11,0(r1)
c00bb688:       80 0b 00 04     lwz     r0,4(r11)
c00bb68c:       7d 61 5b 78     mr      r1,r11
c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
c00bb694:       7c 08 03 a6     mtlr    r0
c00bb698:       4e 80 00 20     blr
c00bb69c:       80 04 00 04     lwz     r0,4(r4)
c00bb6a0:       7f 80 c8 00     cmpw    cr7,r0,r25
c00bb6a4:       40 9e ff 64     bne+    cr7,c00bb608 <__d_lookup+0x84>
c00bb6a8:       80 64 00 08     lwz     r3,8(r4)
c00bb6ac:       7f 25 cb 78     mr      r5,r25
c00bb6b0:       7f 04 c3 78     mr      r4,r24
c00bb6b4:       4b f5 ab 65     bl      c0016218 <memcmp>
c00bb6b8:       2f 83 00 00     cmpwi   cr7,r3,0
c00bb6bc:       40 9e ff 4c     bne+    cr7,c00bb608 <__d_lookup+0x84>
c00bb6c0:       80 1d 00 04     lwz     r0,4(r29)
c00bb6c4:       70 09 00 10     andi.   r9,r0,16
c00bb6c8:       40 a2 ff b8     bne-    c00bb680 <__d_lookup+0xfc>
c00bb6cc:       7c 00 e8 28     lwarx   r0,0,r29
c00bb6d0:       30 00 00 01     addic   r0,r0,1
c00bb6d4:       7c 00 e9 2d     stwcx.  r0,0,r29
c00bb6d8:       40 a2 ff f4     bne-    c00bb6cc <__d_lookup+0x148>
c00bb6dc:       7f a3 eb 78     mr      r3,r29
c00bb6e0:       4b ff ff a4     b       c00bb684 <__d_lookup+0x100>

c00bb6e4 <d_lookup>:
c00bb6e4:       7c 08 02 a6     mflr    r0
c00bb6e8:       90 01 00 04     stw     r0,4(r1)


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:36                             ` Eran Liberty
@ 2008-08-20 13:43                               ` Steven Rostedt
  2008-08-20 14:02                                 ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 13:43 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Benjamin Herrenschmidt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Scott Wood,
	Alan Modra, Segher Boessenkool


On Wed, 20 Aug 2008, Eran Liberty wrote:

> Steven Rostedt wrote:
> > On Wed, 20 Aug 2008, Steven Rostedt wrote:
> > 
> >   
> > > On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> > > 
> > >     
> > > > Found the problem (or at least -a- problem), it's a gcc bug.
> > > > 
> > > > Well, first I must say the code generated by -pg is just plain
> > > > horrible :-)
> > > > 
> > > > Appart from that, look at the exit of, for example, __d_lookup, as
> > > > generated by gcc when ftrace is enabled:
> > > > 
> > > > c00c0498:       38 60 00 00     li      r3,0
> > > > c00c049c:       81 61 00 00     lwz     r11,0(r1)
> > > > c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> > > > c00c04a4:       7d 61 5b 78     mr      r1,r11
> > > > c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> > > > c00c04ac:       7c 08 03 a6     mtlr    r0
> > > > c00c04b0:       4e 80 00 20     blr
> > > > 
> > > > As you can see, it restores r1 -before- it pops r24..r31 off
> > > > the stack ! I let you imagine what happens if an interrupt happens
> > > > just in between those two instructions (mr and lmw). We don't do
> > > > redzones on our ABI, so basically, the registers end up corrupted
> > > > by the interrupt.
> > > >       
> > > Ouch!  You've disassembled this without -pg too, and it does not have this
> > > bug? What version of gcc do you have?
> > > 
> > >     
> > 
> > I have:
> >  gcc (Debian 4.3.1-2) 4.3.1
> > 
> > c00c64c8:       81 61 00 00     lwz     r11,0(r1)
> > c00c64cc:       7f 83 e3 78     mr      r3,r28
> > c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
> > c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
> > c00c64d8:       7d 61 5b 78     mr      r1,r11
> > c00c64dc:       7c 08 03 a6     mtlr    r0
> > c00c64e0:       4e 80 00 20     blr
> > 
> > 
> > My version looks fine.  I'm thinking that this is a separate issue than what
> > Eran is seeing.
> > 
> > Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and
> > print out the end of the function dump.
> > 
> > Thanks,
> > 
> > -- Steve
> > 
> > 
> > 
> >   
> powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head -n 100
> 
> vmlinux:     file format elf32-powerpc
> 
> Disassembly of section .text:
> 
> c00bb584 <__d_lookup>:

[...]

> c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
> c00bb674:       83 de 00 00     lwz     r30,0(r30)
> c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
> c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
> c00bb680:       38 60 00 00     li      r3,0
> c00bb684:       81 61 00 00     lwz     r11,0(r1)
> c00bb688:       80 0b 00 04     lwz     r0,4(r11)
> c00bb68c:       7d 61 5b 78     mr      r1,r11

[ BUG HERE IF INTERRUPT HAPPENS ]

> c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
> c00bb694:       7c 08 03 a6     mtlr    r0
> c00bb698:       4e 80 00 20     blr

Yep, you have the same bug in your compiler.

-- Steve



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:43                               ` Steven Rostedt
@ 2008-08-20 14:02                                 ` Eran Liberty
  2008-08-20 14:55                                   ` Jon Smirl
                                                     ` (2 more replies)
  0 siblings, 3 replies; 81+ messages in thread
From: Eran Liberty @ 2008-08-20 14:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Scott Wood,
	Alan Modra, Segher Boessenkool

Steven Rostedt wrote:
> On Wed, 20 Aug 2008, Eran Liberty wrote:
>
>   
>> Steven Rostedt wrote:
>>     
>>> On Wed, 20 Aug 2008, Steven Rostedt wrote:
>>>
>>>   
>>>       
>>>> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
>>>>
>>>>     
>>>>         
>>>>> Found the problem (or at least -a- problem), it's a gcc bug.
>>>>>
>>>>> Well, first I must say the code generated by -pg is just plain
>>>>> horrible :-)
>>>>>
>>>>> Appart from that, look at the exit of, for example, __d_lookup, as
>>>>> generated by gcc when ftrace is enabled:
>>>>>
>>>>> c00c0498:       38 60 00 00     li      r3,0
>>>>> c00c049c:       81 61 00 00     lwz     r11,0(r1)
>>>>> c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
>>>>> c00c04a4:       7d 61 5b 78     mr      r1,r11
>>>>> c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
>>>>> c00c04ac:       7c 08 03 a6     mtlr    r0
>>>>> c00c04b0:       4e 80 00 20     blr
>>>>>
>>>>> As you can see, it restores r1 -before- it pops r24..r31 off
>>>>> the stack ! I let you imagine what happens if an interrupt happens
>>>>> just in between those two instructions (mr and lmw). We don't do
>>>>> redzones on our ABI, so basically, the registers end up corrupted
>>>>> by the interrupt.
>>>>>       
>>>>>           
>>>> Ouch!  You've disassembled this without -pg too, and it does not have this
>>>> bug? What version of gcc do you have?
>>>>
>>>>     
>>>>         
>>> I have:
>>>  gcc (Debian 4.3.1-2) 4.3.1
>>>
>>> c00c64c8:       81 61 00 00     lwz     r11,0(r1)
>>> c00c64cc:       7f 83 e3 78     mr      r3,r28
>>> c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
>>> c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
>>> c00c64d8:       7d 61 5b 78     mr      r1,r11
>>> c00c64dc:       7c 08 03 a6     mtlr    r0
>>> c00c64e0:       4e 80 00 20     blr
>>>
>>>
>>> My version looks fine.  I'm thinking that this is a separate issue than what
>>> Eran is seeing.
>>>
>>> Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup, and
>>> print out the end of the function dump.
>>>
>>> Thanks,
>>>
>>> -- Steve
>>>
>>>
>>>
>>>   
>>>       
>> powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head -n 100
>>
>> vmlinux:     file format elf32-powerpc
>>
>> Disassembly of section .text:
>>
>> c00bb584 <__d_lookup>:
>>     
>
> [...]
>
>   
>> c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
>> c00bb674:       83 de 00 00     lwz     r30,0(r30)
>> c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
>> c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
>> c00bb680:       38 60 00 00     li      r3,0
>> c00bb684:       81 61 00 00     lwz     r11,0(r1)
>> c00bb688:       80 0b 00 04     lwz     r0,4(r11)
>> c00bb68c:       7d 61 5b 78     mr      r1,r11
>>     
>
> [ BUG HERE IF INTERRUPT HAPPENS ]
>
>   
>> c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
>> c00bb694:       7c 08 03 a6     mtlr    r0
>> c00bb698:       4e 80 00 20     blr
>>     
>
> Yep, you have the same bug in your compiler.
>
> -- Steve
>   
Hmm... so whats now?

Is there a way to prove this scenario is indeed the one that caused the 
opps?

-- Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:14                         ` Steven Rostedt
  2008-08-20 13:19                           ` Steven Rostedt
@ 2008-08-20 14:16                           ` Josh Boyer
  2008-08-20 14:22                             ` Steven Rostedt
  2008-08-20 21:36                           ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Benjamin Herrenschmidt
  2 siblings, 1 reply; 81+ messages in thread
From: Josh Boyer @ 2008-08-20 14:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Eran Liberty, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Alan Modra,
	Scott Wood, Paul E. McKenney, segher

On Wed, 2008-08-20 at 09:14 -0400, Steven Rostedt wrote:
> On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> 
> > Found the problem (or at least -a- problem), it's a gcc bug.
> > 
> > Well, first I must say the code generated by -pg is just plain
> > horrible :-)
> > 
> > Appart from that, look at the exit of, for example, __d_lookup, as
> > generated by gcc when ftrace is enabled:
> > 
> > c00c0498:       38 60 00 00     li      r3,0
> > c00c049c:       81 61 00 00     lwz     r11,0(r1)
> > c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> > c00c04a4:       7d 61 5b 78     mr      r1,r11
> > c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> > c00c04ac:       7c 08 03 a6     mtlr    r0
> > c00c04b0:       4e 80 00 20     blr
> > 
> > As you can see, it restores r1 -before- it pops r24..r31 off
> > the stack ! I let you imagine what happens if an interrupt happens
> > just in between those two instructions (mr and lmw). We don't do
> > redzones on our ABI, so basically, the registers end up corrupted
> > by the interrupt.
> 
> Ouch!  You've disassembled this without -pg too, and it does not have this 
> bug? What version of gcc do you have?

Segher was looking at this a bit this morning.  He thinks it's really
-fno-omit-frame-pointer that is causing this.  That really shouldn't
even be set on PowerPC, but FTRACE uses select which overrides the
depends on stuff in Kconfig.

Segher can probably tell you more once his email is back up.

josh


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:16                           ` Josh Boyer
@ 2008-08-20 14:22                             ` Steven Rostedt
  2008-08-20 14:50                               ` Josh Boyer
  0 siblings, 1 reply; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 14:22 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Benjamin Herrenschmidt, Eran Liberty, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Alan Modra,
	Scott Wood, Paul E. McKenney, segher


On Wed, 20 Aug 2008, Josh Boyer wrote:
> 
> Segher was looking at this a bit this morning.  He thinks it's really
> -fno-omit-frame-pointer that is causing this.  That really shouldn't
> even be set on PowerPC, but FTRACE uses select which overrides the
> depends on stuff in Kconfig.

I can easily make a patch that makes that select an x86 only.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:22                             ` Steven Rostedt
@ 2008-08-20 14:50                               ` Josh Boyer
  2008-09-15 16:30                                 ` [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Josh Boyer @ 2008-08-20 14:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Benjamin Herrenschmidt, Eran Liberty, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Alan Modra,
	Scott Wood, Paul E. McKenney, segher

On Wed, 20 Aug 2008 10:22:49 -0400 (EDT)
Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> On Wed, 20 Aug 2008, Josh Boyer wrote:
> > 
> > Segher was looking at this a bit this morning.  He thinks it's really
> > -fno-omit-frame-pointer that is causing this.  That really shouldn't
> > even be set on PowerPC, but FTRACE uses select which overrides the
> > depends on stuff in Kconfig.
> 
> I can easily make a patch that makes that select an x86 only.

That's probably a first step, but you might want to wait until Segher
can fill in more details.  I'm sort of just relaying what he and I were
talking about on IRC.  IIRC, he was testing builds with and without
both -pg and -fno-omit-frame-pointer and found the bug only when
-fno-omit-frame-pointer was present.

josh

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:02                                 ` Eran Liberty
@ 2008-08-20 14:55                                   ` Jon Smirl
  2008-08-20 15:23                                     ` Steven Rostedt
  2008-08-20 18:23                                     ` Eran Liberty
  2008-08-20 15:27                                   ` Steven Rostedt
  2008-08-20 21:37                                   ` Benjamin Herrenschmidt
  2 siblings, 2 replies; 81+ messages in thread
From: Jon Smirl @ 2008-08-20 14:55 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney

On 8/20/08, Eran Liberty <liberty@extricom.com> wrote:
> Steven Rostedt wrote:
>
> > On Wed, 20 Aug 2008, Eran Liberty wrote:
> >
> >
> >
> > > Steven Rostedt wrote:
> > >
> > >
> > > > On Wed, 20 Aug 2008, Steven Rostedt wrote:
> > > >
> > > >
> > > >
> > > > > On Wed, 20 Aug 2008, Benjamin Herrenschmidt wrote:
> > > > >
> > > > >
> > > > >
> > > > > > Found the problem (or at least -a- problem), it's a gcc bug.
> > > > > >
> > > > > > Well, first I must say the code generated by -pg is just plain
> > > > > > horrible :-)
> > > > > >
> > > > > > Appart from that, look at the exit of, for example, __d_lookup, as
> > > > > > generated by gcc when ftrace is enabled:
> > > > > >
> > > > > > c00c0498:       38 60 00 00     li      r3,0
> > > > > > c00c049c:       81 61 00 00     lwz     r11,0(r1)
> > > > > > c00c04a0:       80 0b 00 04     lwz     r0,4(r11)
> > > > > > c00c04a4:       7d 61 5b 78     mr      r1,r11
> > > > > > c00c04a8:       bb 0b ff e0     lmw     r24,-32(r11)
> > > > > > c00c04ac:       7c 08 03 a6     mtlr    r0
> > > > > > c00c04b0:       4e 80 00 20     blr
> > > > > >
> > > > > > As you can see, it restores r1 -before- it pops r24..r31 off
> > > > > > the stack ! I let you imagine what happens if an interrupt happens
> > > > > > just in between those two instructions (mr and lmw). We don't do
> > > > > > redzones on our ABI, so basically, the registers end up corrupted
> > > > > > by the interrupt.
> > > > > >
> > > > > >
> > > > > Ouch!  You've disassembled this without -pg too, and it does not
> have this
> > > > > bug? What version of gcc do you have?
> > > > >
> > > > >
> > > > >
> > > > I have:
> > > >  gcc (Debian 4.3.1-2) 4.3.1
> > > >
> > > > c00c64c8:       81 61 00 00     lwz     r11,0(r1)
> > > > c00c64cc:       7f 83 e3 78     mr      r3,r28
> > > > c00c64d0:       80 0b 00 04     lwz     r0,4(r11)
> > > > c00c64d4:       ba eb ff dc     lmw     r23,-36(r11)
> > > > c00c64d8:       7d 61 5b 78     mr      r1,r11
> > > > c00c64dc:       7c 08 03 a6     mtlr    r0
> > > > c00c64e0:       4e 80 00 20     blr
> > > >
> > > >
> > > > My version looks fine.  I'm thinking that this is a separate issue
> than what
> > > > Eran is seeing.
> > > >
> > > > Eran, can you do an "objdump -dr vmlinux" and search for __d_lookup,
> and
> > > > print out the end of the function dump.
> > > >
> > > > Thanks,
> > > >
> > > > -- Steve
> > > >
> > > >
> > > >
> > > >
> > > >
> > > powerpc-linux-gnu-objdump -dr --start-address=0xc00bb584 vmlinux | head
> -n 100
> > >
> > > vmlinux:     file format elf32-powerpc
> > >
> > > Disassembly of section .text:
> > >
> > > c00bb584 <__d_lookup>:
> > >
> > >
> >
> > [...]
> >
> >
> >
> > > c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
> > > c00bb674:       83 de 00 00     lwz     r30,0(r30)
> > > c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
> > > c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
> > > c00bb680:       38 60 00 00     li      r3,0
> > > c00bb684:       81 61 00 00     lwz     r11,0(r1)
> > > c00bb688:       80 0b 00 04     lwz     r0,4(r11)
> > > c00bb68c:       7d 61 5b 78     mr      r1,r11
> > >
> > >
> >
> > [ BUG HERE IF INTERRUPT HAPPENS ]
> >
> >
> >
> > > c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
> > > c00bb694:       7c 08 03 a6     mtlr    r0
> > > c00bb698:       4e 80 00 20     blr
> > >
> > >
> >
> > Yep, you have the same bug in your compiler.
> >
> > -- Steve
> >
> >
>  Hmm... so whats now?
>
>  Is there a way to prove this scenario is indeed the one that caused the
> opps?

Manually edit the broken binary to change the order of the restore and
see if the problem disappears. That will keep everything else
constant.


>
>  -- Liberty
>
>  _______________________________________________
>  Linuxppc-dev mailing list
>  Linuxppc-dev@ozlabs.org
>  https://ozlabs.org/mailman/listinfo/linuxppc-dev
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:55                                   ` Jon Smirl
@ 2008-08-20 15:23                                     ` Steven Rostedt
  2008-08-20 18:23                                     ` Eran Liberty
  1 sibling, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 15:23 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Eran Liberty, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney


On Wed, 20 Aug 2008, Jon Smirl wrote:
> > >
> > > > c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
> > > > c00bb694:       7c 08 03 a6     mtlr    r0
> > > > c00bb698:       4e 80 00 20     blr
> > > >
> > > >
> > >
> > > Yep, you have the same bug in your compiler.
> > >
> > > -- Steve
> > >
> > >
> >  Hmm... so whats now?
> >
> >  Is there a way to prove this scenario is indeed the one that caused the
> > opps?
> 
> Manually edit the broken binary to change the order of the restore and
> see if the problem disappears. That will keep everything else
> constant.

Does it only happen on this function? Or is it happening on others as 
well. So manually fixing it here won't help much unless we fix it 
everywhere.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:02                                 ` Eran Liberty
  2008-08-20 14:55                                   ` Jon Smirl
@ 2008-08-20 15:27                                   ` Steven Rostedt
  2008-08-20 21:37                                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 15:27 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Benjamin Herrenschmidt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Scott Wood,
	Alan Modra, Segher Boessenkool


On Wed, 20 Aug 2008, Eran Liberty wrote:
> > 
> >   
> > > c00bb670:       41 9e 00 50     beq-    cr7,c00bb6c0 <__d_lookup+0x13c>
> > > c00bb674:       83 de 00 00     lwz     r30,0(r30)
> > > c00bb678:       2f 9e 00 00     cmpwi   cr7,r30,0
> > > c00bb67c:       40 9e ff 98     bne+    cr7,c00bb614 <__d_lookup+0x90>
> > > c00bb680:       38 60 00 00     li      r3,0
> > > c00bb684:       81 61 00 00     lwz     r11,0(r1)
> > > c00bb688:       80 0b 00 04     lwz     r0,4(r11)
> > > c00bb68c:       7d 61 5b 78     mr      r1,r11
> > >     
> > 
> > [ BUG HERE IF INTERRUPT HAPPENS ]
> > 
> >   
> > > c00bb690:       bb 0b ff e0     lmw     r24,-32(r11)
> > > c00bb694:       7c 08 03 a6     mtlr    r0
> > > c00bb698:       4e 80 00 20     blr
> > >     
> > 
> > Yep, you have the same bug in your compiler.
> > 
> > -- Steve
> >   
> Hmm... so whats now?

Try anothe gcc version ;-)

> 
> Is there a way to prove this scenario is indeed the one that caused the opps?

I'm not convinced that this is the bug you were seeing. Do you still get 
the crash if you disable dynamic ftrace but leave ftrace on?

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:55                                   ` Jon Smirl
  2008-08-20 15:23                                     ` Steven Rostedt
@ 2008-08-20 18:23                                     ` Eran Liberty
  2008-08-20 18:33                                       ` Steven Rostedt
  1 sibling, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-08-20 18:23 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Steven Rostedt, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney

Jon Smirl wrote:
> Manually edit the broken binary to change the order of the restore and
> see if the problem disappears. That will keep everything else
> constant.
>
>   
checked with objdump where __d_lookup()

 > powerpc-linux-gnu-objdump -dr --start-address=0xc0065790 vmlinux | 
grep "<__d_lookup>:"
c00b91d8 <__d_lookup>:

 > liberty@liberty:~/svn/exsw1600-2.6.27-rc2$ powerpc-linux-gnu-objdump 
-dr --start-address=0xc00b91d8 vmlinux| head -n 100

vmlinux:     file format elf32-powerpc

Disassembly of section .text:

c00b91d8 <__d_lookup>:
c00b91d8:       7c 08 02 a6     mflr    r0
c00b91dc:       90 01 00 04     stw     r0,4(r1)
 ....

c00b92bc:       4e 80 04 21     bctrl
c00b92c0:       2f 83 00 00     cmpwi   cr7,r3,0
c00b92c4:       41 9e 00 50     beq-    cr7,c00b9314 <__d_lookup+0x13c>
c00b92c8:       83 de 00 00     lwz     r30,0(r30)
c00b92cc:       2f 9e 00 00     cmpwi   cr7,r30,0
c00b92d0:       40 9e ff 98     bne+    cr7,c00b9268 <__d_lookup+0x90>
c00b92d4:       38 60 00 00     li      r3,0
c00b92d8:       81 61 00 00     lwz     r11,0(r1)
c00b92dc:       80 0b 00 04     lwz     r0,4(r11)
c00b92e0:       7d 61 5b 78     mr      r1,r11
       <=== As explained by Steve, these two should be replaced ===>
c00b92e4:       bb 0b ff e0     lmw     r24,-32(r11)
c00b92e8:       7c 08 03 a6     mtlr    r0
c00b92ec:       4e 80 00 20     blr
c00b92f0:       80 04 00 04     lwz     r0,4(r4)
  ....
c00b9330:       7f a3 eb 78     mr      r3,r29
c00b9334:       4b ff ff a4     b       c00b92d8 <__d_lookup+0x100>

c00b9338 <d_lookup>:
c00b9338:       7c 08 02 a6     mflr    r0

on the target I fired up xmon and replaced them.

~ # echo x > /proc/sysrq-trigger
SysRq : Entering xmon
Vector: 0  at [df51fdb8]
    pc: c0025960: sysrq_handle_xmon+0x60/0x64
    lr: c0025960: sysrq_handle_xmon+0x60/0x64
    sp: df51fe80
   msr: 21000
  current = 0xdc22a9a0
    pid   = 1698, comm = echo
WARNING: exception is not recoverable, can't continue
enter ? for help
[df51fe90] c0193c38 __handle_sysrq+0xa8/0x178
[df51fec0] c00ee818 write_sysrq_trigger+0x78/0x7c
[df51fed0] c00e65e4 proc_reg_write+0x5c/0x84
[df51fef0] c00a299c vfs_write+0xc8/0x180
[df51ff10] c00a2f40 sys_write+0x5c/0xa4
[df51ff40] c0010554 ret_from_syscall+0x0/0x3c
SP (bffe87e0) is in userspace

mon> di c00b92d0
c00b92d0  409eff98      bne     cr7,c00b9268    # __d_lookup+0x90/0x160
c00b92d4  38600000      li      r3,0
c00b92d8  81610000      lwz     r11,0(r1)
c00b92dc  800b0004      lwz     r0,4(r11)
c00b92e0  7d615b78      mr      r1,r11
  <=== wrong order ===>
c00b92e4  bb0bffe0      lmw     r24,-32(r11)
c00b92e8  7c0803a6      mtlr    r0
c00b92ec  4e800020      blr
c00b92f0  80040004      lwz     r0,4(r4)
c00b92f4  7f80c800      cmpw    cr7,r0,r25
c00b92f8  409eff64      bne     cr7,c00b925c    # __d_lookup+0x84/0x160
c00b92fc  80640008      lwz     r3,8(r4)
c00b9300  7f25cb78      mr      r5,r25
c00b9304  7f04c378      mr      r4,r24
c00b9308  4bf5ccf1      bl      c0015ff8        # memcmp+0x0/0x30
c00b930c  2f830000      cmpwi   cr7,r3,0
mon> m c00b92e0
c00b92e0  7d l
c00b92e0  7d615b78 bb0bffe0
c00b92e4  bb0bffe0 7d615b78
c00b92e8  7c0803a6 x
mon> di c00b92d0
c00b92d0  409eff98      bne     cr7,c00b9268    # __d_lookup+0x90/0x160
c00b92d4  38600000      li      r3,0
c00b92d8  81610000      lwz     r11,0(r1)
c00b92dc  800b0004      lwz     r0,4(r11)
c00b92e0  bb0bffe0      lmw     r24,-32(r11)
  <=== right order ===>
c00b92e4  7d615b78      mr      r1,r11
c00b92e8  7c0803a6      mtlr    r0
c00b92ec  4e800020      blr
c00b92f0  80040004      lwz     r0,4(r4)
c00b92f4  7f80c800      cmpw    cr7,r0,r25
c00b92f8  409eff64      bne     cr7,c00b925c    # __d_lookup+0x84/0x160
c00b92fc  80640008      lwz     r3,8(r4)
c00b9300  7f25cb78      mr      r5,r25
c00b9304  7f04c378      mr      r4,r24
c00b9308  4bf5ccf1      bl      c0015ff8        # memcmp+0x0/0x30
c00b930c  2f830000      cmpwi   cr7,r3,0
mon> X
~ #

And took it for a test drive
~# while [ 1 ] ; do find / > /dev/null ; echo .  ; done
.
.
.
[ 10 min later ]
.
.
.

Stable! Yeepee.

So how do we get this to be right in the first place and everywhere?

-- Liberty


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 18:23                                     ` Eran Liberty
@ 2008-08-20 18:33                                       ` Steven Rostedt
  0 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 18:33 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Jon Smirl, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Alan Modra, Scott Wood, Paul E. McKenney


On Wed, 20 Aug 2008, Eran Liberty wrote:
> mon> X
> ~ #
> 
> And took it for a test drive
> ~# while [ 1 ] ; do find / > /dev/null ; echo .  ; done
> .
> .
> .
> [ 10 min later ]
> .
> .
> .
> 
> Stable! Yeepee.

Eran, thanks a lot for testing this!

> 
> So how do we get this to be right in the first place and everywhere?
> 

This is a bug in gcc, and should be mentioned to the gcc maintainers.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 13:14                         ` Steven Rostedt
  2008-08-20 13:19                           ` Steven Rostedt
  2008-08-20 14:16                           ` Josh Boyer
@ 2008-08-20 21:36                           ` Benjamin Herrenschmidt
  2008-08-20 21:44                             ` Steven Rostedt
  2 siblings, 1 reply; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-20 21:36 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty, Alan Modra,
	Segher Boessenkool


> > As you can see, it restores r1 -before- it pops r24..r31 off
> > the stack ! I let you imagine what happens if an interrupt happens
> > just in between those two instructions (mr and lmw). We don't do
> > redzones on our ABI, so basically, the registers end up corrupted
> > by the interrupt.
> 
> Ouch!  You've disassembled this without -pg too, and it does not have this 
> bug? What version of gcc do you have?

Oops, should have mentioned it ;-)

4.2.3 (Ubuntu 4.2.3-2ubuntu7)

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 14:02                                 ` Eran Liberty
  2008-08-20 14:55                                   ` Jon Smirl
  2008-08-20 15:27                                   ` Steven Rostedt
@ 2008-08-20 21:37                                   ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 81+ messages in thread
From: Benjamin Herrenschmidt @ 2008-08-20 21:37 UTC (permalink / raw)
  To: Eran Liberty
  Cc: Steven Rostedt, Paul E. McKenney, Mathieu Desnoyers,
	linux-kernel, linuxppc-dev, Steven Rostedt, Scott Wood,
	Alan Modra, Segher Boessenkool


> Hmm... so whats now?
> 
> Is there a way to prove this scenario is indeed the one that caused the 
> opps?

I've verified in sim that the crash in do_lookup() I was seeing (similar
backtrace) was indeed caused by an interrupt coming between those two
instructions and clobbering the saved GPRs.

So I think it's very likely our problem.

Ben.



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)
  2008-08-20 21:36                           ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Benjamin Herrenschmidt
@ 2008-08-20 21:44                             ` Steven Rostedt
  0 siblings, 0 replies; 81+ messages in thread
From: Steven Rostedt @ 2008-08-20 21:44 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Paul E. McKenney, Mathieu Desnoyers, linux-kernel, linuxppc-dev,
	Steven Rostedt, Scott Wood, Eran Liberty, Alan Modra,
	Segher Boessenkool


On Thu, 21 Aug 2008, Benjamin Herrenschmidt wrote:

> 
> > > As you can see, it restores r1 -before- it pops r24..r31 off
> > > the stack ! I let you imagine what happens if an interrupt happens
> > > just in between those two instructions (mr and lmw). We don't do
> > > redzones on our ABI, so basically, the registers end up corrupted
> > > by the interrupt.
> > 
> > Ouch!  You've disassembled this without -pg too, and it does not have this 
> > bug? What version of gcc do you have?
> 
> Oops, should have mentioned it ;-)
> 
> 4.2.3 (Ubuntu 4.2.3-2ubuntu7)

I'm running a 4.3.1 (Debian 4.3.1-2) on my powerbook. Perhaps the issue is 
already taken care of.

-- Steve


^ permalink raw reply	[flat|nested] 81+ messages in thread

* [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial
  2008-08-20 14:50                               ` Josh Boyer
@ 2008-09-15 16:30                                 ` Eran Liberty
  2008-09-17 23:46                                   ` Andrew Morton
  0 siblings, 1 reply; 81+ messages in thread
From: Eran Liberty @ 2008-09-15 16:30 UTC (permalink / raw)
  Cc: linux-kernel, linux-serial

[-- Attachment #1: Type: text/plain, Size: 1941 bytes --]

Dear Penguins,

Let me start of by saying my particular hardware must be buggy in some 
way. When I issue a sysrq (Ctrl A+ F from minicom) I get a lot of sysrq 
triggers.

I have worked around the problem and I think this workaround is a viable 
patch even for platforms which do not exhibit this peculiar behavior.

upon getting numerous interrupts which request sysrq the function 
uart_handle_break in include/linux/serial_core.h is hit multiple times.
The current code which looks like this:

static inline int uart_handle_break(struct uart_port *port)
 {
        struct uart_info *info = port->info;
 #ifdef SUPPORT_SYSRQ
        if (port->cons && port->cons->index == port->line) {
               if (!port->sysrq) {
                       port->sysrq = jiffies + HZ*5;
                       return 1;
               }
               port->sysrq = 0;
        }
 #endif
        if (port->flags & UPF_SAK)
                do_SAK(info->tty);
       return 0;
}

Will basicly toggle port->sysrq between a timeout value and zero. If you 
are lucky this penguin rullet will stop on timeout and the next 
character hit will trigger the sysrq in the function 
"uart_handle_sysrq_char". But if you are not so lucky the last sysrq 
interupt will toggle port->sysrq to zero and the next char hit will be 
ignored (not trigger sysrq).

The suggested patch will do the next few things:

1. "port->sysrq" is now the time when the last sysrq was triggered and 
not the timeout for the the next char
2. Stamped "port->sysrq" every time there is a sysrq rather then toggled 
it up and down.
3. Always continue to consider UPF_SAK.
4. "port->sysrq" is toggled back to zero only in uart_handle_break() and 
only if the a char has been accepted after the sysrq timeout (5 sec)
5. uart_handle_break() will ignore extra chars received in super human 
speed after the last sysrq (0.01 sec)

Liberty

Signed-off-by: Eran Liberty <liberty@extricom.org>
---

[-- Attachment #2: sysrq.patch --]
[-- Type: text/x-patch, Size: 1068 bytes --]

Index: include/linux/serial_core.h
===================================================================
--- include/linux/serial_core.h	(revision 119)
+++ include/linux/serial_core.h	(revision 120)
@@ -447,8 +447,8 @@
 uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
 {
 #ifdef SUPPORT_SYSRQ
-	if (port->sysrq) {
-		if (ch && time_before(jiffies, port->sysrq)) {
+	if (port->sysrq && time_after(jiffies, port->sysrq + (unsigned long)(HZ*0.01))) {
+		if (ch && time_before(jiffies, port->sysrq + HZ*5)) {
 			handle_sysrq(ch, port->info ? port->info->tty : NULL);
 			port->sysrq = 0;
 			return 1;
@@ -467,19 +467,17 @@
  */
 static inline int uart_handle_break(struct uart_port *port)
 {
+	int ret = 0;
 	struct uart_info *info = port->info;
 #ifdef SUPPORT_SYSRQ
 	if (port->cons && port->cons->index == port->line) {
-		if (!port->sysrq) {
-			port->sysrq = jiffies + HZ*5;
-			return 1;
-		}
-		port->sysrq = 0;
+		port->sysrq = jiffies;
+		ret = 1;
 	}
 #endif
 	if (port->flags & UPF_SAK)
 		do_SAK(info->tty);
-	return 0;
+	return ret;
 }

 /**

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial
  2008-09-15 16:30                                 ` [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial Eran Liberty
@ 2008-09-17 23:46                                   ` Andrew Morton
  2008-09-18  6:58                                     ` Eran Liberty
  0 siblings, 1 reply; 81+ messages in thread
From: Andrew Morton @ 2008-09-17 23:46 UTC (permalink / raw)
  To: Eran Liberty; +Cc: linux-kernel, linux-serial

On Mon, 15 Sep 2008 19:30:05 +0300
Eran Liberty <liberty@extricom.com> wrote:

> Dear Penguins,
> 
> Let me start of by saying my particular hardware must be buggy in some 
> way. When I issue a sysrq (Ctrl A+ F from minicom) I get a lot of sysrq 
> triggers.
> 
> I have worked around the problem and I think this workaround is a viable 
> patch even for platforms which do not exhibit this peculiar behavior.
> 
> upon getting numerous interrupts which request sysrq the function 
> uart_handle_break in include/linux/serial_core.h is hit multiple times.
> The current code which looks like this:
> 
> static inline int uart_handle_break(struct uart_port *port)
>  {
>         struct uart_info *info = port->info;
>  #ifdef SUPPORT_SYSRQ
>         if (port->cons && port->cons->index == port->line) {
>                if (!port->sysrq) {
>                        port->sysrq = jiffies + HZ*5;
>                        return 1;
>                }
>                port->sysrq = 0;
>         }
>  #endif
>         if (port->flags & UPF_SAK)
>                 do_SAK(info->tty);
>        return 0;
> }
> 
> Will basicly toggle port->sysrq between a timeout value and zero. If you 
> are lucky this penguin rullet will stop on timeout and the next 
> character hit will trigger the sysrq in the function 
> "uart_handle_sysrq_char". But if you are not so lucky the last sysrq 
> interupt will toggle port->sysrq to zero and the next char hit will be 
> ignored (not trigger sysrq).
> 
> The suggested patch will do the next few things:
> 
> 1. "port->sysrq" is now the time when the last sysrq was triggered and 
> not the timeout for the the next char
> 2. Stamped "port->sysrq" every time there is a sysrq rather then toggled 
> it up and down.
> 3. Always continue to consider UPF_SAK.
> 4. "port->sysrq" is toggled back to zero only in uart_handle_break() and 
> only if the a char has been accepted after the sysrq timeout (5 sec)
> 5. uart_handle_break() will ignore extra chars received in super human 
> speed after the last sysrq (0.01 sec)
> 

yes, that could be irritating.

> Index: include/linux/serial_core.h
> ===================================================================
> --- include/linux/serial_core.h	(revision 119)
> +++ include/linux/serial_core.h	(revision 120)

We prefer patches in `patch -p1' form, please.

Even after fixing that, none of it applied, so I typed it in again.

> @@ -447,8 +447,8 @@
>  uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
>  {
>  #ifdef SUPPORT_SYSRQ
> -	if (port->sysrq) {
> -		if (ch && time_before(jiffies, port->sysrq)) {
> +	if (port->sysrq && time_after(jiffies, port->sysrq + (unsigned long)(HZ*0.01))) {
> +		if (ch && time_before(jiffies, port->sysrq + HZ*5)) {
>  			handle_sysrq(ch, port->info ? port->info->tty : NULL);
>  			port->sysrq = 0;
>  			return 1;
> @@ -467,19 +467,17 @@
>   */
>  static inline int uart_handle_break(struct uart_port *port)
>  {
> +	int ret = 0;
>  	struct uart_info *info = port->info;
>  #ifdef SUPPORT_SYSRQ
>  	if (port->cons && port->cons->index == port->line) {
> -		if (!port->sysrq) {
> -			port->sysrq = jiffies + HZ*5;
> -			return 1;
> -		}
> -		port->sysrq = 0;
> +		port->sysrq = jiffies;
> +		ret = 1;
>  	}
>  #endif
>  	if (port->flags & UPF_SAK)
>  		do_SAK(info->tty);
> -	return 0;
> +	return ret;
>  }

The 0.01 is a big no-no.  Sometimes gcc like to go into stupid mode and
starts doing floating point stuff.

A suitable fix would be to use HZ/100.  But that assumes that HZ is
always >= 100.  That's a pretty good assumption, and various parts of
the kernel will explode if HZ is set too small.  However it's always
good to ensure that someone else's stuff will explode before yours
does, so how about we make it HZ/50?  Will that still work OK for you?

--- a/include/linux/serial_core.h~serial-driver-handle-multiple-consecutive-sysrq-from-the-serial-fix
+++ a/include/linux/serial_core.h
@@ -444,8 +444,7 @@ static inline int
 uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
 {
 #ifdef SUPPORT_SYSRQ
-	if (port->sysrq && time_after(jiffies, port->sysrq +
-					(unsigned long)(HZ*0.01))) {
+	if (port->sysrq && time_after(jiffies, port->sysrq + HZ / 50)) {
 		if (ch && time_before(jiffies, port->sysrq + HZ*5)) {
 			handle_sysrq(ch, port->info ? port->info->port.tty : NULL);
 			port->sysrq = 0;
_



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial
  2008-09-17 23:46                                   ` Andrew Morton
@ 2008-09-18  6:58                                     ` Eran Liberty
  0 siblings, 0 replies; 81+ messages in thread
From: Eran Liberty @ 2008-09-18  6:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-serial

Andrew Morton wrote:
> On Mon, 15 Sep 2008 19:30:05 +0300
> Eran Liberty <liberty@extricom.com> wrote:
>
>   
>> Dear Penguins,
>>
>> Let me start of by saying my particular hardware must be buggy in some 
>> way. When I issue a sysrq (Ctrl A+ F from minicom) I get a lot of sysrq 
>> triggers.
>>
>> I have worked around the problem and I think this workaround is a viable 
>> patch even for platforms which do not exhibit this peculiar behavior.
>>
>> upon getting numerous interrupts which request sysrq the function 
>> uart_handle_break in include/linux/serial_core.h is hit multiple times.
>> The current code which looks like this:
>>
>> static inline int uart_handle_break(struct uart_port *port)
>>  {
>>         struct uart_info *info = port->info;
>>  #ifdef SUPPORT_SYSRQ
>>         if (port->cons && port->cons->index == port->line) {
>>                if (!port->sysrq) {
>>                        port->sysrq = jiffies + HZ*5;
>>                        return 1;
>>                }
>>                port->sysrq = 0;
>>         }
>>  #endif
>>         if (port->flags & UPF_SAK)
>>                 do_SAK(info->tty);
>>        return 0;
>> }
>>
>> Will basicly toggle port->sysrq between a timeout value and zero. If you 
>> are lucky this penguin rullet will stop on timeout and the next 
>> character hit will trigger the sysrq in the function 
>> "uart_handle_sysrq_char". But if you are not so lucky the last sysrq 
>> interupt will toggle port->sysrq to zero and the next char hit will be 
>> ignored (not trigger sysrq).
>>
>> The suggested patch will do the next few things:
>>
>> 1. "port->sysrq" is now the time when the last sysrq was triggered and 
>> not the timeout for the the next char
>> 2. Stamped "port->sysrq" every time there is a sysrq rather then toggled 
>> it up and down.
>> 3. Always continue to consider UPF_SAK.
>> 4. "port->sysrq" is toggled back to zero only in uart_handle_break() and 
>> only if the a char has been accepted after the sysrq timeout (5 sec)
>> 5. uart_handle_break() will ignore extra chars received in super human 
>> speed after the last sysrq (0.01 sec)
>>
>>     
>
> yes, that could be irritating.
>
>   
>> Index: include/linux/serial_core.h
>> ===================================================================
>> --- include/linux/serial_core.h	(revision 119)
>> +++ include/linux/serial_core.h	(revision 120)
>>     
>
> We prefer patches in `patch -p1' form, please.
>
> Even after fixing that, none of it applied, so I typed it in again.
>   
So  should I resubmit or have you done then nasty work for me (thanks :) )?
>   
>> @@ -447,8 +447,8 @@
>>  uart_handle_sysrq_char(struct uart_port *port, unsigned int ch)
>>  {
>>  #ifdef SUPPORT_SYSRQ
>> -	if (port->sysrq) {
>> -		if (ch && time_before(jiffies, port->sysrq)) {
>> +	if (port->sysrq && time_after(jiffies, port->sysrq + (unsigned long)(HZ*0.01))) {
>> +		if (ch && time_before(jiffies, port->sysrq + HZ*5)) {
>>  			handle_sysrq(ch, port->info ? port->info->tty : NULL);
>>  			port->sysrq = 0;
>>  			return 1;
>> @@ -467,19 +467,17 @@
>>   */
>>  static inline int uart_handle_break(struct uart_port *port)
>>  {
>> +	int ret = 0;
>>  	struct uart_info *info = port->info;
>>  #ifdef SUPPORT_SYSRQ
>>  	if (port->cons && port->cons->index == port->line) {
>> -		if (!port->sysrq) {
>> -			port->sysrq = jiffies + HZ*5;
>> -			return 1;
>> -		}
>> -		port->sysrq = 0;
>> +		port->sysrq = jiffies;
>> +		ret = 1;
>>  	}
>>  #endif
>>  	if (port->flags & UPF_SAK)
>>  		do_SAK(info->tty);
>> -	return 0;
>> +	return ret;
>>  }
>>     
>
> The 0.01 is a big no-no.  Sometimes gcc like to go into stupid mode and
> starts doing floating point stuff.
>
> A suitable fix would be to use HZ/100.  But that assumes that HZ is
> always >= 100.  That's a pretty good assumption, and various parts of
> the kernel will explode if HZ is set too small.  However it's always
> good to ensure that someone else's stuff will explode before yours
> does, so how about we make it HZ/50?  Will that still work OK for you?
>   

HZ/50 is totally OK by me.


-- Liberty

^ permalink raw reply	[flat|nested] 81+ messages in thread

end of thread, other threads:[~2008-09-18  6:58 UTC | newest]

Thread overview: 81+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-06-18 14:18 2.6.24-rc4: pci_remove_bus_device() => pci_scan_child_bus() => pci_bus_add_devices bug? Eran Liberty
2008-07-20 10:31 ` [PATCH 2.6.24-rc4] PCI: refuse to re-add a device to a bus upon pci_scan_child_bus() Eran Liberty
2008-07-20 16:48   ` Eran Liberty
2008-07-21 19:18 ` [PATCH 2.6.26] " Eran Liberty
2008-07-21 19:49   ` Matthew Wilcox
2008-07-22  8:21     ` eran liberty
2008-07-22 11:49       ` Matthew Wilcox
2008-07-22 13:08         ` Eran Liberty
2008-07-22 13:14         ` Eran Liberty
2008-07-22 14:13           ` Matthew Wilcox
2008-07-22 15:25             ` Eran Liberty
2008-07-22 16:52               ` Matthew Wilcox
2008-07-22 17:41                 ` Eran Liberty
2008-07-22 18:11                   ` Matthew Wilcox
2008-07-23 18:31                     ` Eran Liberty
2008-07-27 11:01                       ` Eran Liberty
2008-07-27 15:08                         ` Matthew Wilcox
2008-08-18  8:08 ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Eran Liberty
2008-08-18 15:07   ` Steven Rostedt
2008-08-18 15:47     ` Mathieu Desnoyers
2008-08-18 16:12       ` Steven Rostedt
2008-08-18 17:04         ` Mathieu Desnoyers
2008-08-18 17:21       ` Scott Wood
2008-08-18 18:27         ` Steven Rostedt
2008-08-18 18:29           ` Scott Wood
2008-08-19  1:53           ` Benjamin Herrenschmidt
2008-08-19  2:28             ` Steven Rostedt
2008-08-19  2:39               ` Benjamin Herrenschmidt
2008-08-19  2:41                 ` Steven Rostedt
2008-08-19  2:47                   ` Mathieu Desnoyers
2008-08-19  3:32                     ` Steven Rostedt
2008-08-19  3:36                       ` Mathieu Desnoyers
2008-08-19  4:00                         ` Steven Rostedt
2008-08-19 16:47                     ` Steven Rostedt
2008-08-19 17:34                       ` Mathieu Desnoyers
2008-08-19 21:08                         ` Steven Rostedt
2008-08-20  9:40                           ` Nick Piggin
2008-08-19 21:47                         ` Benjamin Herrenschmidt
2008-08-19 23:58                           ` Jeremy Fitzhardinge
2008-08-20  1:17                             ` Benjamin Herrenschmidt
2008-08-19  2:56                 ` Benjamin Herrenschmidt
2008-08-19  3:12                   ` Steven Rostedt
2008-08-19  4:17                     ` Benjamin Herrenschmidt
2008-08-20  7:18                       ` Benjamin Herrenschmidt
2008-08-20 13:14                         ` Steven Rostedt
2008-08-20 13:19                           ` Steven Rostedt
2008-08-20 13:36                             ` Eran Liberty
2008-08-20 13:43                               ` Steven Rostedt
2008-08-20 14:02                                 ` Eran Liberty
2008-08-20 14:55                                   ` Jon Smirl
2008-08-20 15:23                                     ` Steven Rostedt
2008-08-20 18:23                                     ` Eran Liberty
2008-08-20 18:33                                       ` Steven Rostedt
2008-08-20 15:27                                   ` Steven Rostedt
2008-08-20 21:37                                   ` Benjamin Herrenschmidt
2008-08-20 14:16                           ` Josh Boyer
2008-08-20 14:22                             ` Steven Rostedt
2008-08-20 14:50                               ` Josh Boyer
2008-09-15 16:30                                 ` [PATCH 2.6.26] SERIAL DRIVER: Handle Multiple consecutive sysrq from the serial Eran Liberty
2008-09-17 23:46                                   ` Andrew Morton
2008-09-18  6:58                                     ` Eran Liberty
2008-08-20 21:36                           ` ftrace introduces instability into kernel 2.6.27(-rc2,-rc3) Benjamin Herrenschmidt
2008-08-20 21:44                             ` Steven Rostedt
2008-08-18 18:47         ` Steven Rostedt
2008-08-18 18:56           ` Scott Wood
2008-08-18 19:28             ` Steven Rostedt
2008-08-18 18:25     ` Eran Liberty
2008-08-18 18:41       ` Mathieu Desnoyers
2008-08-19  1:54         ` Benjamin Herrenschmidt
2008-08-19  9:56         ` Eran Liberty
2008-08-19 13:02           ` Mathieu Desnoyers
2008-08-19 21:46             ` Benjamin Herrenschmidt
2008-08-18 18:50       ` Steven Rostedt
2008-08-19 12:09         ` Eran Liberty
2008-08-19 13:05           ` Mathieu Desnoyers
2008-08-19 14:21             ` Eran Liberty
2008-08-19 14:42               ` Mathieu Desnoyers
2008-08-19 20:15           ` Steven Rostedt
2008-08-20 11:18             ` Eran Liberty
2008-08-20 13:12               ` Steven Rostedt
2008-08-19  1:51     ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).