On Tuesday, 07 September 2010, Ted Ts'o wrote:
> On Sun, Sep 05, 2010 at 09:53:41AM +0200, Martin Steigerwald wrote:
> > Quite some kernels were unbootable with an ext4 and readahead related
> > backtrace[1].
>
> Unfortunately, you don't have a full backtrace in the picture which
> submitted as an attachment to the bugzilla. It shows part of the
> backtrace which has an ext4 and readahead stack, yes. But we didn't
> get to see the beginning of the stack trace with the IP and the reason
> for the oops. If keyboard interrupts still work, you might try seeing
> if you can scroll upwards and see more of the backtrace. Or you might
> try configuring your console to use a higher resolution display so
> more lines can be displayed. Or you might try getting a serial
> console.

Thanks for your detailed analysis.

I missed posting an update to the thread. I did not have to go back to
those kernels again: I had bisected the issue down to about 10 revisions
when Alex suggested my bug might be a duplicate of

[Bug 28402] random radeon/kms/drm related freezes with kernel 2.6.34
https://bugs.freedesktop.org/show_bug.cgi?id=28402

So I tried some of the patches in there, and the "vmembase at zero" patch
seems to do the trick, although I am not sure whether it is a solution or
a workaround.

> I don't recognize the display, but the problem could just as easily be
> in the block layer or in the device driver for your hard drive.
> (i.e., the readahead stack calls ext4, which in turn will submit a
> read request to the block device layer which then submits the request
> to a device driver).

Yes, I am aware that it may not be an Ext4 problem at all. That is why I
said Ext4 / readahead related (!) backtrace (! not bug), because that was
all I could see on the screen. How else should I have described that
backtrace when I can't speculate on what I cannot see?

> But because you keep referring to it as an ext4/readahead related
> backtrace, you may have disguised the symptom enough that people who
> might recognize it as, "Oh, yeah, there was this regression in the
> SATA layer", wouldn't recognize it as such from your description.
> That's why it's important to be careful how you describe issues; if
> you had said, I don't have a complete stack trace, and I don't have
> the IP and function where the fault occurred, that might have caused
> people to think a bit harder about what might be the problem, instead
> of thinking to themselves, "ah, well, the ext4 and readahead parts of
> the kernel aren't my problem, so I'll ignore this report".

I thought that's what the provided backtrace is for, and I think that any
developer can see that it isn't complete. Nevertheless, I will include a
note that the backtrace is incomplete next time.

It would be good to have a backtrace viewer and saver that still works in
those conditions ;-). Even one that just writes the backtrace somewhere on
the swap partition, where a tool can grab it after booting again. But when
the kernel is completely messed up, exactly that can be very dangerous.

> > I am also seeking help with selecting more suitable commits to test:
> > If its a Radeon KMS related freeze and everything points at it, I
> > think the offending commit is in the first quarter of what git
> > commit shows to me[2].
>
> You do know that you can restrict a git bisect to commits that modify
> a particular part of the tree, right? e.g.,
>
>     git bisect start 2.6.34 2.6.33 -- drivers/gpu/drm/radeon

Yes, I have seen that in the git manpage, but since I wasn't absolutely
sure that the freeze was radeon kms/drm related, I skipped that step. From
what I have learned, I should have looked at git bisect visualize earlier
and picked commits from before and after the drm kms related merges.
That would have spared me quite some time if my suspicion was right, as it
turned out to be, and wouldn't have cost many more rounds if it was wrong.
Next time I will know.

Thanks for your help, I appreciate it.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7