linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexander Holler <holler@ahsoftware.de>
To: Jan Kara <jack@suse.cz>
Cc: Dan Carpenter <dan.carpenter@oracle.com>, linux-kernel@vger.kernel.org
Subject: Re: kernel BUG at fs/buffer.c:3205 (stable 3.5.3)
Date: Thu, 27 Sep 2012 20:01:00 +0200	[thread overview]
Message-ID: <5064945C.4020403@ahsoftware.de> (raw)
In-Reply-To: <50647CCC.7010407@ahsoftware.de>

Am 27.09.2012 18:20, schrieb Alexander Holler:
> Am 27.09.2012 17:46, schrieb Alexander Holler:
>> Hello,
>>
>> Am 27.09.2012 17:12, schrieb Jan Kara:
>>>    Just some thoughts about your oops:
>>> The assertion which fails is:
>>> BUG_ON(!list_empty(&bh->b_assoc_buffers));
>>>
>>> Now b_assoc_buffers isn't used very much. In particular ext4 which you
>>> seem
>>> to be using doesn't use this list at all (except when mounted in
>>> nojournal
>>> mode but that doesn't seem to be your case). That would point rather
>>> strongly at a memory corruption issue.
>>>
>>> So if you can reproduce the oops, it might be interesting to print
>>> bh->b_assoc_buffers.next and &bh->b_assoc_buffers.next if the list is
>>> found
>>> to be non-empty.
>>
>> Hmm, a loose pointer would explain it all too. Especially the cases when
>> I just have seen wrong content in the archive without having any oops. I
>> try to reproduce it with
>>
>> pr_info("AHO: %p %p\n", bh->b_assoc_buffers.next,
>> &bh->b_assoc_buffers.next);
>>
>> after the BUG_ON().
>>
>> Thanks for the hint. I wasn't already that far to know that
>> b_assoc_buffers isn't used that much.
>
> Hmm, that doesn't look very practicable because b_assoc_buffers seems to
> be used a lot here. ;)
> Maybe I should have mentioned that I'm mounting the source filesystem
> (root with ext4) with
> nodelalloc(rw,noatime,nodelalloc,errors=remount-ro,data=ordered), and to
> backup it, I'm using a bind-mount (mount -o bind / /foo) as source.
>
> But the debug output starts very early on boot, where no bind-mount is
> used:
>
> ---------------------
> Sep 27 18:03:23 krabat udevd[1254]: invalid rule
> '/etc/udev/rules.d/80-aho.rules:26'
> Sep 27 18:03:23 krabat kernel: [    4.562670] usb usb8: New USB device
> found, idVendor=1d6b, idProduct=0001
> Sep 27 18:03:23 krabat kernel: [    4.562671] usb usb8: New USB device
> strings: Mfr=3, Product=2, SerialNumber=1
> Sep 27 18:03:23 krabat systemd-uaccess[1363]: Failed to apply ACL on
> /dev/kvm: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.562673] usb usb8: Product: UHCI
> Host Controller
> Sep 27 18:03:23 krabat kernel: [    4.562674] usb usb8: Manufacturer:
> Linux 3.5.4-00009-gfa43f23-dirty uhci_hcd
> Sep 27 18:03:23 krabat kernel: [    4.562676] usb usb8: SerialNumber:
> 0000:00:1d.0
> Sep 27 18:03:23 krabat systemd-uaccess[1716]: Failed to apply ACL on
> /dev/kvm: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.563285] hub 8-0:1.0: USB hub found
> Sep 27 18:03:23 krabat kernel: [    4.563288] hub 8-0:1.0: 2 ports detected
> Sep 27 18:03:23 krabat systemd-uaccess[2324]: Failed to apply ACL on
> /dev/snd/timer: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.563316] AHO: ffff880212e4b048
> ffff880212e4b048
> Sep 27 18:03:23 krabat kernel: [    4.563318] AHO: ffff880212e4b0b0
> ffff880212e4b0b0
> Sep 27 18:03:23 krabat kernel: [    4.563319] AHO: ffff880212e4b118
> ffff880212e4b118
> ---------------------
>
> And afterwards I see tons of those messages, so it doesn't look usable.
> Anyway, I retry to repdroduce the problem without that debug line, just
> to see if still can reproduce the problem with F17 as userspace (and
> kernel 3.5.4 instead of 3.5.3).

After 2 successful tries in sequence, the third failed (sorry, LANG=de):

---------------------------------------------------------------
[root@krabat bind]# tar cp . | mbuffer | bzip2smp 
 >/mnt/usb3/Krabat.Fedora17.sdb2.27.09.12.tar.bz2
in @ 33.1 MiB/s, out @ 38.3 MiB/s,  888 MiB total, buffer  95% fulltar: 
./tmp/.X11-unix/X0: Socket ignoriert
in @  0.0 KiB/s, out @ 20.8 MiB/s, 24.9 GiB total, buffer  22% full
summary: 24.9 GiByte in 19 min 53.0 sec - average of 21.4 MiB/s
[root@krabat bind]# tar djf /mnt/usb3/Krabat.Fedora17.sdb2.27.09.12.tar.bz2
./var/log/messages: Änderungszeit ist unterschiedlich
./var/log/messages: Größe ist unterschiedlich
./var/tmp/kdecache-aholler/icon-cache.kcache: Änderungszeit ist 
unterschiedlich
./var/tmp/kdecache-aholler/icon-cache.kcache: Unterschiedliche Inhalte
./var/tmp/kdecache-aholler/plasma_theme_oxygen.kcache: Änderungszeit ist 
unterschiedlich
./var/tmp/kdecache-aholler/plasma_theme_oxygen.kcache: Unterschiedliche 
Inhalte
./var/lib/chrony/drift: Änderungszeit ist unterschiedlich
./var/lib/chrony/drift: Unterschiedliche Inhalte
./home/aholler/.kde/share/apps/konqueror/autosave/_1.77: Änderungszeit 
ist unterschiedlich
./home/aholler/.kde/share/apps/konqueror/autosave/_1.77: Größe ist 
unterschiedlich
./home/aholler/.kde/share/apps/konqueror/konq_history: Änderungszeit ist 
unterschiedlich
./home/aholler/.kde/share/apps/konqueror/konq_history: Größe ist 
unterschiedlich
./home/aholler/.kde/share/apps/kcookiejar/cookies: Änderungszeit ist 
unterschiedlich
./home/aholler/thinstation_src-2.0beta2.tar.bz2: Unterschiedliche Inhalte

bzip2: Data integrity error when decompressing.
         Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

tar: Unerwartetes Dateiende im Archiv.
tar: Child returned status 2
tar: Error is not recoverable: exiting now
[root@krabat bind]#
---------------------------------------------------------------

This time without any oops, dmesg just shows some

---------------------------------------------------------------
[  111.087356] EXT4-fs (sdc1): mounted filesystem with ordered data 
mode. Opts: (null)
[  672.868948] CPU4: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  672.868949] CPU0: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  672.869970] CPU0: Core temperature/speed normal
[  672.869971] CPU4: Core temperature/speed normal
[  688.285419] CPU6: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  688.285421] CPU2: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  688.286442] CPU2: Core temperature/speed normal
[  688.286443] CPU6: Core temperature/speed normal
[  698.822614] CPU3: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  698.822615] CPU7: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  698.824674] CPU3: Core temperature/speed normal
[  698.824675] CPU7: Core temperature/speed normal
[  706.979633] CPU1: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  706.979635] CPU5: Core temperature above threshold, cpu clock 
throttled (total events = 1)
[  706.980648] CPU1: Core temperature/speed normal
[  706.980649] CPU5: Core temperature/speed normal
[  899.540485] [Hardware Error]: Machine check events logged
---------------------------------------------------------------

Nothing else. Kernel is 3.5.4 userland now F17.

Regards.


  reply	other threads:[~2012-09-27 18:02 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-14 11:34 kernel BUG at fs/buffer.c:3205 (stable 3.5.3) Alexander Holler
2012-09-25 11:02 ` Dan Carpenter
2012-09-27 11:45   ` Alexander Holler
2012-09-27 15:12     ` Jan Kara
2012-09-27 15:46       ` Alexander Holler
2012-09-27 16:20         ` Alexander Holler
2012-09-27 18:01           ` Alexander Holler [this message]
2012-09-27 18:12             ` Alexander Holler
2012-09-27 20:05             ` Jan Kara
2012-09-28  8:09               ` Alexander Holler
2012-09-27 20:03         ` Jan Kara
2012-09-29 19:07           ` Alexander Holler
2012-10-01  9:10             ` Jan Kara
2012-10-01  9:21               ` Alexander Holler
2012-10-02  9:30                 ` Alexander Holler
2012-10-14  9:10                   ` Alexander Holler
2012-10-14 12:27                     ` Alan Cox
2012-10-15  8:46                       ` Alexander Holler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5064945C.4020403@ahsoftware.de \
    --to=holler@ahsoftware.de \
    --cc=dan.carpenter@oracle.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).