Gui,

On Thursday, 28 August 2014, 10:28:02, Gui Hecheng wrote:
> On Mon, 2014-08-25 at 05:08 +0000, Zooko Wilcox-OHearn wrote:
> > Aha. When it is run under valgrind it consistently stops (killing
> > valgrind, in fact!) in the same way on every run.
> >
> > Here's the tail of stdout and stderr when it aborted when run under
> > valgrind:
> >
> > Restoring
> > ./sda6-btrfs-restore-3/@home/zooko/.mozilla/firefox/ltjwtkwe.ketotic.org/
> > thumbnails/188888af64f6d2871b0f24e325d8a298.png Restoring
> > ./sda6-btrfs-restofailed to inflate: -6
> >
> > Full valgrind outputs from such a run is attached to this letter.
> >
> > I've spent a little time looking at the stack traces in the valgrind
> > log, and I *guess* that there is corruption such that the
> > decompression fails, and I guess it would be possible to make
> > cmds-restore handle corrupted compressedtext better, so that it would
> > end up skipping whatever files and directories were unrestorable due
> > to corruption. However, I don't immediately see how to proceed.
> >
> > Regards,
>
> Hi Zooko,
> Here are some pieces for your information:
>
> For the first:
> ==5569== Syscall param pwrite64(buf) points to uninitialised byte(s)
> ==5569==    at 0x56ABD03: __pwrite_nocancel (syscall-template.S:81)
> ==5569==    by 0x41F346: search_dir (cmds-restore.c:392)
>
> It is handled by
> https://patchwork.kernel.org/patch/4755441/
>
> For the second:
> ==5569== Invalid read of size 1
> ==5569==    at 0x4C2F95E: memcpy@@GLIBC_2.14
> ==5569==    by 0x4388E6: read_extent_buffer (string3.h:51)
> ==5569==    by 0x41ED6C: search_dir (cmds-restore.c:233)
>
> It should be handled by
> https://patchwork.kernel.org/patch/4792381/
> And it handles Marc's similar problem too.

I can confirm that this patch really cures these memleaks, but ...

>
> And for the last one and the crucial one...
> ==5569== Invalid read of size 4
> ==5569==    at 0x41E394: decompress (cmds-restore.c:93)
> ==5569==    by 0x41F291: search_dir (cmds-restore.c:378)
> along with
> ==5569== Invalid read of size 1
> ==5569==    at 0x548DDB6: lzo1x_decompress_safe
> ==5569==    by 0x41E3BD: decompress (cmds-restore.c:122)
> ==5569==    by 0x41F291: search_dir (cmds-restore.c:378)
>
> Sorry, I'm not able to reproduce it yet, it may be just what you've
> guessed that corruption happens. But I am sure that there are bugs
> around the decompress routine, because I've got "failed to inflate"s too
> with a non-corrupted btrfs. I'm going to track it down.

This one still exists. It took me a while to reproduce it (actually, to find
the file which causes it). So we have:

==27292== Invalid read of size 8
==27292==    at 0x57A10D2: lzo1x_decompress_safe (in /usr/lib64/liblzo2.so.2.0.0)
==27292==    by 0x41E9ED: decompress (cmds-restore.c:129)
==27292==    by 0x41F8A7: search_dir (cmds-restore.c:386)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x420C6F: cmd_restore (cmds-restore.c:1319)
==27292==    by 0x4042FC: main (btrfs.c:247)
==27292==  Address 0x6280afc is 24,572 bytes inside a block of size 24,576 alloc'd
==27292==    at 0x4C277AB: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==27292==    by 0x41F577: search_dir (cmds-restore.c:317)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x41FFE6: search_dir (cmds-restore.c:916)
==27292==    by 0x420C6F: cmd_restore (cmds-restore.c:1319)
==27292==    by 0x4042FC: main (btrfs.c:247)
==27292==
==27292== (action on error) vgdb me ...

And the debug backtrace is (the full bt is attached):

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000057a10d2 in lzo1x_decompress_safe () from /usr/lib64/liblzo2.so.2
(gdb) bt
#0  0x00000000057a10d2 in lzo1x_decompress_safe () from /usr/lib64/liblzo2.so.2
#1  0x000000000041e9ee in decompress_lzo (decompress_len=0x7feff9f60,
    compress_len=417,
    outbuf=0x63229a0 "ource/core/dom/webcore_dom.StaticNodeList.o",
    inbuf=0x6280a6d "\017ource/core/dom/webl\001") at cmds-restore.c:129
#2  decompress (inbuf=inbuf@entry=0x627ab00 "zU\001", outbuf=outbuf@entry=0x631a9a0 ", leaf=0x5fb58d0, fd=4, root=0x61405c0) at cmds-restore.c:386
#4  copy_file (file=0x66a700 "/work/chromium/src/out/Release/.ninja_deps",
    key=0x7feffb080, fd=4, root=0x61405c0) at cmds-restore.c:659
#5  search_dir (root=root@entry=0x61405c0, key=key@entry=0x7feffc2d0,
    output_rootdir=output_rootdir@entry=0x7fefffdb0 "/work",
    in_dir=in_dir@entry=0x6602d70 "/chromium/src/out/Release",
    mreg=mreg@entry=0x7fefffd60) at cmds-restore.c:840
#6  0x000000000041ffe7 in search_dir (root=root@entry=0x61405c0,
    key=key@entry=0x7feffd520,
    output_rootdir=output_rootdir@entry=0x7fefffdb0 "/work",
    in_dir=in_dir@entry=0x6df4d90 "/chromium/src/out",
    mreg=mreg@entry=0x7fefffd60) at cmds-restore.c:916
#7  0x000000000041ffe7 in search_dir (root=root@entry=0x61405c0,
    key=key@entry=0x7feffe770,
    output_rootdir=output_rootdir@entry=0x7fefffdb0 "/work",
    in_dir=in_dir@entry=0x65d7080 "/chromium/src",
    mreg=mreg@entry=0x7fefffd60) at cmds-restore.c:916
#8  0x000000000041ffe7 in search_dir (root=root@entry=0x61405c0,
    key=key@entry=0x7fefff9c0,
    output_rootdir=output_rootdir@entry=0x7fefffdb0 "/work",
    in_dir=in_dir@entry=0x6f35ac0 "/chromium",
    mreg=mreg@entry=0x7fefffd60) at cmds-restore.c:916
#9  0x000000000041ffe7 in search_dir (root=root@entry=0x61405c0,
    key=key@entry=0x7fefffe30,
    output_rootdir=output_rootdir@entry=0x7fefffdb0 "/work",
    in_dir=in_dir@entry=0x45ab43 "",
    mreg=mreg@entry=0x7fefffd60) at cmds-restore.c:916
#10 0x0000000000420c70 in cmd_restore (argc=, argv=) at cmds-restore.c:1319
#11 0x00000000004042fd in main (argc=8, argv=0x7feffffa0) at btrfs.c:247

Hope that helps

Marc
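
A note on the numbers in the valgrind and gdb output above, as a rough sketch rather than a diagnosis. It assumes the 24,576-byte block valgrind describes is the inbuf passed to decompress() in frame #2 (0x627ab00); the reported offset of 24,572 bytes agrees with that. The helper overrun_bytes() and the function check_report_numbers() are hypothetical names used only for illustration and are not part of btrfs-progs. Under that assumption, the 8-byte read ends 4 bytes past the block, and the 417-byte segment handed to decompress_lzo() in frame #1 would end 270 bytes past it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes by which an access of `len` bytes at address `addr`
 * runs past the end of a block starting at `start` with `size` bytes.
 * Returns 0 when the access ends inside the block.
 * Hypothetical helper for illustration only.
 */
uint64_t overrun_bytes(uint64_t start, uint64_t size,
                       uint64_t addr, uint64_t len)
{
    uint64_t block_end = start + size;
    uint64_t access_end = addr + len;

    return access_end > block_end ? access_end - block_end : 0;
}

/* Worked check using only numbers that appear in the report. */
void check_report_numbers(void)
{
    /* Assumption: the malloc'd block is decompress()'s inbuf (frame #2). */
    const uint64_t inbuf = 0x627ab00;
    const uint64_t block_size = 24576;  /* block size reported by valgrind */

    /* valgrind: 0x6280afc is 24,572 bytes inside the block. */
    assert(0x6280afc - inbuf == 24572);

    /* The invalid 8-byte read ends 4 bytes past the end of the block. */
    assert(overrun_bytes(inbuf, block_size, 0x6280afc, 8) == 4);

    /* Frame #1: decompress_lzo(inbuf=0x6280a6d, compress_len=417). */
    assert(overrun_bytes(inbuf, block_size, 0x6280a6d, 417) == 270);
}
```

If the assumption holds, this would point at the segment length read from the compressed stream being larger than what is left of the input buffer, which may be worth checking before the call into lzo1x_decompress_safe().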