From: "J. R. Okajima"
Subject: Q. cache in squashfs?
Date: Thu, 24 Jun 2010 11:37:46 +0900
Message-ID: <19486.1277347066@jrobl>
Cc: linux-fsdevel@vger.kernel.org
To: phillip@lougher.demon.co.uk

Hello Phillip,

I've found an interesting issue with squashfs. Please give me some
guidance or advice.

In short:
Why does squashfs read and decompress the same block several times?
Is a nested fs-image always better for squashfs?

Long:
I created two squashfs images.
- from /bin directly, by mksquashfs
  $ mksquashfs /bin /tmp/a.img
- from a single ext3 fs image which contains /bin
  $ dd if=/dev/zero of=/tmp/ext3/img bs=... count=...
  $ mkfs -t ext3 -F -m 0 -T small -O dir_index /tmp/ext3/img
  $ sudo mount -o loop /tmp/ext3/img /mnt
  $ tar -C /bin -cf - . | tar -C /mnt -xpf -
  $ sudo umount /mnt
  $ mksquashfs /tmp/ext3/img /tmp/b.img

Of course, /tmp/b.img is bigger than /tmp/a.img. That is fine.

For these squashfs images, I profiled random file reads over the whole fs.
  $ find /squashfs -type f > /tmp/l
  $ seq 10 | time sh -c "while read i; do rl /tmp/l | xargs -r cat & done > /dev/null; wait"
("rl" is a command to randomize lines)

For b.img, I have to loopback-mount twice.
  $ mount -o ro,loop /tmp/b.img /tmp/sq
  $ mount -o ro,loop /tmp/sq/img /mnt

Honestly speaking, I guessed b.img would be slower due to the nested-fs
overhead. But the results show that b.img (ext3 within squashfs) consumes
fewer CPU cycles and is faster.

- a.img (plain squashfs)
0.00user 0.14system 0:00.09elapsed 151%CPU (0avgtext+0avgdata 2192maxresident)k
0inputs+0outputs (0major+6184minor)pagefaults 0swaps

(oprofile report)
samples  %        image name       app name        symbol name
710      53.9514  zlib_inflate.ko  zlib_inflate    inflate_fast
123       9.3465  libc-2.7.so      libc-2.7.so     (no symbols)
119       9.0426  zlib_inflate.ko  zlib_inflate    zlib_adler32
106       8.0547  zlib_inflate.ko  zlib_inflate    zlib_inflate
 95       7.2188  ld-2.7.so        ld-2.7.so       (no symbols)
 64       4.8632  oprofiled        oprofiled       (no symbols)
 36       2.7356  dash             dash            (no symbols)

- b.img (ext3 + squashfs)
0.00user 0.01system 0:00.06elapsed 22%CPU (0avgtext+0avgdata 2192maxresident)k
0inputs+0outputs (0major+6134minor)pagefaults 0swaps

samples  %        image name       app name        symbol name
268      37.0678  zlib_inflate.ko  zlib_inflate    inflate_fast
126      17.4274  libc-2.7.so      libc-2.7.so     (no symbols)
106      14.6611  ld-2.7.so        ld-2.7.so       (no symbols)
 57       7.8838  zlib_inflate.ko  zlib_inflate    zlib_adler32
 45       6.2241  oprofiled        oprofiled       (no symbols)
 40       5.5325  dash             dash            (no symbols)
 33       4.5643  zlib_inflate.ko  zlib_inflate    zlib_inflate

The biggest difference is the time spent decompressing blocks. (Since /bin
is used for this sample, the difference is not so big. But if I use another
dir which has many more files than /bin, the difference grows too.)
I don't think the difference in fs layout or metadata is the problem.
Actually, after inserting debug prints to show the block index in
squashfs_read_data(), I can see that squashfs reads the same block
multiple times from a.img.

int squashfs_read_data(struct super_block *sb, void **buffer, u64 index,
        int length, u64 *next_index, int srclength, int pages)
{
        :::
        // for a datablock
        for (b = 0; bytes < length; b++, cur_index++) {
                bh[b] = sb_getblk(sb, cur_index);
+               pr_info("%llu\n", cur_index);
                if (bh[b] == NULL)
                        goto block_release;
                bytes += msblk->devblksize;
        }
        ll_rw_block(READ, b, bh);
        :::
        // for metadata
        for (; bytes < length; b++) {
                bh[b] = sb_getblk(sb, ++cur_index);
+               pr_info("%llu\n", cur_index);
                if (bh[b] == NULL)
                        goto block_release;
                bytes += msblk->devblksize;
        }
        ll_rw_block(READ, b - 1, bh + 1);
        :::
}
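As a rough way to count the duplicates from that output (just a sketch; it
assumes the pr_info() format string above is changed to a greppable tag such
as "sqsh_blk %llu\n" so the lines can be picked out of the kernel log):

  $ dmesg | grep sqsh_blk | awk '{ n[$NF]++ } END { for (b in n) print n[b], b }' | sort -rn | head

Each output line is "<count> <block index>", so the block indices that are
requested over and over float to the top.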
In the case of b.img, the same block is read several times too, but the
number of times is much smaller than with a.img.

I am interested in where the difference comes from. Do you think the
loopback block device in the middle caches the decompressed blocks
effectively?

- a.img
  squashfs
  + loop0
    + disk

- b.img
  ext3
  + loop1        <-- so effective?
    + squashfs
      + loop0
        + disk

In other words, is inserting a loopback mount always effective for all
squashfs images?

Thanks for reading this long mail.

J. R. Okajima