From: bugzilla-daemon@bugzilla.kernel.org
To: linux-f2fs-devel@lists.sourceforge.net
Subject: [f2fs-dev] [Bug 214009] New: Compression has no real effect in disk usage
Date: Mon, 09 Aug 2021 14:08:17 +0000	[thread overview]
Message-ID: <bug-214009-202145@https.bugzilla.kernel.org/> (raw)

https://bugzilla.kernel.org/show_bug.cgi?id=214009

            Bug ID: 214009
           Summary: Compression has no real effect in disk usage
           Product: File System
           Version: 2.5
    Kernel Version: 5.13
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: f2fs
          Assignee: filesystem_f2fs@kernel-bugs.kernel.org
          Reporter: bezirg@gmail.com
        Regression: No

I ran into problems measuring the used/free space
of an f2fs partition with transparent compression enabled, which leads me to
suspect that no transparent compression
is happening after all.

I begin by formatting a 1G partition as f2fs:

$ mkfs.f2fs -O extra_attr,compression /dev/sda2
F2FS-tools: mkfs.f2fs Ver: 1.14.0 (2020-08-24)
Info: Disable heap-based policy
Info: Debug level = 0
Info: Trim is enabled
Info: [/dev/sda2] Disk Model: nal USB 3.0
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 2097152 (1024 MB)
Info: zone aligned segment0 blkaddr: 512
Info: format version with
"Linux version 5.13.7-arch1-1 (linux@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU
Binutils) 2.36.1) #1 SMP PREEMPT Sat, 31 Jul 2021 13:18:52 +0000"
Info: [/dev/sda2] Discarding device
Info: This device doesn't support BLKSECDISCARD
Info: This device doesn't support BLKDISCARD
Info: Overprovision ratio = 6.360%
Info: Overprovision segments = 68 (GC reserved = 39)
Info: format successful

Let's mount with compression and see the usage/free space:

$ mount -o nodiscard,compress_algorithm=lz4,compress_extension=* /dev/sda2 /mnt/usb
$ cat /etc/mtab | grep /dev/sda2
/dev/sda2 /mnt/usb f2fs rw,lazytime,relatime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,mode=adaptive,active_logs=6,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_extension=*,compress_mode=fs 0 0

$ df -hT /mnt/usb

Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      f2fs 1022M  155M  868M  16% /mnt/usb

I understand that there is some initial overprovision overhead and some other
f2fs overhead that I am not aware of (total 155M).

Next, I create a 500M file, filled with zeroes.

$ dd bs=1M count=500 if=/dev/zero of=/tmp/empty

500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.196941 s, 2.7 GB/s

The file is highly compressible with lz4, down to around 2M, as you can see:

$ lz4 /tmp/empty
Compressed filename will be : empty.lz4
Compressed 524288000 bytes into 2057890 bytes ==> 0.39%

I then transfer the file over to the f2fs partition, hoping
that transparent compression will happen:

$ cp /tmp/empty /mnt/usb/empty

The sysfs claims that "transparent compression" happened during this copy:

$ cat /sys/fs/f2fs/sda2/{compr_new_inode,compr_saved_block,compr_written_block}
1
96000
32000
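
As a rough cross-check of these counters, the block counts can be converted
to MiB. This is only a sketch: it assumes an f2fs block size of 4 KiB, which
the report itself does not state, and the helper name blocks_to_mib is made
up for illustration.

```shell
#!/bin/sh
# Assumption: f2fs block size is 4 KiB (4096 bytes); not stated in the report.
BLOCK_SIZE=4096

# Convert a block count to whole MiB (integer arithmetic).
blocks_to_mib() {
    echo $(( $1 * BLOCK_SIZE / 1048576 ))
}

saved=96000     # compr_saved_block, as printed above
written=32000   # compr_written_block, as printed above

echo "saved:   $(blocks_to_mib "$saved") MiB"
echo "written: $(blocks_to_mib "$written") MiB"
echo "total:   $(blocks_to_mib $(( saved + written ))) MiB"
```

Under that assumption the output is 375 MiB saved, 125 MiB written and 500 MiB
in total, which matches the 524288000-byte file. The written share is exactly
one quarter, which would fit compress_log_size=2 (clusters of 4 blocks) with
each cluster shrinking to a single block. The counters alone do not say whether
the saved blocks are returned to free space.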

Yet at this point, the df program thinks otherwise:

$ df -hT /mnt/usb

Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      f2fs 1022M  655M  368M  65% /mnt/usb

As you can see, the reported used space increased by 500M (the previous 155M
plus the 500M file).
I would expect a usage increase of only around 2M, since that is the size the
file compresses to with lz4 at its default settings.

I already knew that the `du` program is not reliable for individual files and
that, in the case of btrfs, the `compsize` program should be used instead;
however, the `df` utility works just fine for btrfs with compression.
Let's assume that `df` is reliable for btrfs but is lying in the case of f2fs.
If I then copy the same 500M file into the f2fs partition a second time, the
partition runs out of space!

$ cp /tmp/empty /mnt/usb/empty2
cp: error writing '/mnt/usb/empty2': No space left on device

$ df -hT /mnt/usb
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      f2fs 1022M 1022M     0 100% /mnt/usb

The sysfs continues to report that compression happened for the second copy as well:

$ cat /sys/fs/f2fs/sda2/{compr_new_inode,compr_saved_block,compr_written_block}

2
166488
55496
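
To see how much data the second, failed copy accounts for, the two sysfs
readings can be subtracted. Again a sketch only, assuming 4 KiB blocks (not
stated in the report); counted_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: f2fs block size is 4 KiB (4096 bytes); not stated in the report.
BLOCK_SIZE=4096

# Bytes of file data covered by a counter pair (saved + written blocks).
counted_bytes() {
    echo $(( ($1 + $2) * BLOCK_SIZE ))
}

first=$(counted_bytes 96000 32000)      # reading after the first cp
second=$(counted_bytes 166488 55496)    # reading after the failed second cp

delta=$(( second - first ))
echo "second copy accounted for $delta bytes"
# sh has no floating-point arithmetic, so use awk to print MiB.
awk -v b="$delta" 'BEGIN { printf "%.3f MiB\n", b / 1048576 }'
```

With these numbers the delta is 384958464 bytes, or 367.125 MiB, which is
close to the 368M that df showed as available before the second copy. That
would be consistent with space being charged at the uncompressed size, though
the counters by themselves do not prove it.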

I see three possible cases happening:

1) The df program is reporting wrong usage/free space.
2) The f2fs sysfs is lying about performing compression.
3) f2fs "thinks" that there is no space left to allocate, although there is,
since the contents compress well.

This is tested and happens both for lz4 and zstd compression of f2fs.
Tested with:
mkfs.f2fs 1.14.0 (2020-08-24)
Linux 5.13.7

I was prompted to report this by other people running into a similar problem:

<https://www.reddit.com/r/filesystems/comments/ljzn7i/f2fs_compression_not_compressing>
<https://forums.gentoo.org/viewtopic-p-8485606.html?sid=e6384908dade712e3f8eaeeb7cf1242b>

