All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Murphy <lists@colorremedies.com>
To: Leszek Dubiel <leszek@dubiel.pl>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: very slow "btrfs dev delete" 3x6Tb, 7Tb of data
Date: Fri, 3 Jan 2020 12:02:41 -0700	[thread overview]
Message-ID: <CAJCQCtSpFyfHAWVth5PuvjJtiHPfN52WzspOdsvLrJxMdbcirw@mail.gmail.com> (raw)
In-Reply-To: <7c82851a-21cf-2a66-8d1c-42d57ca0538f@dubiel.pl>

On Fri, Jan 3, 2020 at 7:39 AM Leszek Dubiel <leszek@dubiel.pl> wrote:
>
> ** number of files by given size
>
> root@wawel:/mnt/root/orion# cat disk_usage | perl -MData::Dumper -e
> '$Data::Dumper::Sortkeys = 1; while (<>) { chomp; my ($byt, $nam) =
> split /\t/, $_, -1; if (index("$las/", $nam) == 0) { $dir++; } else {
> $filtot++; for $p (1 .. 99) { if ($byt < 10 ** $p) { $fil{"num of files
> size <10^$p"}++; last; } } }; $las = $nam; }; print "\ndirectories:
> $dir\ntotal num of files: $filtot\n", "\nnumber of files grouped by
> size: \n", Dumper(\%fil) '
>
> directories: 1314246
> total num of files: 10123960
>
> number of files grouped by size:
> $VAR1 = {
>            'num of files size <10^1' => 3325886,
>            'num of files size <10^2' => 3709276,
>            'num of files size <10^3' => 789852,
>            'num of files size <10^4' => 1085927,
>            'num of files size <10^5' => 650571,
>            'num of files size <10^6' => 438717,
>            'num of files size <10^7' => 116757,
>            'num of files size <10^8' => 6638,
>            'num of files size <10^9' => 323
>            'num of files size <10^10' => 13,

Is that really ~7.8 million files at or less than 1KiB?? (totalling
the first three)

Compression may not do much with such small files, and also I'm not
sure which algorithm would do the best job. They all probably want a
lot more than 1KiB to become efficient.

But nodesize 64KiB might be a big deal...worth testing.




-- 
Chris Murphy

  reply	other threads:[~2020-01-03 19:03 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-25 22:35 very slow "btrfs dev delete" 3x6Tb, 7Tb of data Leszek Dubiel
2019-12-26  5:08 ` Qu Wenruo
2019-12-26 13:17   ` Leszek Dubiel
2019-12-26 13:44     ` Remi Gauvin
2019-12-26 14:05       ` Leszek Dubiel
2019-12-26 14:21         ` Remi Gauvin
2019-12-26 15:42           ` Leszek Dubiel
2019-12-26 22:40         ` Chris Murphy
2019-12-26 22:58           ` Leszek Dubiel
2019-12-28 17:04             ` Leszek Dubiel
2019-12-28 20:23               ` Zygo Blaxell
2020-01-02 18:37                 ` Leszek Dubiel
2020-01-02 21:57                   ` Chris Murphy
2020-01-02 22:39                     ` Leszek Dubiel
2020-01-02 23:22                       ` Chris Murphy
2020-01-03  9:08                         ` Leszek Dubiel
2020-01-03 19:15                           ` Chris Murphy
2020-01-03 14:39                         ` Leszek Dubiel
2020-01-03 19:02                           ` Chris Murphy [this message]
2020-01-03 20:59                             ` Leszek Dubiel
2020-01-04  5:38                         ` Zygo Blaxell
2020-01-07 18:44                           ` write amplification, was: " Chris Murphy
2020-01-07 19:26                             ` Holger Hoffstätte
2020-01-07 23:32                             ` Zygo Blaxell
2020-01-07 23:53                               ` Chris Murphy
2020-01-08  1:41                                 ` Zygo Blaxell
2020-01-08  2:54                                   ` Chris Murphy
2020-01-06 11:14                     ` Leszek Dubiel
2020-01-07  0:21                       ` Chris Murphy
2020-01-07  7:09                         ` Leszek Dubiel
2019-12-26 22:15 ` Chris Murphy
2019-12-26 22:48   ` Leszek Dubiel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJCQCtSpFyfHAWVth5PuvjJtiHPfN52WzspOdsvLrJxMdbcirw@mail.gmail.com \
    --to=lists@colorremedies.com \
    --cc=leszek@dubiel.pl \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.