Writeback cache all used.

* Writeback cache all used.
       [not found] <1012241948.1268315.1680082721600.ref@mail.yahoo.com>
@ 2023-03-29  9:38 ` Adriano Silva
  2023-03-29 19:18   ` Eric Wheeler
  2023-03-30  4:55   ` Martin McClure
  0 siblings, 2 replies; 28+ messages in thread
From: Adriano Silva @ 2023-03-29  9:38 UTC (permalink / raw)
  To: Bcache Linux

Hey guys,

I'm using bcache to support Ceph. Ten Cluster nodes have a bcache device each consisting of an HDD block device and an NVMe cache. But I am noticing what I consider to be a problem: My cache is 100% used even though I still have 80% of the space available on my HDD.

It is true that there is more data written than would fit in the cache. However, I imagine that most of them should only be on the HDD and not in the cache, as they are cold data, almost never used.

I noticed that there was a significant drop in performance on the disks (writes) and went to check. Benchmark tests confirmed this. Then I noticed that there was 100% cache full and 85% cache evictable. There was a bit of dirty cache. I found an internet message talking about the garbage collector, so I tried the following:

echo 1 > /sys/block/bcache0/bcache/cache/internal/trigger_gc

That doesn't seem to have helped.

Then I collected the following data:

--- bcache ---
Device /dev/sdc (8:32)
UUID 38e81dff-a7c9-449f-9ddd-182128a19b69
Block Size 4.00KiB
Bucket Size 256.00KiB
Congested? False
Read Congestion 0.0ms
Write Congestion 0.0ms
Total Cache Size 553.31GiB
Total Cache Used 547.78GiB (99%)
Total Unused Cache 5.53GiB (1%)
Dirty Data 0B (0%)
Evictable Cache 503.52GiB (91%)
Replacement Policy [lru] fifo random
Cache Mode writethrough [writeback] writearound none
Total Hits 33361829 (99%)
Total Missions 185029
Total Bypass Hits 6203 (100%)
Total Bypass Misses 0
Total Bypassed 59.20MiB
--- Cache Device ---
   Device /dev/nvme0n1p1 (259:1)
   Size 553.31GiB
   Block Size 4.00KiB
   Bucket Size 256.00KiB
   Replacement Policy [lru] fifo random
   Discard? False
   I/O Errors 0
   Metadata Written 395.00GiB
   Data Written 1.50 TiB
   Buckets 2266376
   Cache Used 547.78GiB (99%)
   Cache Unused 5.53GiB (0%)
--- Backing Device ---
   Device /dev/sdc (8:32)
   Size 5.46TiB
   Cache Mode writethrough [writeback] writearound none
   Readhead
   Sequential Cutoff 0B
   Sequential merge? False
   state clean
   Writeback? true
   Dirty Data 0B
   Total Hits 32903077 (99%)
   Total Missions 185029
   Total Bypass Hits 6203 (100%)
   Total Bypass Misses 0
   Total Bypassed 59.20MiB

The dirty data has disappeared. But the cache remains 99% utilization, down just 1%. Already the evictable cache increased to 91%!

The impression I have is that this harms the write cache. That is, if I need to write again, the data goes straight to the HDD disks, as there is no space available in the Cache.

Shouldn't bcache remove the least used part of the cache?

Does anyone know why this isn't happening?

I may be talking nonsense, but isn't there a way to tell bcache to keep a write-free space rate in the cache automatically? Or even if it was manually by some command that I would trigger at low disk access times?

Thanks!

^ permalink raw reply	[flat|nested] 28+ messages in thread