* dm-cache coherence issue
@ 2017-06-24 13:56 Johannes Bauer
  2017-06-24 18:21 ` Johannes Bauer
  2017-06-26 11:33 ` Joe Thornber
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Bauer @ 2017-06-24 13:56 UTC
  To: device-mapper development

Hello list,

I hope this is the correct place to ask my question. If not, I'd
appreciate a quick word where to better ask this and I'll be on my way.

I've set up a dm-cache configuration and am trying to understand the
coherence/consistency between the origin device and the cached device. For
this, I have set up a small test case with an 8 GiB "origin"
loopback device and a 1 GiB "cache/metadata" device:

dmsetup create TEST-dirty --table '0 2086912 linear /dev/loop1 0'
dmsetup create TEST-meta --table '0 10240 linear /dev/loop1 2086912'
dmsetup create TEST-device --table '0 16777216 cache /dev/mapper/TEST-meta /dev/mapper/TEST-dirty /dev/loop0 512 1 writeback default 0'
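
For reference, the fields of the cache table line above can be spelled out with a small helper. This is only a sketch: the function name cache_table is made up for illustration, the field order (metadata device, cache device, origin device, block size in sectors, feature count and features, policy, policy argument count) is assumed from the kernel's dm-cache documentation, and the helper prints the table string instead of calling dmsetup.

```sh
# Hypothetical helper: print a dm-cache table line (it does not call dmsetup).
# Usage: cache_table <origin_sectors> <meta_dev> <cache_dev> <origin_dev> \
#                    <block_sectors> <policy> [feature...]
# The body runs in a subshell "( ... )" so its variables stay local.
cache_table() (
    sectors=$1 meta=$2 cache=$3 origin=$4 block=$5 policy=$6
    shift 6
    # The remaining arguments are feature flags; $# is their count.
    line="0 $sectors cache $meta $cache $origin $block $#"
    for feature in "$@"; do
        line="$line $feature"
    done
    # This sketch always passes zero policy arguments.
    printf '%s %s 0\n' "$line" "$policy"
)

# Reproduces the writeback table used above:
cache_table 16777216 /dev/mapper/TEST-meta /dev/mapper/TEST-dirty /dev/loop0 512 default writeback
```

The printed string could then be passed to "dmsetup create ... --table".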

Then I calculate the CRC32 of /dev/loop0 (the origin device) and of the
cached device (/dev/mapper/TEST-device). In the current state (dirty pages!)
they differ:

./fast_crc32 -d /dev/loop0 /dev/mapper/TEST-device
Will also calculate CRC of block devices.
/dev/loop0 c2b7d8fd
/dev/mapper/TEST-device f34cf77a


Infos about the state:

Cache device size  : 8.00 GiB
Metadata block size: 4.00 kiB
Metadata usage     : 88.0 kiB / 5.00 MiB
Cache block size   : 256 kiB
Cache usage        : 807 MiB / 1019 MiB
Read hitrate       : 1.7% (34041 of 1984328)
Write hitrate      : 7.9% (2225 of 28312)
Demotions          : 0
Promotions         : 1050
Dirty              : 512 kiB (2 blocks)
Policy             : smq
Features           : writeback
Core arguments     : migration_threshold = 2048
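
The sizes in this report can be cross-checked against the sector counts in the tables, assuming the usual 512-byte device-mapper sector. A minimal sketch (the helper name sectors_to_kib is made up):

```sh
# Convert a count of 512-byte sectors to KiB (two sectors per KiB).
sectors_to_kib() {
    echo $(( $1 / 2 ))
}

sectors_to_kib 16777216   # origin length:     8388608 KiB = 8 GiB
sectors_to_kib 2086912    # cache data device: 1043456 KiB = 1019 MiB
sectors_to_kib 10240      # metadata device:   5120 KiB    = 5 MiB
sectors_to_kib 512        # cache block size:  256 KiB
```

These agree with the "Cache device size", "Cache usage", "Metadata usage" and "Cache block size" lines above.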

This is expected so far. Now I try to completely flush/decommission the
cache:

dmsetup suspend TEST-device
dmsetup reload TEST-device --table '0 16777216 cache 253:5 253:4 7:0 512 0 cleaner 0'
dmsetup resume TEST-device
dmsetup wait TEST-device

Checking the state, all dirty pages are flushed:

Cache device size  : 8.00 GiB
Metadata block size: 4.00 kiB
Metadata usage     : 88.0 kiB / 5.00 MiB
Cache block size   : 256 kiB
Cache usage        : 807 MiB / 1019 MiB
Read hitrate       : 2.0% (40539 of 2049906)
Write hitrate      : 7.9% (2225 of 28312)
Demotions          : 0
Promotions         : 0
Dirty              : 0 bytes (0 blocks)
Policy             : cleaner
Features           : writeback
Core arguments     : migration_threshold = 2048

However, the checksums of origin and cached device STILL differ!

./fast_crc32 -d /dev/loop0 /dev/mapper/TEST-device
Will also calculate CRC of block devices.
/dev/loop0 c2b7d8fd
/dev/mapper/TEST-device f34cf77a

When I remove the TEST-device, however:

dmsetup remove TEST-device

Then the device is synchronized:

./fast_crc32 -d /dev/loop0
Will also calculate CRC of block devices.
/dev/loop0 f34cf77a

So I seem to have a very basic misunderstanding of what the cleaner
policy/dirty pages mean. Is there a way to force the cache to flush
entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.

Also, I've encountered a couple of times now that after switching to the
"cleaner" policy, the "dmsetup wait" call hangs -- even though there are
definitely no hanging open I/O dependencies (no FS on these devices,
purely for testing). Why would this happen?

I'm using 4.10.6 on x86_64, BTW.

Any help greatly appreciated.
Best regards,
Johannes


* Re: dm-cache coherence issue
  2017-06-24 13:56 dm-cache coherence issue Johannes Bauer
@ 2017-06-24 18:21 ` Johannes Bauer
  2017-06-26 11:33 ` Joe Thornber
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Bauer @ 2017-06-24 18:21 UTC
  To: device-mapper development

On 24.06.2017 15:56, Johannes Bauer wrote:

> So I seem to have a very basic misunderstanding of what the cleaner
> policy/dirty pages mean. Is there a way to force the cache to flush
> entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.

I'd like to expand on this, since I discovered something that worries me
a bit just now:

I do have a dm-cache setup for my main drive as well (3 TB HDD cached by
128 GB SSD). On top of dm-cache runs dm-crypt (LUKS). When playing
around with my root fs, I tried to discard the whole cache. Therefore, I
did -- with the LUKS container NOT open:

dmsetup suspend cache-device
dmsetup reload cache-device --table '0 5858433935 cache 253:1 253:0 8:19 512 0 cleaner 0'
dmsetup resume cache-device
dmsetup wait cache-device

Again, "dmsetup wait" hung, with no obvious I/O being done (i.e., all
disks were idle). I Ctrl-Ced out of it. The status showed no dirty pages.

Since I was suspicious, I left the device with the "cleaner" policy and
opened the LUKS container, then did an e2fsck -fn on the opened device.
No errors found.

Then I closed the LUKS container and performed "dmsetup remove" on the
cache device (as well as the two other linear mappings).

Then I re-opened LUKS on the origin device and ran e2fsck -fn there.
File system errors!

I am fully prepared to restore everything from backup. However, is this
normal behavior or an issue? How am I supposed to discard an attached
cache? How do I get the origin device back in sync with the cached device?

Best regards,
Johannes


* Re: dm-cache coherence issue
  2017-06-24 13:56 dm-cache coherence issue Johannes Bauer
  2017-06-24 18:21 ` Johannes Bauer
@ 2017-06-26 11:33 ` Joe Thornber
  2017-06-26 15:58   ` Joe Thornber
  1 sibling, 1 reply; 9+ messages in thread
From: Joe Thornber @ 2017-06-26 11:33 UTC
  To: Johannes Bauer; +Cc: device-mapper development

On Sat, Jun 24, 2017 at 03:56:54PM +0200, Johannes Bauer wrote:
> So I seem to have a very basic misunderstanding of what the cleaner
> policy/dirty pages mean. Is there a way to force the cache to flush
> entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.

Your understanding is correct.  There is a tool in the latest thinp
tools release called cache_writeback that does offline decommissioning
of a cache.

I'll try and reproduce your scenario and get back to you.

- Joe


* Re: dm-cache coherence issue
  2017-06-26 11:33 ` Joe Thornber
@ 2017-06-26 15:58   ` Joe Thornber
  2017-06-26 19:08     ` Johannes Bauer
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Thornber @ 2017-06-26 15:58 UTC
  To: Johannes Bauer, device-mapper development

On Mon, Jun 26, 2017 at 12:33:42PM +0100, Joe Thornber wrote:
> On Sat, Jun 24, 2017 at 03:56:54PM +0200, Johannes Bauer wrote:
> > So I seem to have a very basic misunderstanding of what the cleaner
> > policy/dirty pages mean. Is there a way to force the cache to flush
> > entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.
> 
> I'll try and reproduce your scenario and get back to you.

Here's a similar scenario that I've added to the dm test suite:

https://github.com/jthornber/device-mapper-test-suite/commit/457e889b0c4d510609c0d7464af07f2ebee20768

It goes through all the steps you need to use to decommission a cache
with dirty blocks.  Namely:

- switch to writethrough mode (so new io can't create new dirty blocks)
- switch to the cleaner policy
- wait for clean
- remove the cache device before accessing the origin directly
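
As a rough illustration, the four steps above could be laid out as the following command sequence. This is a dry-run sketch, not the test suite's code: the function name is made up, it only prints commands, the table strings follow the format used earlier in this thread, and doing the writethrough and cleaner switches as two separate reloads is an assumption taken from the order of the list.

```sh
# Hypothetical dry-run helper: print (do not execute) the decommissioning steps.
# Usage: print_decommission_plan <name> <origin_sectors> <meta_dev> <cache_dev> \
#                                <origin_dev> <block_sectors>
print_decommission_plan() (
    name=$1 sectors=$2 meta=$3 cache=$4 origin=$5 block=$6
    base="0 $sectors cache $meta $cache $origin $block"

    # Print a suspend/reload/resume cycle with the given table tail.
    reload() {
        echo "dmsetup suspend $name"
        echo "dmsetup reload $name --table '$base $1'"
        echo "dmsetup resume $name"
    }

    # 1. Switch to writethrough so new I/O cannot create new dirty blocks.
    reload "1 writethrough default 0"
    # 2. Switch to the cleaner policy, keeping writethrough.
    reload "1 writethrough cleaner 0"
    # 3. Wait for clean.
    echo "# poll 'dmsetup status $name' until the dirty block count is 0"
    # 4. Remove the cache device before touching the origin directly.
    echo "dmsetup remove $name"
)

print_decommission_plan TEST-device 16777216 253:5 253:4 7:0 512
```

Review the printed commands before running any of them by hand; the linked test suite commit remains the authoritative version of the procedure.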

I've run it with a recent kernel, and one from about a year ago and it
passes fine.

- Joe


* Re: dm-cache coherence issue
  2017-06-26 15:58   ` Joe Thornber
@ 2017-06-26 19:08     ` Johannes Bauer
  2017-06-26 19:56       ` Mike Snitzer
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Bauer @ 2017-06-26 19:08 UTC
  To: dm-devel; +Cc: thornber

Hi Joe,

On 26.06.2017 17:58, Joe Thornber wrote:
> On Mon, Jun 26, 2017 at 12:33:42PM +0100, Joe Thornber wrote:
>> On Sat, Jun 24, 2017 at 03:56:54PM +0200, Johannes Bauer wrote:
>>> So I seem to have a very basic misunderstanding of what the cleaner
>>> policy/dirty pages mean. Is there a way to force the cache to flush
>>> entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.
>>
>> I'll try and reproduce your scenario and get back to you.
> 
> Here's a similar scenario that I've added to the dm test suite:
> 
> https://github.com/jthornber/device-mapper-test-suite/commit/457e889b0c4d510609c0d7464af07f2ebee20768
> 
> It goes through all the steps you need to use to decommission a cache
> with dirty blocks.  Namely:
> 
> - switch to writethrough mode (so new io can't create new dirty blocks)
> - switch to the cleaner policy
> - wait for clean
> - remove the cache device before accessing the origin directly

Interesting, I did *not* change to writethrough. However, there
shouldn't have been any I/O on the device (it was not accessed by
anything after I switched to the cleaner policy).

On the advice of Zdenek Kabelac, who messaged me off-list, I therefore
have indeed been convinced that using dm-cache "by hand" maybe is too
dangerous (i.e., I had not accounted for "repairing" of a cache and am
really still unsure how LVM does that) and that lvmcache is a better
choice -- even though Ubuntu supports it really crappily for root devices:
https://bugs.launchpad.net/ubuntu/+source/lvm2/+bug/1423796 a shame,
it's been like this for over two years.

Anyways, I'll try to replicate my scenario again because I'm actually
quite sure that I did everything correctly (I did it a few times).

Thanks for your help,
Johannes


* Re: dm-cache coherence issue
  2017-06-26 19:08     ` Johannes Bauer
@ 2017-06-26 19:56       ` Mike Snitzer
  2017-06-26 20:36         ` Johannes Bauer
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Snitzer @ 2017-06-26 19:56 UTC
  To: Johannes Bauer; +Cc: dm-devel, thornber

On Mon, Jun 26 2017 at  3:08pm -0400,
Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:

> Hi Joe,
> 
> On 26.06.2017 17:58, Joe Thornber wrote:
> > On Mon, Jun 26, 2017 at 12:33:42PM +0100, Joe Thornber wrote:
> >> On Sat, Jun 24, 2017 at 03:56:54PM +0200, Johannes Bauer wrote:
> >>> So I seem to have a very basic misunderstanding of what the cleaner
> >>> policy/dirty pages mean. Is there a way to force the cache to flush
> >>> entirely? Apparently, "dmsetup wait" and/or "sync" don't do the job.
> >>
> >> I'll try and reproduce your scenario and get back to you.
> > 
> > Here's a similar scenario that I've added to the dm test suite:
> > 
> > https://github.com/jthornber/device-mapper-test-suite/commit/457e889b0c4d510609c0d7464af07f2ebee20768
> > 
> > It goes through all the steps you need to use to decommission a cache
> > with dirty blocks.  Namely:
> > 
> > - switch to writethrough mode (so new io can't create new dirty blocks)
> > - switch to the cleaner policy
> > - wait for clean
> > - remove the cache device before accessing the origin directly
> 
> Interesting, I did *not* change to writethrough. However, there
> shouldn't have been any I/O on the device (it was not accessed by
> anything after I switched to the cleaner policy).
> 
> On the advice of Zdenek Kabelac, who messaged me off-list, I therefore
> have indeed been convinced that using dm-cache "by hand" maybe is too
> dangerous (i.e., I had not accounted for "repairing" of a cache and am
> really still unsure what LVM does that) and that lvmcache is a better
> choice -- even though Ubuntu supports it really crappily for root devices:
> https://bugs.launchpad.net/ubuntu/+source/lvm2/+bug/1423796 a shame,
> it's been like this for over two years.
> 
> Anyways, I'll try to replicate my scenario again because I'm actually
> quite sure that I did everything correctly (I did it a few times).

Except you didn't first switch to writethrough -- which is _not_
correct.


* Re: dm-cache coherence issue
  2017-06-26 19:56       ` Mike Snitzer
@ 2017-06-26 20:36         ` Johannes Bauer
  2017-06-26 21:34           ` Mike Snitzer
  2017-06-27  9:44           ` Joe Thornber
  0 siblings, 2 replies; 9+ messages in thread
From: Johannes Bauer @ 2017-06-26 20:36 UTC
  To: Mike Snitzer; +Cc: dm-devel, thornber

On 26.06.2017 21:56, Mike Snitzer wrote:

>> Interesting, I did *not* change to writethrough. However, there
>> shouldn't have been any I/O on the device (it was not accessed by
>> anything after I switched to the cleaner policy).
[...]
>> Anyways, I'll try to replicate my scenario again because I'm actually
>> quite sure that I did everything correctly (I did it a few times).
> 
> Except you didn't first switch to writethrough -- which is _not_
> correct.

Absolutely, very good to know. So even without any I/O being requested,
dm-cache is allowed to "hold back" pages as long as the dm-cache device
is in writeback mode? Would this also explain why the "dmsetup wait"
hung indefinitely?

I do think I followed a tutorial that I found on the net regarding this.
Scary that such a crucial fact is missing there. The fact that dirty
pages are reported as zero just gives the impression that everything is
coherent, when in fact it's not.

Regardless, I find this and your explanation extremely interesting and
want to thank you for clearing this up. Very fascinating topic indeed.

Best regards,
Johannes


* Re: dm-cache coherence issue
  2017-06-26 20:36         ` Johannes Bauer
@ 2017-06-26 21:34           ` Mike Snitzer
  2017-06-27  9:44           ` Joe Thornber
  1 sibling, 0 replies; 9+ messages in thread
From: Mike Snitzer @ 2017-06-26 21:34 UTC
  To: Johannes Bauer; +Cc: dm-devel, thornber

On Mon, Jun 26 2017 at  4:36pm -0400,
Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:

> On 26.06.2017 21:56, Mike Snitzer wrote:
> 
> >> Interesting, I did *not* change to writethrough. However, there
> >> shouldn't have been any I/O on the device (it was not accessed by
> >> anything after I switched to the cleaner policy).
> [...]
> >> Anyways, I'll try to replicate my scenario again because I'm actually
> >> quite sure that I did everything correctly (I did it a few times).
> > 
> > Except you didn't first switch to writethrough -- which is _not_
> > correct.
> 
> Absolutely, very good to know. So even without any I/O being requested,
> dm-cache is allowed to "hold back" pages as long as the dm-cache device
> is in writeback mode?

s/pages/blocks/

The "dmsetup status" output for a DM cache device is showing dirty
accounting is in terms of cache blocks.
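
To make the block accounting concrete, here is a sketch that converts the dirty block count from a status line into bytes. The helper name and the sample line are made up for illustration, and the field positions (field 6 for the cache block size in sectors, field 14 for the dirty block count) are assumed from the status format in the kernel's dm-cache documentation.

```sh
# Hypothetical helper: read one "dmsetup status" line for a cache target on
# stdin and print the dirty amount in bytes.
# Assumed fields: $6 = cache block size in 512-byte sectors, $14 = dirty blocks.
dirty_bytes() {
    awk '$3 == "cache" { print $14 * $6 * 512 }'
}

# Made-up status line; only fields 3, 6 and 14 matter for this example.
echo "0 16777216 cache 8 22/1280 512 3228/4076 100 200 300 400 0 1050 2 1 writeback 2 migration_threshold 2048 smq 0 rw -" | dirty_bytes
# prints 524288, i.e. 2 blocks of 256 KiB = 512 KiB
```

On a live system the input would come from "dmsetup status <device>".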

> Would this also explain why the "dmsetup wait" hung indefinitely?

You need to read the dmsetup man page; "dmsetup wait" has _nothing_ to do
with waiting for IO to complete.  It is about DM events: without
specifying an event_nr you're just waiting for the device's event
counter to increment (which may never happen if you aren't doing
anything that'd trigger an event).  See:

"       wait [--noflush] device_name [event_nr]
              Sleeps until the event counter for device_name exceeds
              event_nr.  Use -v to see the event number returned.  To
              wait until the next event is triggered, use info to find
              the last event number.  With --noflush, the thin target
              (from version 1.3.0) doesn't commit any
              outstanding changes to disk before reporting its statistics."
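
Since "dmsetup wait" only watches the event counter, one crude alternative for the "wait for clean" step is to poll the status line until the dirty count reaches zero. The sketch below is not what dmtest does (it tracks events); the function name and the POLL_INTERVAL variable are made up, and field 14 as the dirty block count is an assumption based on the kernel's dm-cache documentation.

```sh
# Hypothetical helper: rerun a status command until it reports 0 dirty blocks.
# Usage: wait_until_clean <max_tries> <command> [args...]
# Returns 0 once the cache is clean, 1 if it is still dirty after max_tries.
# The body runs in a subshell "( ... )", so "exit" only ends the function.
wait_until_clean() (
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Assumed: field 14 of a cache status line is the dirty block count.
        dirty=$("$@" | awk '$3 == "cache" { print $14 }')
        [ "$dirty" = "0" ] && exit 0
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    exit 1
)

# Example (needs root and an existing cache device):
#   wait_until_clean 60 dmsetup status TEST-device
```

As discussed in this thread, a zero dirty count is only meaningful once the cache is in writethrough mode, and the cache device should still be removed before reading the origin directly.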

> I do think I followed a tutorial that I found on the net regarding this.
> Scary that such a crucial fact is missing there. The fact that dirty
> pages are reported as zero just gives the impression that everything is
> coherent, when in fact it's not.

I'll concede that it is weird that you're seeing a different checksum for
the origin vs the cache (that is in writeback mode yet reports 0 dirty
blocks).

But I think there is some important detail that would explain it; sadly
I'd need to dig in and reproduce on a testbed to identify it.  Maybe Joe
will be able to offer a quick answer?


* Re: dm-cache coherence issue
  2017-06-26 20:36         ` Johannes Bauer
  2017-06-26 21:34           ` Mike Snitzer
@ 2017-06-27  9:44           ` Joe Thornber
  1 sibling, 0 replies; 9+ messages in thread
From: Joe Thornber @ 2017-06-27  9:44 UTC
  To: Johannes Bauer; +Cc: dm-devel, Mike Snitzer

On Mon, Jun 26, 2017 at 10:36:23PM +0200, Johannes Bauer wrote:
> On 26.06.2017 21:56, Mike Snitzer wrote:
> 
> >> Interesting, I did *not* change to writethrough. However, there
> >> shouldn't have been any I/O on the device (it was not accessed by
> >> anything after I switched to the cleaner policy).
> [...]
> >> Anyways, I'll try to replicate my scenario again because I'm actually
> >> quite sure that I did everything correctly (I did it a few times).
> > 
> > Except you didn't first switch to writethrough -- which is _not_
> > correct.
> 
> Absolutely, very good to know. So even without any I/O being requested,
> dm-cache is allowed to "hold back" pages as long as the dm-cache device
> is in writeback mode? Would this also explain why the "dmsetup wait"
> hung indefinitely?

Some things to try to see if it makes a difference:

- unmount the cache before checksumming it so we know there's no IO from
  the page cache going in.

- deactivate the cache before checksumming the origin.

- Stop using encryption on top of cache and see if that makes a
  difference.

- Use 'dmsetup wait' properly, as Mike mentioned.  See the following code
  from dmtest:

    https://github.com/jthornber/device-mapper-test-suite/blob/master/lib/dmtest/cache_utils.rb#L28
    https://github.com/jthornber/device-mapper-test-suite/blob/master/lib/dmtest/device-mapper/event_tracker.rb

- use md5sum rather than your checksum program.  Humour me.
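
For the md5sum comparison, a small helper along these lines would do. It is a sketch with a made-up function name; on real block devices it needs root, and per the list above the cache should be deactivated before the origin is read.

```sh
# Hypothetical helper: succeed only if both readable paths have the same MD5.
# Returns 0 if equal, 1 if different, 2 if either path is unreadable.
# The body runs in a subshell "( ... )" so its variables stay local.
same_md5() (
    [ -r "$1" ] && [ -r "$2" ] || exit 2
    a=$(md5sum < "$1" | cut -d ' ' -f 1)
    b=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ -n "$a" ] && [ "$a" = "$b" ]
)

# Example (needs root):
#   same_md5 /dev/loop0 /dev/mapper/TEST-device && echo "in sync"
```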



- Joe
