* dm-cache failure semantics in write-back mode
@ 2015-02-17 12:15 Thanos Makatos
  2015-03-03 15:53 ` Thanos Makatos
  0 siblings, 1 reply; 7+ messages in thread
From: Thanos Makatos @ 2015-02-17 12:15 UTC (permalink / raw)
  To: dm-devel


Hi,

I'm trying to understand the failure semantics of dm-cache in write-back
mode. In Documentation/device-mapper/cache.txt it is stated:

"On-disk metadata is committed every time a FLUSH or FUA bio is written.
If no such requests are made then commits will occur every second.  This
means the cache behaves like a physical disk that has a volatile write
cache.  If power is lost you may lose some recent writes.  The metadata
should always be consistent in spite of any crash."

Which I admit confuses me. Assume that no FLUSH/FUA request is issued
(e.g. the user of the cached device is a Windows VM) and a failure occurs
(e.g. there is a power failure but both the HDD and the SSD are fine)
immediately after a write I/O request, but before the on-disk metadata is
committed (e.g. the failure occurs less than a second after the write I/O
request was completed). After the host reboots, is this completed write
I/O request going to be lost?


* Re: dm-cache failure semantics in write-back mode
  2015-02-17 12:15 dm-cache failure semantics in write-back mode Thanos Makatos
@ 2015-03-03 15:53 ` Thanos Makatos
  2015-03-03 16:19   ` Joe Thornber
  0 siblings, 1 reply; 7+ messages in thread
From: Thanos Makatos @ 2015-03-03 15:53 UTC (permalink / raw)
  To: dm-devel; +Cc: heinzm, thornber, snitzer


I haven't gotten any reply to this so I'll try to rephrase: If the user of
the cache doesn't issue FLUSH/FUA, is there a chance of (irreversible) data
loss in the event of a kernel crash or power failure?


* Re: dm-cache failure semantics in write-back mode
  2015-03-03 15:53 ` Thanos Makatos
@ 2015-03-03 16:19   ` Joe Thornber
  2015-03-03 17:13     ` Thanos Makatos
  0 siblings, 1 reply; 7+ messages in thread
From: Joe Thornber @ 2015-03-03 16:19 UTC (permalink / raw)
  To: Thanos Makatos; +Cc: heinzm, dm-devel, snitzer

On Tue, Mar 03, 2015 at 03:53:39PM +0000, Thanos Makatos wrote:
> I haven't gotten any reply to this so I'll try to rephrase: If the user of
> the cache doesn't issue FLUSH/FUA, is there a chance of (irreversible) data
> loss in the event of a kernel crash or power failure?

If the mappings change, i.e. something is promoted to the cache or
demoted from it, then the metadata update is committed before the
triggering I/O is issued.

So a power failure will not result in the loss of a mapping; it may
result in loss of data if your physical device has a write cache.  But this
is also the case with the cache.
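
The ordering described above (commit the mapping first, then issue the
I/O that depends on it) can be sketched with a toy model. This is not
dm-cache code: the class ToyCache and all of its fields are hypothetical,
and the sketch assumes that neither device has a volatile write cache.

```python
class ToyCache:
    """Toy model: a new mapping is committed before the I/O that relies on it."""

    def __init__(self):
        self.committed = {}  # origin block -> cache slot; survives a crash
        self.in_core = {}    # working copy of the mappings; lost on a crash
        self.ssd = {}        # cache slot -> data; survives a crash
        self.hdd = {}        # origin block -> data; survives a crash
        self.next_slot = 0

    def write(self, block, data):
        if block not in self.in_core:
            # Promotion: allocate a slot and commit the new mapping
            # *before* the triggering write is issued.
            self.in_core[block] = self.next_slot
            self.next_slot += 1
            self.committed = dict(self.in_core)
        # Only now is the triggering write issued to the cache device.
        self.ssd[self.in_core[block]] = data

    def crash(self):
        # Volatile state is rebuilt from the committed metadata only.
        self.in_core = dict(self.committed)
        self.next_slot = max(self.committed.values(), default=-1) + 1

    def read(self, block):
        slot = self.in_core.get(block)
        return self.ssd[slot] if slot is not None else self.hdd.get(block)


cache = ToyCache()
cache.write(7, b"payload")
cache.crash()
print(cache.read(7))
```

Because the mapping is committed before the data write is issued, a
crash after the write has completed cannot leave the data on the SSD
without a mapping that points at it.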

- Joe


* Re: dm-cache failure semantics in write-back mode
  2015-03-03 16:19   ` Joe Thornber
@ 2015-03-03 17:13     ` Thanos Makatos
  2015-03-03 17:28       ` Joe Thornber
  2015-03-03 21:58       ` Spelic
  0 siblings, 2 replies; 7+ messages in thread
From: Thanos Makatos @ 2015-03-03 17:13 UTC (permalink / raw)
  To: Thanos Makatos, dm-devel, heinzm, snitzer


Thanks, Joe. So just to make sure I've understood correctly, if the SSD
cache is configured as a write-back cache but the device cache is
disabled/set to write-through on the HDD and the SSD, then there is no risk
of data loss in the event of a failure. Is my understanding correct?



* Re: dm-cache failure semantics in write-back mode
  2015-03-03 17:13     ` Thanos Makatos
@ 2015-03-03 17:28       ` Joe Thornber
  2015-03-03 21:58       ` Spelic
  1 sibling, 0 replies; 7+ messages in thread
From: Joe Thornber @ 2015-03-03 17:28 UTC (permalink / raw)
  To: device-mapper development; +Cc: Thanos Makatos, heinzm, snitzer

On Tue, Mar 03, 2015 at 05:13:37PM +0000, Thanos Makatos wrote:
> Thanks, Joe. So just to make sure I've understood correctly, if the SSD
> cache is configured as a write-back cache but the device cache is
> disabled/set to write-through on the HDD and the SSD, then there is no risk
> of data loss in the event of a failure. Is my understanding correct?

IO that has completed will have really hit the disk.


* Re: dm-cache failure semantics in write-back mode
  2015-03-03 17:13     ` Thanos Makatos
  2015-03-03 17:28       ` Joe Thornber
@ 2015-03-03 21:58       ` Spelic
  2015-03-05 14:06         ` Thanos Makatos
  1 sibling, 1 reply; 7+ messages in thread
From: Spelic @ 2015-03-03 21:58 UTC (permalink / raw)
  To: device-mapper development

On 03/03/2015 18:13, Thanos Makatos wrote:
> Thanks, Joe. So just to make sure I've understood correctly, if the 
> SSD cache is configured as a write-back cache but the device cache is 
> disabled/set to write-through on the HDD and the SSD, then there is no
> risk of data loss in the event of a failure. Is my understanding correct?
>

You don't seem to understand the semantics of flush.
Writes on filesystems and databases are done more or less with
copy-on-write semantics:
- new data is written elsewhere
- a flush is issued
- when the flush returns you are sure that the new data is on stable
storage (disk platters or similar)
- one pointer is changed to point from the old data to the new data (so
small that this change is atomic)
- a flush is issued again
- when this flush returns you are sure that the data on disk is updated.

Now you understand why dm-cache has the semantics that it has, which is 
the same semantics as DRAM caches on HDDs.

Application writers have to follow the semantics described above in
order to have "atomic" updates on disk. This is true with dm-cache but
also with DRAM caches on HDDs.
Partial data losses are irrelevant if the above logic is followed.
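
At the file level, the write/flush/switch-pointer/flush sequence above
is commonly approximated as follows. This is a minimal sketch, assuming
a POSIX system where rename is atomic and where fsync() is what makes
the filesystem send a flush down to the block device; the function name
atomic_replace is made up for illustration.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace `path` so that a crash leaves either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1. Write the new data elsewhere: a temporary file in the same directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # 2. Flush: ask for the new data to reach stable storage.
            os.fsync(tmp.fileno())
        # 3. Switch the "pointer": the rename atomically repoints the name.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, do not leave the temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # 4. Flush again: persist the directory entry that now names the new data.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A caller would use it as atomic_replace("state.db", new_bytes). If the
second flush has not completed when power is lost, the old contents may
reappear, but a mix of old and new should not.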

S.


* Re: dm-cache failure semantics in write-back mode
  2015-03-03 21:58       ` Spelic
@ 2015-03-05 14:06         ` Thanos Makatos
  0 siblings, 0 replies; 7+ messages in thread
From: Thanos Makatos @ 2015-03-05 14:06 UTC (permalink / raw)
  To: device-mapper development


Thanks for clarifying this. My concern is that an "application writer" can
be a Windows VM running under qemu, and I don't know whether flush requests
are supported in the specific qemu version we're using; I'm in the process
of confirming that.

