* [dm-devel] dm-clone: Request option to send discard to source device during hydration
@ 2023-03-27 20:24 Gwendal Grignou
  2023-03-28 16:20 ` Mike Snitzer
  0 siblings, 1 reply; 3+ messages in thread
From: Gwendal Grignou @ 2023-03-27 20:24 UTC
  To: dm-devel; +Cc: Sarthak Kukreti, Daniil Lunev

On ChromeOS, we are working on migrating file-backed loopback devices
to thinpool logical volumes using dm-clone on the Chromebook's local
SSD.
The dm-clone hydration workflow is a great fit, but the design of
dm-clone assumes a read-only source device. Data present in the backing
file will be copied to the new logical volume, but it can be safely
deleted only when the hydration process is complete. During migration,
some data will be duplicated and usage on the Chromebook SSD will
unnecessarily increase.
Would it be reasonable to add a discard option when enabling the
hydration process, so that data on the source device is discarded as
we go?
Two implementations are possible:
a- add a state to the hydration state machine to ensure a region is
discarded before considering another region.
b- a simpler implementation where the discard is sent asynchronously
at the end of a region copy. It may not complete successfully (in case
the device crashes during the hydration, for instance), but it will
vastly reduce the amount of data left in the source device at the end
of the hydration.

I prefer b) as it is easier to implement, but a) is cleaner from a
usage point of view.

Gwendal.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



* Re: [dm-devel] dm-clone: Request option to send discard to source device during hydration
  2023-03-27 20:24 [dm-devel] dm-clone: Request option to send discard to source device during hydration Gwendal Grignou
@ 2023-03-28 16:20 ` Mike Snitzer
  2023-03-29 17:42   ` Nikos Tsironis
  0 siblings, 1 reply; 3+ messages in thread
From: Mike Snitzer @ 2023-03-28 16:20 UTC
  To: Gwendal Grignou; +Cc: Sarthak Kukreti, dm-devel, Daniil Lunev, Nikos Tsironis

On Mon, Mar 27 2023 at  4:24P -0400,
Gwendal Grignou <gwendal@chromium.org> wrote:

> On ChromeOS, we are working on migrating file backed loopback devices
> to thinpool logical volumes using dm-clone on the Chromebook local
> SSD.
> Dm-clone hydration workflow is a great fit but the design of dm-clone
> assumes a read-only source device. Data present in the backing file
> will be copied to the new logical volume but can be safely deleted
> only when the hydration process is complete. During migration, some
> data will be duplicated and usage on the Chromebook SSD will
> unnecessarily increase.
> Would it be reasonable to add a discard option when enabling the
> hydration process to discard data as we go on the source device?
> 2 implementations are possible:
> a- add a state to the hydration state machine to ensure a region is
> discarded before considering another region.
> b- a simpler implementation where the discard is sent asynchronously
> at the end of a region copy. It may not complete successfully (in case
> the device crashes during the hydration for instance), but will vastly
> reduce the amount of data left  in the source device at the end of the
> hydration.
> 
> I prefer b) as it is easier to implement, but a) is cleaner from a
> usage point of view.

In general, discards may not complete for any number of reasons. So
while a) gives you finer-grained potential for space being
deallocated, b) would likely suffice given that a device crash is
pretty unlikely (at least I would think).  And in the case of file
backed loopback devices, independent of dm-clone, you can just issue
discard(s) to all free space after a crash?

However you elect to do it, you'd do well to make it an optional
"discard_rw_src" (or some better name) feature that is configured when
you load the dm-clone target.
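
A minimal sketch of how such an opt-in feature might appear in a table
line. The positional layout follows the dm-clone target's documented
table format (metadata, destination and source devices, region size,
then a counted list of feature arguments). `discard_rw_src` is only the
placeholder name suggested above and is not an existing dm-clone
feature; the device paths and the `clone_table` helper are hypothetical.

```python
def clone_table(start, length, metadata_dev, dest_dev, source_dev,
                region_size, features=()):
    """Build a dm-clone table line suitable for `dmsetup create --table`."""
    fields = [str(start), str(length), "clone",
              metadata_dev, dest_dev, source_dev, str(region_size)]
    if features:
        # Feature arguments are preceded by their count.
        fields.append(str(len(features)))
        fields.extend(features)
    return " ".join(fields)


# Hypothetical devices; "discard_rw_src" is NOT an existing dm-clone feature.
table = clone_table(0, 1048576000, "/dev/vg/meta", "/dev/vg/thin_lv",
                    "/dev/loop0", 8, features=["discard_rw_src"])
print(table)
```

Making the behaviour a feature argument keeps the default (a source
device that is never written to) unchanged for existing users.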

Mike


* Re: [dm-devel] dm-clone: Request option to send discard to source device during hydration
  2023-03-28 16:20 ` Mike Snitzer
@ 2023-03-29 17:42   ` Nikos Tsironis
  0 siblings, 0 replies; 3+ messages in thread
From: Nikos Tsironis @ 2023-03-29 17:42 UTC
  To: Mike Snitzer, Gwendal Grignou
  Cc: Sarthak Kukreti, dm-devel, ntsironis, Daniil Lunev

On 3/28/23 19:20, Mike Snitzer wrote:
> On Mon, Mar 27 2023 at  4:24P -0400,
> Gwendal Grignou <gwendal@chromium.org> wrote:
> 
>> On ChromeOS, we are working on migrating file backed loopback devices
>> to thinpool logical volumes using dm-clone on the Chromebook local
>> SSD.
>> Dm-clone hydration workflow is a great fit but the design of dm-clone
>> assumes a read-only source device. Data present in the backing file
>> will be copied to the new logical volume but can be safely deleted
>> only when the hydration process is complete. During migration, some
>> data will be duplicated and usage on the Chromebook SSD will
>> unnecessarily increase.
>> Would it be reasonable to add a discard option when enabling the
>> hydration process to discard data as we go on the source device?
>> 2 implementations are possible:
>> a- add a state to the hydration state machine to ensure a region is
>> discarded before considering another region.
>> b- a simpler implementation where the discard is sent asynchronously
>> at the end of a region copy. It may not complete successfully (in case
>> the device crashes during the hydration for instance), but will vastly
>> reduce the amount of data left  in the source device at the end of the
>> hydration.
>>
>> I prefer b) as it is easier to implement, but a) is cleaner from a
>> usage point of view.
> 
> In general, discards may not complete for any number of reasons. So
> while a) gives you finer-grained potential for space being
> deallocated, b) would likely suffice given that a device crash is
> pretty unlikely (at least I would think).  And in the case of file
> backed loopback devices, independent of dm-clone, you can just issue
> discard(s) to all free space after a crash?
> 
> However you elect to do it, you'd do well to make it an optional
> "discard_rw_src" (or some better name) feature that is configured when
> you load the dm-clone target.
> 

I agree with Mike, but I also want to note the following.

dm-clone commits its on-disk metadata periodically, once every second,
and every time a FLUSH or FUA bio is written. Batching commits this way
improves performance.

This means the dm-clone device behaves like a physical disk that has a
volatile write cache. If power is lost you may lose some recent writes,
_and_ dm-clone might need to rehydrate some regions.

So, you can't discard a region on the source device after the copy
operation has finished, because then the following scenario will result
in data corruption:

1. dm-clone hydrates a region
2. dm-clone discards the region on the source device, either
    synchronously (a) or asynchronously (b)
3. The system crashes before the metadata is committed
4. The system comes up, and dm-clone rehydrates the region, because it
    thinks it has not been hydrated yet
5. The source device might contain garbage for this region, since we
    discarded it previously
6. You have data corruption
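
The scenario above can be replayed with a toy model. This is only an
illustrative sketch, not dm-clone code: the class and method names are
made up, each device is a dict mapping region number to data, and a
discarded region is modelled as `None` to stand for "may return
garbage".

```python
class CloneModel:
    """Toy model: source/destination devices plus hydration metadata."""

    def __init__(self, source):
        self.source = dict(source)  # region -> data on the source device
        self.dest = {}              # region -> data on the destination
        self.hydrated = set()       # in-memory view of hydrated regions
        self.committed = set()      # view as last committed to disk

    def hydrate(self, region):
        self.dest[region] = self.source[region]
        self.hydrated.add(region)

    def discard_source(self, region):
        # After a discard the source may return anything; use None.
        self.source[region] = None

    def commit(self):
        self.committed = set(self.hydrated)

    def crash_and_restart(self):
        # Only the committed metadata survives the crash.
        self.hydrated = set(self.committed)

    def rehydrate_missing(self):
        for region in self.source:
            if region not in self.hydrated:
                self.hydrate(region)


m = CloneModel({0: "A", 1: "B"})
m.hydrate(0)             # step 1
m.discard_source(0)      # step 2: discard before any metadata commit
m.crash_and_restart()    # step 3: the hydration of region 0 is forgotten
m.rehydrate_missing()    # step 4: region 0 is copied again
print(m.dest)            # steps 5-6: region 0 now holds the discarded data
```

Region 1, which was never discarded, is still copied correctly; only the
region discarded ahead of the commit is lost.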

So, you can only discard hydrated regions whose metadata has been
committed to disk.

I think you could discard hydrated regions on the source device
periodically, right after committing the metadata.

dm-clone keeps track of the regions hydrated during each metadata
transaction, so after committing the metadata for the current
transaction, you could also send an asynchronous discard for these
regions.
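
A minimal sketch of that ordering, again as a toy model rather than
dm-clone code. The names (`SafeCloneModel`, `pending`,
`commit_then_discard`) are hypothetical, and the asynchronous discard is
simplified to a synchronous one issued strictly after the commit.

```python
class SafeCloneModel:
    """Toy model that discards source regions only after a commit."""

    def __init__(self, source):
        self.source = dict(source)  # region -> data on the source device
        self.dest = {}              # region -> data on the destination
        self.committed = set()      # hydrated regions recorded on disk
        self.pending = set()        # hydrated in the current transaction

    def hydrate(self, region):
        self.dest[region] = self.source[region]
        self.pending.add(region)

    def commit_then_discard(self):
        just_committed, self.pending = self.pending, set()
        self.committed |= just_committed      # metadata commit first
        for region in just_committed:
            self.source[region] = None        # discard only afterwards

    def crash_and_restart(self):
        # Uncommitted hydration state is lost; the source is untouched
        # for those regions because no discard was sent yet.
        self.pending = set()

    def rehydrate_missing(self):
        for region in self.source:
            if region not in self.committed:
                self.hydrate(region)


m = SafeCloneModel({0: "A", 1: "B"})
m.hydrate(0)
m.commit_then_discard()   # region 0 committed, then discarded
m.hydrate(1)
m.crash_and_restart()     # region 1 was hydrated but never committed
m.rehydrate_missing()     # region 1 is copied again from intact data
m.commit_then_discard()
print(m.dest)
```

With this ordering, a crash between the commit and the discard would
only leave a committed region allocated on the source; it would not
cause a rehydration from discarded data.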

Nikos.

