* [dm-devel] [PATCH] dm: call remove_single_exception_chunk before commit_merge
@ 2023-03-10 10:05 Jiazi.Li
2023-03-22 20:15 ` Mikulas Patocka
0 siblings, 1 reply; 2+ messages in thread
From: Jiazi.Li @ 2023-03-10 10:05 UTC (permalink / raw)
To: Alasdair Kergon, Mike Snitzer; +Cc: Jiazi.Li, dm-devel
Assume that the metadata of cow on the disk is corrupted after init
for some reason:
old chunk-id new chunk-id
0 2
...
x ---> 0 y
After starting merge, old chunk 0 will be updated twice, and old
chunk x will not be updated.
And dm-snap will print err log after merge new chunk 2 to old chunk 0:
<3>[ 731.921642] (1)[4092:kworker/1:0]device-mapper: snapshots:
Corruption detected: exception for block 0 is on disk but not in memory
then set snap->merge_failed to true.
If userspace use "sectors_allocated == metadata_sectors" to determine
whether the merge is completed, there maybe the following race that
makes the userspace unable to know merge fail event:
kernel merge kworker userspace process
merge_callback
->commit_merge
get snapshot_status by ioctl
->remove_single_exception_chunk
set merge_failed to true
think merge has been completed,
switch device to another target
Could we call remove_single_exception_chunk first to solve this race?
Signed-off-by: Jiazi.Li <jiazi.li@transsion.com>
---
drivers/md/dm-snap.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index f766c21408f1..f658d05752f2 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1141,15 +1141,15 @@ static void merge_callback(int read_err, unsigned long write_err, void *context)
goto shut;
}
+ if (remove_single_exception_chunk(s) < 0)
+ goto shut;
+
if (s->store->type->commit_merge(s->store,
s->num_merging_chunks) < 0) {
DMERR("Write error in exception store: shutting down merge");
goto shut;
}
- if (remove_single_exception_chunk(s) < 0)
- goto shut;
-
snapshot_merge_next_chunks(s);
return;
--
2.17.1
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [dm-devel] [PATCH] dm: call remove_single_exception_chunk before commit_merge
2023-03-10 10:05 [dm-devel] [PATCH] dm: call remove_single_exception_chunk before commit_merge Jiazi.Li
@ 2023-03-22 20:15 ` Mikulas Patocka
0 siblings, 0 replies; 2+ messages in thread
From: Mikulas Patocka @ 2023-03-22 20:15 UTC (permalink / raw)
To: Jiazi.Li; +Cc: Jiazi.Li, dm-devel, Mike Snitzer, Alasdair Kergon
Hi
I don't feel strong need to fix it.
You claim that if you corrupt the snapshot metadata, there is a small race
condition where the corruption may not be reported to the user.
If you corrupt the snapshot metadata, you obviously get corrupted result
of the merge. "garbage in - garbage out". I don't think that we need to
improve handling of corrupted metadata.
I wrote this code 13 years ago, then I forgot the details about it, and I
don't feel confident touching it unless there is some strong reason.
Mikulas
On Fri, 10 Mar 2023, Jiazi.Li wrote:
> Assume that the metadata of cow on the disk is corrupted after init
> for some reason:
> old chunk-id new chunk-id
> 0 2
> ...
> x ---> 0 y
> After starting merge, old chunk 0 will be updated twice, and old
> chunk x will not be updated.
> And dm-snap will print err log after merge new chunk 2 to old chunk 0:
>
> <3>[ 731.921642] (1)[4092:kworker/1:0]device-mapper: snapshots:
> Corruption detected: exception for block 0 is on disk but not in memory
> then set snap->merge_failed to true.
>
> If userspace use "sectors_allocated == metadata_sectors" to determine
> whether the merge is completed, there maybe the following race that
> makes the userspace unable to know merge fail event:
>
> kernel merge kworker userspace process
> merge_callback
> ->commit_merge
> get snapshot_status by ioctl
> ->remove_single_exception_chunk
> set merge_failed to true
> think merge has been completed,
> switch device to another target
>
> Could we call remove_single_exception_chunk first to solve this race?
>
> Signed-off-by: Jiazi.Li <jiazi.li@transsion.com>
> ---
> drivers/md/dm-snap.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
> index f766c21408f1..f658d05752f2 100644
> --- a/drivers/md/dm-snap.c
> +++ b/drivers/md/dm-snap.c
> @@ -1141,15 +1141,15 @@ static void merge_callback(int read_err, unsigned long write_err, void *context)
> goto shut;
> }
>
> + if (remove_single_exception_chunk(s) < 0)
> + goto shut;
> +
> if (s->store->type->commit_merge(s->store,
> s->num_merging_chunks) < 0) {
> DMERR("Write error in exception store: shutting down merge");
> goto shut;
> }
>
> - if (remove_single_exception_chunk(s) < 0)
> - goto shut;
> -
> snapshot_merge_next_chunks(s);
>
> return;
> --
> 2.17.1
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://listman.redhat.com/mailman/listinfo/dm-devel
>
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2023-03-22 20:15 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 10:05 [dm-devel] [PATCH] dm: call remove_single_exception_chunk before commit_merge Jiazi.Li
2023-03-22 20:15 ` Mikulas Patocka
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).