* [PATCH v2] mdadm/Detail: show correct state for cluster-md array
From: Zhao Heming @ 2020-07-22 7:11 UTC
To: linux-raid; +Cc: Zhao Heming, neilb, jes
Since kernel md commit 480523feae581, mddev->in_sync is always zero in an
md-cluster environment, so MD_SB_CLEAN is never set in array.state. As a
result, "mdadm -D /dev/mdX" always reports the state as 'active'.

bitmap.c: add a new API, IsBitmapDirty(), to query whether the bitmap is
dirty or clean.
Signed-off-by: Zhao Heming <heming.zhao@suse.com>
---
v2:
- Detail.c: change to read only one device.
- bitmap.c: modify IsBitmapDirty() to check all bitmaps on the selected device.
---
Detail.c | 20 +++++++++++++++++++-
bitmap.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
mdadm.h | 1 +
3 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/Detail.c b/Detail.c
index 24eeba0..cb5ce7d 100644
--- a/Detail.c
+++ b/Detail.c
@@ -495,8 +495,26 @@ int Detail(char *dev, struct context *c)
 					  sra->array_state);
 			else
 				arrayst = "clean";
-		} else
+		} else {
 			arrayst = "active";
+			if (array.state & (1<<MD_SB_CLUSTERED)) {
+				for (d = 0; d < max_disks * 2; d++) {
+					char *dv;
+					mdu_disk_info_t disk = disks[d];
+
+					if (d >= array.raid_disks * 2 &&
+					    disk.major == 0 && disk.minor == 0)
+						continue;
+					if ((d & 1) && disk.major == 0 && disk.minor == 0)
+						continue;
+					dv = map_dev_preferred(disk.major, disk.minor, 0,
+							c->prefer);
+					if (dv && !IsBitmapDirty(dv))
+						arrayst = "clean";
+					break;
+				}
+			}
+		}
 		printf(" State : %s%s%s%s%s%s%s \n",
 		       arrayst, st,
diff --git a/bitmap.c b/bitmap.c
index e38cb96..1095dc8 100644
--- a/bitmap.c
+++ b/bitmap.c
@@ -368,6 +368,61 @@ free_info:
 	return rv;
 }
+int IsBitmapDirty(char *filename)
+{
+	/*
+	 * Read the bitmap file
+	 * This function is currently for cluster-md only.
+	 * Return: 1(dirty), 0 (clean), -1(error)
+	 */
+
+	int fd = -1, rv = 0, i;
+	struct supertype *st = NULL;
+	bitmap_info_t *info = NULL;
+	bitmap_super_t *sb = NULL;
+
+	fd = bitmap_file_open(filename, &st, 0);
+	free(st);
+	if (fd < 0)
+		goto out;
+
+	info = bitmap_fd_read(fd, 0);
+	close(fd);
+	if (!info)
+		goto out;
+
+	sb = &info->sb;
+	for (i = 0; i < (int)sb->nodes; i++) {
+		st = NULL;
+		free(info);
+		info = NULL;
+
+		fd = bitmap_file_open(filename, &st, i);
+		free(st);
+		if (fd < 0)
+			goto out;
+
+		info = bitmap_fd_read(fd, 0);
+		close(fd);
+		if (!info)
+			goto out;
+
+		sb = &info->sb;
+		if (sb->magic != BITMAP_MAGIC) { /* invalid bitmap magic */
+			free(info);
+			goto out;
+		}
+
+		if (info->dirty_bits)
+			rv = 1;
+	}
+
+	free(info);
+	return rv;
+out:
+	return -1;
+}
+
 int CreateBitmap(char *filename, int force, char uuid[16],
 		 unsigned long chunksize, unsigned long daemon_sleep,
 		 unsigned long write_behind,
diff --git a/mdadm.h b/mdadm.h
index 399478b..ba8ba91 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -1447,6 +1447,7 @@ extern int CreateBitmap(char *filename, int force, char uuid[16],
unsigned long long array_size,
int major);
extern int ExamineBitmap(char *filename, int brief, struct supertype *st);
+extern int IsBitmapDirty(char *filename);
extern int Write_rules(char *rule_name);
extern int bitmap_update_uuid(int fd, int *uuid, int swap);
--
2.25.0
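
As a reading aid for the Detail.c hunk above, the state decision it introduces can be summarised in a few lines. This is a minimal sketch, not the mdadm code: sketch_array_state() and the SKETCH_* bit positions are made up for illustration, the RAID0/linear special case is left out, and the second argument is assumed to carry the patch's IsBitmapDirty() result (1 dirty, 0 clean, -1 error).

```c
#include <assert.h>
#include <string.h>

/* Illustrative bit positions only; the real ones come from the md headers. */
#define SKETCH_SB_CLEAN      0
#define SKETCH_SB_CLUSTERED  5

/*
 * bitmap_dirty follows the convention documented in the patch:
 * 1 = dirty, 0 = clean, -1 = error while reading the bitmap.
 */
const char *sketch_array_state(unsigned int state, int bitmap_dirty)
{
	assert(bitmap_dirty >= -1 && bitmap_dirty <= 1);

	/* The superblock already says clean: nothing else to check. */
	if (state & (1u << SKETCH_SB_CLEAN))
		return "clean";

	/* A clustered array is upgraded only by a readable, clean bitmap. */
	if ((state & (1u << SKETCH_SB_CLUSTERED)) && bitmap_dirty == 0)
		return "clean";

	/* Dirty bitmap, read error, or non-clustered array. */
	return "active";
}
```

Under these assumptions, a clustered array without MD_SB_CLEAN reports "clean" only when the bitmap could be read and holds no dirty bits; a read error leaves the state at "active".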
* Re: [PATCH v2] mdadm/Detail: show correct state for cluster-md array
From: heming.zhao @ 2020-07-22 7:20 UTC
To: linux-raid; +Cc: neilb, jes
While creating this patch, I found that ExamineBitmap() leaks memory.
I am not sure whether the leak should be fixed
(when the mdadm command finishes, all leaked memory is released anyway).
IsBitmapDirty() reuses some of the ExamineBitmap() code, and I fixed the leaks only in IsBitmapDirty().
Thanks,
heming
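
The fix described here, releasing what bitmap_file_open() hands back through its out-parameter before the next call, is an instance of a general ownership rule: whoever receives the allocation must free it on every path. A minimal sketch of that rule follows; every name in it (sketch_open, sketch_release, sketch_scan, sketch_live) is hypothetical and none of it is mdadm code.

```c
#include <stdlib.h>

struct sketch_handle { int slot; };   /* stand-in for a metadata handle */

static int sketch_live;               /* outstanding allocations, for checking */

/* Opens "slot" and hands a freshly allocated handle to the caller via *out. */
int sketch_open(int slot, struct sketch_handle **out)
{
	*out = calloc(1, sizeof(**out));
	if (!*out)
		return -1;
	(*out)->slot = slot;
	sketch_live++;
	return slot;                  /* pretend this is a file descriptor */
}

/* Counterpart of sketch_open(); safe to call with NULL. */
void sketch_release(struct sketch_handle *h)
{
	if (h) {
		sketch_live--;
		free(h);
	}
}

/* Visits every slot and releases each handle before moving on. */
int sketch_scan(int nslots)
{
	int i;

	for (i = 0; i < nslots; i++) {
		struct sketch_handle *h = NULL;
		int fd = sketch_open(i, &h);

		sketch_release(h);    /* without this, one handle leaks per slot */
		if (fd < 0)
			return -1;
	}
	return sketch_live;           /* 0 when nothing is left allocated */
}
```

Dropping the sketch_release() call makes sketch_scan() return the number of slots visited instead of 0, which is the kind of per-iteration loss a leak checker reports as "definitely lost".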
On 7/22/20 3:11 PM, Zhao Heming wrote:
> Since kernel md commit 480523feae581, mddev->in_sync is always zero in an
> md-cluster environment, so MD_SB_CLEAN is never set in array.state. As a
> result, "mdadm -D /dev/mdX" always reports the state as 'active'.
>
> bitmap.c: add a new API, IsBitmapDirty(), to query whether the bitmap is
> dirty or clean.
>
> Signed-off-by: Zhao Heming <heming.zhao@suse.com>
> ---
> v2:
> - Detail.c: change to read only one device.
> - bitmap.c: modify IsBitmapDirty() to check all bitmaps on the selected device.
>
* Re: [PATCH v2] mdadm/Detail: show correct state for cluster-md array
From: Wols Lists @ 2020-07-26 8:14 UTC
To: heming.zhao, linux-raid; +Cc: neilb, jes
On 22/07/20 08:20, heming.zhao@suse.com wrote:
> While creating this patch, I found that ExamineBitmap() leaks memory.
> I am not sure whether the leak should be fixed
> (when the mdadm command finishes, all leaked memory is released anyway).
> IsBitmapDirty() reuses some of the ExamineBitmap() code, and I fixed the leaks only in IsBitmapDirty().
>
My gut feel?
Firstly, "do things right" - it should be fixed.
Second - are you sure this code is not run while mdadm is running as a
daemon? It's all very well saying it will be released, but mdadm
could be running for a looonnngg time.
Cheers,
Wol
* Re: [PATCH v2] mdadm/Detail: show correct state for cluster-md array
From: heming.zhao @ 2020-07-26 9:22 UTC
To: Wols Lists, linux-raid; +Cc: neilb, jes
Hello Wols,
I have just started learning the mdadm code. Maybe there are historical reasons for leaving the leaks in place.
I guess the daemon mode you mean is "mdadm --monitor --daemonise ...".
From a quick look at the code in Monitor.c, this mode checks /proc/mdstat, sends the GET_ARRAY_INFO ioctl, and
reads some /sys/block/mdX/md/xx files. It never calls ExamineBitmap().
In the current mdadm code, the only way to reach ExamineBitmap() is the command "mdadm -X /dev/sdX". So, as my last mail said, when the mdadm program finishes, all leaked memory is released.
Last week, before sending the v2 patch, I ran valgrind to check for memory-related issues, and there are many places that leak, e.g.:
```
<1>
# valgrind --leak-check=full ./mdadm -D /dev/md0
... ...
==3929==
==3929== HEAP SUMMARY:
==3929== in use at exit: 12,991 bytes in 190 blocks
==3929== total heap usage: 354 allocs, 164 frees, 2,414,075 bytes allocated
==3929==
==3929== 184 bytes in 1 blocks are definitely lost in loss record 15 of 24
==3929== at 0x4C306B5: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3929== by 0x47EB4C: xcalloc (xmalloc.c:62)
==3929== by 0x4495E2: match_metadata_desc1 (super1.c:2316)
==3929== by 0x4125CE: super_by_fd (util.c:1213)
==3929== by 0x424E53: Detail (Detail.c:103)
==3929== by 0x408AAA: misc_list (mdadm.c:1970)
==3929== by 0x407CEF: main (mdadm.c:1640)
==3929==
==3929== LEAK SUMMARY:
==3929== definitely lost: 184 bytes in 1 blocks
==3929== indirectly lost: 0 bytes in 0 blocks
==3929== possibly lost: 0 bytes in 0 blocks
==3929== still reachable: 12,807 bytes in 189 blocks
==3929== suppressed: 0 bytes in 0 blocks
==3929== Reachable blocks (those to which a pointer was found) are not shown.
==3929== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3929==
==3929== For lists of detected and suppressed errors, rerun with: -s
==3929== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
<2>
# valgrind --leak-check=full ./mdadm -X /dev/sda
... ...
==4077==
==4077== HEAP SUMMARY:
==4077== in use at exit: 8,944 bytes in 58 blocks
==4077== total heap usage: 161 allocs, 103 frees, 458,399 bytes allocated
==4077==
==4077== 184 bytes in 1 blocks are definitely lost in loss record 13 of 19
==4077== at 0x4C306B5: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4077== by 0x47EB4C: xcalloc (xmalloc.c:62)
==4077== by 0x412885: guess_super_type (util.c:1290)
==4077== by 0x47359F: guess_super (mdadm.h:1222)
==4077== by 0x473C1C: bitmap_file_open (bitmap.c:205)
==4077== by 0x473DB1: ExamineBitmap (bitmap.c:253)
==4077== by 0x408B62: misc_list (mdadm.c:1988)
==4077== by 0x407CEF: main (mdadm.c:1640)
==4077==
==4077== 736 bytes in 4 blocks are definitely lost in loss record 15 of 19
==4077== at 0x4C306B5: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4077== by 0x47EB4C: xcalloc (xmalloc.c:62)
==4077== by 0x412885: guess_super_type (util.c:1290)
==4077== by 0x47359F: guess_super (mdadm.h:1222)
==4077== by 0x473C1C: bitmap_file_open (bitmap.c:205)
==4077== by 0x4742A5: ExamineBitmap (bitmap.c:337)
==4077== by 0x408B62: misc_list (mdadm.c:1988)
==4077== by 0x407CEF: main (mdadm.c:1640)
==4077==
==4077== LEAK SUMMARY:
==4077== definitely lost: 920 bytes in 5 blocks
==4077== indirectly lost: 0 bytes in 0 blocks
==4077== possibly lost: 0 bytes in 0 blocks
==4077== still reachable: 8,024 bytes in 53 blocks
==4077== suppressed: 0 bytes in 0 blocks
==4077== Reachable blocks (those to which a pointer was found) are not shown.
==4077== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==4077==
==4077== For lists of detected and suppressed errors, rerun with: -s
==4077== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
<3>
# valgrind --leak-check=full ./mdadm -a /dev/md0 /dev/sdc
... ...
==4096== Warning: noted but unhandled ioctl 0x1269 with no size/direction hints.
==4096== This could cause spurious value errors to appear.
==4096== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
mdadm: added /dev/sdc
==4096== Syscall param write(buf) points to uninitialised byte(s)
==4096== at 0x512C244: write (in /lib64/libc-2.26.so)
==4096== by 0x57FB706: ??? (in /usr/lib64/libdlm_lt.so.3.0)
==4096== by 0x57FC0F1: dlm_ls_unlock (in /usr/lib64/libdlm_lt.so.3.0)
==4096== by 0x40F84E: cluster_release_dlmlock (util.c:198)
==4096== by 0x40837B: main (mdadm.c:1780)
==4096== Address 0x1ffefffc0e is on thread 1's stack
==4096== in frame #2, created by dlm_ls_unlock (???:)
==4096==
==4096== Syscall param write(buf) points to uninitialised byte(s)
==4096== at 0x512C244: write (in /lib64/libc-2.26.so)
==4096== by 0x57FC4E0: dlm_release_lockspace (in /usr/lib64/libdlm_lt.so.3.0)
==4096== by 0x40F906: cluster_release_dlmlock (util.c:218)
==4096== by 0x40837B: main (mdadm.c:1780)
==4096== Address 0x1ffeffeb5e is on thread 1's stack
==4096== in frame #1, created by dlm_release_lockspace (???:)
==4096==
==4096==
==4096== HEAP SUMMARY:
==4096== in use at exit: 13,737 bytes in 197 blocks
==4096== total heap usage: 278 allocs, 81 frees, 3,253,146 bytes allocated
==4096==
==4096== 184 bytes in 1 blocks are definitely lost in loss record 19 of 30
==4096== at 0x4C306B5: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4096== by 0x47EB4C: xcalloc (xmalloc.c:62)
==4096== by 0x4495E2: match_metadata_desc1 (super1.c:2316)
==4096== by 0x4125CE: super_by_fd (util.c:1213)
==4096== by 0x419258: Manage_subdevs (Manage.c:1344)
==4096== by 0x407398: main (mdadm.c:1477)
==4096==
==4096== 184 bytes in 1 blocks are definitely lost in loss record 20 of 30
==4096== at 0x4C306B5: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4096== by 0x47EB4C: xcalloc (xmalloc.c:62)
==4096== by 0x4127E7: dup_super (util.c:1268)
==4096== by 0x417D73: Manage_add (Manage.c:813)
==4096== by 0x419C3F: Manage_subdevs (Manage.c:1564)
==4096== by 0x407398: main (mdadm.c:1477)
==4096==
==4096== LEAK SUMMARY:
==4096== definitely lost: 368 bytes in 2 blocks
==4096== indirectly lost: 0 bytes in 0 blocks
==4096== possibly lost: 0 bytes in 0 blocks
==4096== still reachable: 13,369 bytes in 195 blocks
==4096== suppressed: 0 bytes in 0 blocks
==4096== Reachable blocks (those to which a pointer was found) are not shown.
==4096== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==4096==
==4096== Use --track-origins=yes to see where uninitialised values come from
==4096== For lists of detected and suppressed errors, rerun with: -s
==4096== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)
```
Thanks,
heming
On 7/26/20 4:14 PM, Wols Lists wrote:
> On 22/07/20 08:20, heming.zhao@suse.com wrote:
>> While creating this patch, I found that ExamineBitmap() leaks memory.
>> I am not sure whether the leak should be fixed
>> (when the mdadm command finishes, all leaked memory is released anyway).
>> IsBitmapDirty() reuses some of the ExamineBitmap() code, and I fixed the leaks only in IsBitmapDirty().
>>
> My gut feel?
>
> Firstly, "do things right" - it should be fixed.
> Second - are you sure this code is not run while mdadm is running as a
> daemon? It's all very well saying it will be released, but mdadm
> could be running for a looonnngg time.
>
> Cheers,
> Wol
>
* Re: [PATCH v2] mdadm/Detail: show correct state for cluster-md array
From: Wols Lists @ 2020-07-26 14:21 UTC
To: heming.zhao, linux-raid; +Cc: neilb, jes
On 26/07/20 10:22, heming.zhao@suse.com wrote:
> Hello Wols,
>
> I have just started learning the mdadm code. Maybe there are historical reasons for leaving the leaks in place.
> I guess the daemon mode you mean is "mdadm --monitor --daemonise ...".
> From a quick look at the code in Monitor.c, this mode checks /proc/mdstat, sends the GET_ARRAY_INFO ioctl, and
> reads some /sys/block/mdX/md/xx files. It never calls ExamineBitmap().
> In the current mdadm code, the only way to reach ExamineBitmap() is the command "mdadm -X /dev/sdX". So, as my last mail said, when the mdadm program finishes, all leaked memory is released.
> Last week, before sending the v2 patch, I ran valgrind to check for memory-related issues, and there are many places that leak, e.g.:
You're learning the mdadm code? Personally, I find it hard to learn
stuff if there's no purpose behind what I'm studying. Treat it as a
learning exercise and fix all the leaks ;-)
As they say, if a job's worth doing it's worth doing well, and you can
learn what stuff is doing while you're working your way through it.
I need to learn my way around mdadm, and I've got a task in mind that'll
teach me a load of it, so all being well I'll soon be following in your
footsteps ...
Cheers,
Wol