* Collecting aged XFS profiles
@ 2017-07-16  0:11 Saurabh Kadekodi
  2017-07-16  2:57 ` Eric Sandeen
  2017-07-17 19:00 ` Stefan Ring
  0 siblings, 2 replies; 19+ messages in thread
From: Saurabh Kadekodi @ 2017-07-16  0:11 UTC (permalink / raw)
  To: linux-xfs

Hi,

I am a PhD student studying file and storage systems and I am currently conducting research on local file system aging. My research aims at understanding realistic aging patterns and analyzing the effects of aging on file system data structures and its performance. For this purpose, I would like to capture characteristics of naturally aged file systems (i.e. not aged via synthetic workload generators).

In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs xfs_db in order to capture the free space fragmentation, file fragmentation, directory fragmentation and overall fragmentation; all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.

Since I do not have access to XFS systems that see a lot of churn, I am reaching out to the XFS community in order to find volunteers willing to run my script and capture their XFS aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.

In case you have any questions or concerns, please let me know.

Thanks,
Saurabh Kadekodi

PS: cc’ing the response and / or the aging profile to saukad@cs.cmu.edu is greatly appreciated.


* Re: Collecting aged XFS profiles
  2017-07-16  0:11 Collecting aged XFS profiles Saurabh Kadekodi
@ 2017-07-16  2:57 ` Eric Sandeen
  2017-07-17 19:00 ` Stefan Ring
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Sandeen @ 2017-07-16  2:57 UTC (permalink / raw)
  To: Saurabh Kadekodi, linux-xfs



On 07/15/2017 07:11 PM, Saurabh Kadekodi wrote:
> Hi,
> 
> I am a PhD student studying file and storage systems and I am currently conducting research on local file system aging. My research aims at understanding realistic aging patterns and analyzing the effects of aging on file system data structures and its performance. For this purpose, I would like to capture characteristics of naturally aged file systems (i.e. not aged via synthetic workload generators).
> 
> In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs xfs_db in order to capture the free space fragmentation, file fragmentation, directory fragmentation and overall fragmentation; all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.
> 
> Since I do not have access to XFS systems that see a lot of churn, I am reaching out to the XFS community in order to find volunteers willing to run my script and capture their XFS aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.

Just a note - 

Please see
http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25._Is_that_bad.3F
before you put too much faith or emphasis in the frag number you get back ...
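
For reference, the number in question comes from a read-only xfs_db query;
a minimal sketch of collecting it (the device name is a placeholder):

  # "frag" reports (actual extents - ideal extents) / actual extents as a
  # percentage, so an average of only two extents per file already shows
  # up as 50% -- which is the point the FAQ entry above makes.
  $ xfs_db -r -c frag /dev/sdXN
  $ xfs_db -r -c freesp /dev/sdXN    # free space histogram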

-Eric

> In case you have any questions or concerns, please let me know.
> 
> Thanks,
> Saurabh Kadekodi


* Re: Collecting aged XFS profiles
  2017-07-16  0:11 Collecting aged XFS profiles Saurabh Kadekodi
  2017-07-16  2:57 ` Eric Sandeen
@ 2017-07-17 19:00 ` Stefan Ring
  2017-07-17 23:48   ` Dave Chinner
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Ring @ 2017-07-17 19:00 UTC (permalink / raw)
  To: Saurabh Kadekodi; +Cc: linux-xfs

On Sun, Jul 16, 2017 at 2:11 AM, Saurabh Kadekodi <saukad@cs.cmu.edu> wrote:
> Hi,
>
> I am a PhD student studying file and storage systems and I am currently conducting research on local file system aging. My research aims at understanding realistic aging patterns and analyzing the effects of aging on file system data structures and its performance. For this purpose, I would like to capture characteristics of naturally aged file systems (i.e. not aged via synthetic workload generators).
>
> In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs xfs_db in order to capture the free space fragmentation, file fragmentation, directory fragmentation and overall fragmentation; all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.
>
> Since I do not have access to XFS systems that see a lot of churn, I am reaching out to the XFS community in order to find volunteers willing to run my script and capture their XFS aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.
>
> In case you have any questions or concerns, please let me know.

I have a nicely aged filesystem (1 TB) on our dev server with around
10 million files on it. I will not run a script that executes two
xfs_io calls *for each file* on it. Why don't you just use Python's
stat.stat to get at the ctime and the size?


* Re: Collecting aged XFS profiles
  2017-07-17 19:00 ` Stefan Ring
@ 2017-07-17 23:48   ` Dave Chinner
  2017-07-18  5:45     ` Saurabh Kadekodi
  2017-07-19  7:59     ` Stefan Ring
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2017-07-17 23:48 UTC (permalink / raw)
  To: Stefan Ring; +Cc: Saurabh Kadekodi, linux-xfs

On Mon, Jul 17, 2017 at 09:00:19PM +0200, Stefan Ring wrote:
> On Sun, Jul 16, 2017 at 2:11 AM, Saurabh Kadekodi
> <saukad@cs.cmu.edu> wrote:
> > Hi,
> >
> > I am a PhD student studying file and storage systems and I am
> > currently conducting research on local file system aging. My
> > research aims at understanding realistic aging patterns and
> > analyzing the effects of aging on file system data structures
> > and its performance. For this purpose, I would like to capture
> > characteristics of naturally aged file systems (i.e. not aged
> > via synthetic workload generators).

Hi Saurabh - it's a great idea to do this, but I suspect you might
want to spend some more time learning about the mechanisms
and policies XFS uses to prevent aging and maintain performance. I'm
suggesting this because knowing what the filesystem is trying to do
will drastically change your idea of what information needs to be
gathered....

> > In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs xfs_db in order to capture the free space fragmentation, file fragmentation, directory fragmentation and overall fragmentation; all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.
> >
> > Since I do not have access to XFS systems that see a lot of churn, I am reaching out to the XFS community in order to find volunteers willing to run my script and capture their XFS aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.
> >
> > In case you have any questions or concerns, please let me know.
> 
> I have a nicely aged filesystem (1 TB) on our dev server with around
> 10 million files on it. I will not run a script that executes two
> xfs_io calls *for each file* on it. Why don't you just use Python's
> stat.stat to get at the ctime and the size?

Ok, had a look at the script. You can replace most of it with
pretty much one line.

$ find <dir> -exec stat -c "%n %Z %s" {} \;

Processing the dirents to get the "distribution stats" could be done
by piping the output into a five line awk script. I'll leave that
as an exercise for the reader.
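
A rough sketch of what such a pipeline could look like (the power-of-two
bucketing here is an assumption, not anything fsagestats specifies; %n is
dropped so that filenames with spaces don't break the field splitting):

  $ find <dir> -xdev -type f -exec stat -c "%Z %s" {} + |
    awk -v now="$(date +%s)" '
      { a = now - $1; ab = 0; while (a >= 2^ab) ab++;    # age bucket
        sb = 0; while ($2 >= 2^sb) sb++;                 # size bucket
        ages[ab]++; sizes[sb]++ }
      END { for (b in ages)  printf "age  < 2^%d s: %d\n", b, ages[b]
            for (b in sizes) printf "size < 2^%d B: %d\n", b, sizes[b] }'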

IMO, the script is not gathering anything particularly useful about
how the filesystem has aged. The information being gathered doesn't
tell us anything useful about how the allocator is performing for
the given workload, nor does it provide insight into the locality
characteristics and fragmentation of related files and directories
which directly influence IO (and hence filesystem) performance.

e.g. if the inode64 allocator is in use, then all the files in a
directory should be in the same physical region. As such, a key sign
of an aged filesystem is that the allocator is not able to maintain
the desired locality relationships between files.
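
As a rough illustration of that locality (the directory path is a
placeholder, and this is heavier than Stefan would like on a large tree):
counting the AG of each file's first data extent in one directory shows
whether siblings still land in the same region.

  $ for f in /some/dir/*; do
        [ -f "$f" ] || continue
        # line 3 of "xfs_bmap -v" output is the first extent; field 4 is its AG
        xfs_bmap -v "$f" | awk 'NR==3 {print $4}'
    done | sort -n | uniq -c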

To analyse such things, maybe consider gathering obfuscated metadump
images rather than asking people to run scripts that gather limited
information.  That way you can develop scripts to extract the
information your research requires from the filesystem images you
received, rather than try to draw tenuous conclusions from a limited
data set...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Collecting aged XFS profiles
  2017-07-17 23:48   ` Dave Chinner
@ 2017-07-18  5:45     ` Saurabh Kadekodi
  2017-07-19  7:59     ` Stefan Ring
  1 sibling, 0 replies; 19+ messages in thread
From: Saurabh Kadekodi @ 2017-07-18  5:45 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Stefan Ring, linux-xfs

Hi Dave,

Thanks for the detailed reply. I was unaware of the sophisticated tools XFS has, and since I am also looking at other file systems in parallel, I missed out on exploring these utilities. xfs_metadump seems extremely apt for my study since it will essentially provide me with the complete metadata, allowing me to query free space fragmentation, file fragmentation, etc., without requiring people to conduct a potentially expensive file system tree walk, as would have been the case for Stefan.

I am completely fine with people running xfs_metadump on their aged file systems. It would be great if they could also run xfs_info on their mount point so that I know the file system size, block size, etc., which I need in order to restore and analyze their obfuscated metadata dump.

I believe a tar.gz of the metadump file should be small enough to be attached to an email. In case it is too large, you can either create a pull request by forking my GitHub project (https://github.com/saurabhkadekodi/fsagestats) and adding your data to the aged_file_system_profiles directory, or let me know and I can arrange to upload the data to a server hosted at my school.
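
A minimal sketch of the steps being asked for here (device and mount point
are placeholders; xfs_metadump obfuscates file names by default and is best
run against an unmounted or quiesced filesystem):

  $ xfs_info /mount/point > xfs_info.txt
  $ xfs_metadump -g /dev/sdXN xfs-meta.img
  $ tar czf xfs_profile.tar.gz xfs_info.txt xfs-meta.img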

Thanks,
Saurabh

> On Jul 17, 2017, at 4:48 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> On Mon, Jul 17, 2017 at 09:00:19PM +0200, Stefan Ring wrote:
>> On Sun, Jul 16, 2017 at 2:11 AM, Saurabh Kadekodi
>> <saukad@cs.cmu.edu> wrote:
>>> Hi,
>>> 
>>> I am a PhD student studying file and storage systems and I am
>>> currently conducting research on local file system aging. My
>>> research aims at understanding realistic aging patterns and
>>> analyzing the effects of aging on file system data structures
>>> and its performance. For this purpose, I would like to capture
>>> characteristics of naturally aged file systems (i.e. not aged
>>> via synthetic workload generators).
> 
> Hi Saurabh - it's a great idea to do this, but I suspect you might
> want to spend some more time learning about the mechanisms
> and policies XFS uses to prevent aging and maintain performance. I'm
> suggesting this because knowing what the filesystem is trying to do
> will drastically change your idea of what information needs to be
> gathered....
> 
>>> In order to facilitate this profile capture, I have written a shell / python based profiling tool (fsagestats - https://github.com/saurabhkadekodi/fsagestats)  that does a file system tree walk and captures different characteristics (file age, file size and directory depth) of files and directories and produces distributions. I do not care about file names or data within each file. It also runs xfs_db in order to capture the free space fragmentation, file fragmentation, directory fragmentation and overall fragmentation; all of which are directly correlated with the file system performance. It dumps the results in the results dir, which is to be specified when you run fsagestats. You can send me the aging profile by tarring up the results directory and sending it via email.
>>> 
>>> Since I do not have access to XFS systems that see a lot of churn, I am reaching out to the XFS community in order to find volunteers willing to run my script and capture their XFS aging profile. Please feel free to modify the script as per your installation or as you see fit. Since fsagestats collects no private information, I eventually intend to host these profiles publicly (unless explicitly requested not to) to aid other researchers / enthusiasts.
>>> 
>>> In case you have any questions or concerns, please let me know.
>> 
>> I have a nicely aged filesystem (1 TB) on our dev server with around
>> 10 million files on it. I will not run a script that executes two
>> xfs_io calls *for each file* on it. Why don't you just use Python's
>> stat.stat to get at the ctime and the size?
> 
> Ok, had a look at the script. You can replace most of it with
> pretty much one line.
> 
> $ find <dir> -exec stat -c "%n %Z %s" {} \;
> 
> Processing the dirents to get the "distribution stats" could be done
> by piping the output into a five line awk script. I'll leave that
> as an exercise for the reader.
> 
> IMO, the script is not gathering anything particularly useful about
> how the filesystem has aged. The information being gathered doesn't
> tell us anything useful about how the allocator is performing for
> the given workload, nor does it provide insight into the locality
> characteristics and fragmentation of related files and directories
> which directly influence IO (and hence filesystem) performance.
> 
> e.g. if the inode64 allocator is in use, then all the files in a
> directory should be in the same physical region. As such, a key sign
> of an aged filesystem is that the allocator is not able to maintain
> the desired locality relationships between files.
> 
> To analyse such things, maybe consider gathering obfuscated metadump
> images rather than asking people to run scripts that gather limited
> information.  That way you can develop scripts to extract the
> information your research requires from the filesystem images you
> received, rather than try to draw tenuous conclusions from a limited
> data set...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com



* Re: Collecting aged XFS profiles
  2017-07-17 23:48   ` Dave Chinner
  2017-07-18  5:45     ` Saurabh Kadekodi
@ 2017-07-19  7:59     ` Stefan Ring
  2017-07-19 15:20       ` Eric Sandeen
  1 sibling, 1 reply; 19+ messages in thread
From: Stefan Ring @ 2017-07-19  7:59 UTC (permalink / raw)
  To: linux-xfs

On Tue, Jul 18, 2017 at 1:48 AM, Dave Chinner <david@fromorbit.com> wrote:
> To analyse such things, maybe consider gathering obfuscated metadump
> images rather than asking people to run scripts that gather limited
> information.  That way you can develop scripts to extract the
> information your research requires from the filesystem images you
> received, rather than try to draw tenuous conclusions from a limited
> data set...

I have created a metadump that is 1GB in size, xz-compressed. However,
by running strings on it I find that there are many identifiable
remains inside, and I cannot legally pass this on. xfsprogs is
xfsprogs-3.1.1-10.el6, which is obviously really old. I'm reluctant to
just run a newer version on this production machine; after all, I
once almost brought it down by running xfs_bmap on a heavily
fragmented file.

The question is: can I import this metadata image in a VM and recreate
the metadata image from there, using modern xfsprogs? Will this
preserve most of the relevant information?


* Re: Collecting aged XFS profiles
  2017-07-19  7:59     ` Stefan Ring
@ 2017-07-19 15:20       ` Eric Sandeen
  2017-07-19 21:08         ` Stefan Ring
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2017-07-19 15:20 UTC (permalink / raw)
  To: Stefan Ring, linux-xfs

On 07/19/2017 02:59 AM, Stefan Ring wrote:
> On Tue, Jul 18, 2017 at 1:48 AM, Dave Chinner <david@fromorbit.com> wrote:
>> To analyse such things, maybe consider gathering obfuscated metadump
>> images rather than asking people to run scripts that gather limited
>> information.  That way you can develop scripts to extract the
>> information your research requires from the filesystem images you
>> received, rather than try to draw tenuous conclusions from a limited
>> data set...
> 
> I have created a metadump that is 1GB in size, xz-compressed. However,
> by running strings on it I find that there are many identifiable
> remains inside, and I cannot legally pass this on. xfsprogs is
> xfsprogs-3.1.1-10.el6, which is obviously really old. I'm reluctant to
> just run a newer version on this production machine; after all I've
> once almost brought it down by running xfs_bmap on a heavily
> fragmented file.

newer metadump should correct that problem, and is a read-only tool,
so should be (tm) perfectly safe (tm).  You could run it out of a built
git repo, via the xfs_db commands.
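
A rough sketch of doing that (the repo URL is assumed to be the usual
kernel.org xfsprogs-dev tree; depending on the tree you may need to run
./configure or "make configure" before the build):

  $ git clone git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git
  $ cd xfsprogs-dev && make
  $ ./db/xfs_db -V    # the in-tree binary, used in the commands below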

> The question is: can I import this metadata image in a VM and recreate
> the metadata image from there, using modern xfsprogs? Will this
> preserve most of the relevant information?

yes, that would work too. (mdrestore followed by or piped through metadump)
If you find significant strings in that result please let me know :)
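
Roughly like this (file names are placeholders; the xfs_db flags follow the
same pattern Stefan uses below, just writing to a file instead of stdout):

  $ xfs_mdrestore old-meta.dump restored.img
  $ ./db/xfs_db -f -i -p xfs_metadump -c "metadump -e -g -w new-meta.dump" restored.img
  $ strings new-meta.dump | less    # spot-check for leftover identifiable strings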

Thanks,
-Eric


* Re: Collecting aged XFS profiles
  2017-07-19 15:20       ` Eric Sandeen
@ 2017-07-19 21:08         ` Stefan Ring
  2017-07-19 22:00           ` Eric Sandeen
  2017-07-20  3:02           ` Dave Chinner
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Ring @ 2017-07-19 21:08 UTC (permalink / raw)
  To: linux-xfs

On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> On 07/19/2017 02:59 AM, Stefan Ring wrote:
>> I have created a metadump that is 1GB in size, xz-compressed. However,
>> by running strings on it I find that there are many identifiable
>> remains inside, and I cannot legally pass this on.
>
> newer metadump should correct that problem, and is a read-only tool,
> so should be (tm) perfectly safe (tm).  You could run it out of a built
> git repo, via the xfs_db commands.
>
>> The question is: can I import this metadata image in a VM and recreate
>> the metadata image from there, using modern xfsprogs? Will this
>> preserve most of the relevant information?
>
> yes, that would work too. (mdrestore followed by or piped through metadump)
> If you find significant strings in that result please let me know :)

There is still quite a lot of stuff that should not be there (pasted
selectively while scrolling over it via less):

File names:
10-stdio
6-log-shell_2-stdio.bz2
8-log-shell_4-stdio
:(%40-log-MasterShellCommand_1-stdio.bz2
2-log-hg_periotheus-stdio.bz2
:(#37-log-MasterShellCommand-stdio.bz2
42-log-systemcfg-1-stdio
2060
0:&'1937-log-MasterShellCommand_1-stdio.bz2
42-log-systemcfg-2-stdio
38-log-hg_periotheus-stdio.bz2
40-log-shell_7-stdio
1-log-shell_3-bootstrap.log.bz2
2-log-shell_3-bootstrap.log.bz2
38-log-shell_2-stdio.bz2
41-log-shell-stdio
_9-stdio.bz2
4-log-shell-stdio
38-log-shell_3-stdio.bz2
bootstrap.log.bz2
5-stdio
R       Um
43-log-shell_5-stdio
38-log-shell_6-test.log
43-log-systemcfg-1-stdio
38-log-shell_7-stdio
6-stdio.bz2
41-log-shell_1-stdio
2-log-shell_1-stdio
3-stdio
38-log-shell_8-stdio.bz2
43-log-systemcfg-2-stdio
38-log-shell_9-stdio.bz2

Random contents:
# This file is also read by man in order to find how to call nroff, less, etc.,
# and to determine the correspondence between extensions and decompressors.
# MANBIN                /usr/local/bin/man
# Every automatically generated MANPATH includes these fields
MANPATH /usr/man
MANPATH /usr/share/man
MANPATH /usr/local/man
MANPATH /usr/local/share/man
MANPATH /usr/X11R6/man
# Uncomment if you want to include one of these by default
# MANPATH       /opt/*/man
# MANPATH       /usr/lib/*/man
# MANPATH       /usr/share/*/man
# MANPATH       /usr/kerberos/man
# Set up PATH to MANPATH mapping
# If people ask for "man foo" and have "/dir/bin/foo" in their PATH
# and the docs are found in "/dir/man", then no mapping is required.
# The below mappings are superfluous when the right hand side is
# in the mandatory manpath already, but will keep man from statting
# lots of other nearby files and directories.
MANPATH_MAP     /bin                    /usr/share/man
MANPATH_MAP     /sbin                   /usr/share/man
MANPATH_MAP     /usr/bin                /usr/share/man
MANPATH_MAP     /usr/sbin               /usr/share/man
MANPATH_MAP     /usr/local/bin          /usr/local/share/man
MANPATH_MAP     /usr/local/sbin         /usr/local/share/man
MANPATH_MAP     /usr/X11R6/bin          /usr/X11R6/man
MANPATH_MAP     /usr/bin/X11            /usr/X11R6/man
MANPATH_MAP     /usr/bin/mh             /usr/share/man
# NOAUTOPATH keeps man from automatically adding directories that look like
# manual page directories to the path.
#NOAUTOPATH

Some stray file names in the middle of nowhere:
0pAB
0Lm_vrm
0VRgkg4YiYJ4
y       ?
Jm#\
Ym_v8j
Pstructure.yaml
i-EPEL
pRPM-GPG-KEY-beta
V`:h
Y[uG$P
_;bA
YmP|
@dataKN
PV3SspqG
Y%GE
$Y%GE
01.66TI
u8U7
.bau
+_weakrefset.pychangeset_and_sha256sums.sh
jni_create_stap.c
jni_desc
update_tarballs.sh
pEtWuoffV_SJ57dR3dTyogT44qdBy-BGDu
xdxX8Qs
e+X>
10dTGV  S.

Log files:
2015-04-15 16:25:34 (140466167215872)   File
"/opt/vtse/lib/python2.7/tempfile.py", line 300, in mkstemp
2015-04-15 16:25:34 (140466167215872)     return _mkstemp_inner(dir,
prefix, suffix, flags)
2015-04-15 16:25:34 (140466167215872)   File
"/opt/vtse/lib/python2.7/tempfile.py", line 235, in _mkstemp_inner
2015-04-15 16:25:34 (140466167215872)     fd = _os.open(file, flags, 0600)
2015-04-15 16:25:34 (140466167215872) OSError: [Errno 2] No such file
or directory:

Some SVG:
w3" y2="101.5" stroke="#000000" stroke-width="2.0" />
    <line x1="184.8" y1="105.0" x2="195.3" y2="108.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="210.0" y1="105.0" x2="199.5" y2="108.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="210.0" y1="105.0" x2="199.5" y2="101.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="184.8" y1="105.0" x2="210.0" y2="105.0" stroke="#000000"
stroke-width="2.0" />
</g>
<line x1="197.4" y1="112.0" x2="197.4" y2="133.0" stroke="#000000"
stroke-width="2.0" />
<line x1="4.2" y1="133.0" x2="197.4" y2="133.0" stroke="#000000"
stroke-width="2.0" />
<line x1="197.4" y1="133.0" x2="617.4" y2="133.0" stroke="#000000"
stroke-width="2.0" />
<line x1="4.2" y1="133.0" x2="4.2" y2="189.0" stroke="#000000"
stroke-width="2.0" />
<line x1="197.4" y1="133.0" x2="197.4" y2="154.0" stroke="#000000"
stroke-width="2.0" />
<line x1="449.4" y1="147.0" x2="600.6" y2="147.0" stroke="#000000"
stroke-width="2.0" />
<line x1="617.4" y1="133.0" x2="617.4" y2="189.0" stroke="#000000"
stroke-width="2.0" />
    <line x1="184.8" y1="161.0" x2="195.3" y2="157.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="184.8" y1="161.0" x2="195.3" y2="164.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="210.0" y1="161.0" x2="199.5" y2="164.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="210.0" y1="161.0" x2="199.5" y2="157.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="184.8" y1="161.0" x2="210.0" y2="161.0" stroke="#000000"
stroke-width="2.0" />
</g>
    <line x1="445.2" y1="161.0" x2="434.7" y2="164.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="445.2" y1="161.0" x2="434.7" y2="157.5" stroke="#000000"
stroke-width="2.0" />
    <line x1="428.4" y1="161.0" x2="445.2" y2="161.0" stroke="#000000"
stroke-width="4.0" />
</g>
<line x1="449.4" y1="147.0" x2="449.4" y2="175.0" stroke="#000000"
stroke-width="2.0" />
<line x1="600.6" y1="147.0" x2="600.6" y2="175.0" stroke="#000000"
stroke-width="2.0" />
<line x1="197.4" y1="168.0" x2="197.4" y2="189.0" stroke="#000000"
stroke-width="2.0" />

I used the current git version like this:
./db/xfs_db -f -i -p xfs_metadump -c "metadump -e -g -w -"
xfs-meta.rawimg | strings

commit e116c5c4511bbc2d98579817232258d57a1f1777
Author: Eric Sandeen <sandeen@redhat.com>
Date:   Fri May 5 13:25:49 2017 -0500


* Re: Collecting aged XFS profiles
  2017-07-19 21:08         ` Stefan Ring
@ 2017-07-19 22:00           ` Eric Sandeen
  2017-07-20  7:52             ` Stefan Ring
  2017-07-20  3:02           ` Dave Chinner
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2017-07-19 22:00 UTC (permalink / raw)
  To: Stefan Ring, linux-xfs

On 07/19/2017 04:08 PM, Stefan Ring wrote:
> On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>> On 07/19/2017 02:59 AM, Stefan Ring wrote:
>>> I have created a metadump that is 1GB in size, xz-compressed. However,
>>> by running strings on it I find that there are many identifiable
>>> remains inside, and I cannot legally pass this on.
>>
>> newer metadump should correct that problem, and is a read-only tool,
>> so should be (tm) perfectly safe (tm).  You could run it out of a built
>> git repo, via the xfs_db commands.
>>
>>> The question is: can I import this metadata image in a VM and recreate
>>> the metadata image from there, using modern xfsprogs? Will this
>>> preserve most of the relevant information?
>>
>> yes, that would work too. (mdrestore followed by or piped through metadump)
>> If you find significant strings in that result please let me know :)
> 
> There is still quite a lot of stuff that should not be there (pasted
> selectively while scrolling over it via less):


> File names:
> 10-stdio
> 6-log-shell_2-stdio.bz2
> 8-log-shell_4-stdio
> :(%40-log-MasterShellCommand_1-stdio.bz2

<snip way too many strings>

> 
> I used the current git version like this:
> ./db/xfs_db -f -i -p xfs_metadump -c "metadump -e -g -w -"
> xfs-meta.rawimg | strings

ok, that shoulda worked... I wonder how to debug this, if you can't
legally share the problematic image with me to investigate...

I put a lot of effort into selectively zeroing out unused portions
of metablocks a couple years back, I am surprised that this much remains.

I wonder if something regressed ...

-Eric

> commit e116c5c4511bbc2d98579817232258d57a1f1777
> Author: Eric Sandeen <sandeen@redhat.com>
> Date:   Fri May 5 13:25:49 2017 -0500


* Re: Collecting aged XFS profiles
  2017-07-19 21:08         ` Stefan Ring
  2017-07-19 22:00           ` Eric Sandeen
@ 2017-07-20  3:02           ` Dave Chinner
  2017-07-20  3:55             ` Eric Sandeen
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2017-07-20  3:02 UTC (permalink / raw)
  To: Stefan Ring; +Cc: linux-xfs

On Wed, Jul 19, 2017 at 11:08:41PM +0200, Stefan Ring wrote:
> On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> > On 07/19/2017 02:59 AM, Stefan Ring wrote:
> >> I have created a metadump that is 1GB in size, xz-compressed. However,
> >> by running strings on it I find that there are many identifiable
> >> remains inside, and I cannot legally pass this on.
> >
> > newer metadump should correct that problem, and is a read-only tool,
> > so should be (tm) perfectly safe (tm).  You could run it out of a built
> > git repo, via the xfs_db commands.
> >
> >> The question is: can I import this metadata image in a VM and recreate
> >> the metadata image from there, using modern xfsprogs? Will this
> >> preserve most of the relevant information?
> >
> > yes, that would work too. (mdrestore followed by or piped through metadump)
> > If you find significant strings in that result please let me know :)
> 
> There is still quite a lot of stuff that should not be there (pasted
> selectively while scrolling over it via less):

[snip]

This stuff could all be in the journal. This is why I said "metadump
will first need to be modified to avoid dumping the journal".
Shouldn't be too hard - we already avoid dumping the journal when it
is external....
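
A quick read-only way to check whether stale data is likely to be sitting
in the on-disk log (device name is a placeholder; best run while the
filesystem is unmounted or frozen):

  $ xfs_logprint -t /dev/sdXN | head    # reports whether the log state is clean or dirty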

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Collecting aged XFS profiles
  2017-07-20  3:02           ` Dave Chinner
@ 2017-07-20  3:55             ` Eric Sandeen
  2017-07-20  4:38               ` Dave Chinner
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2017-07-20  3:55 UTC (permalink / raw)
  To: Dave Chinner, Stefan Ring; +Cc: linux-xfs



On 07/19/2017 10:02 PM, Dave Chinner wrote:
> On Wed, Jul 19, 2017 at 11:08:41PM +0200, Stefan Ring wrote:
>> On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>> On 07/19/2017 02:59 AM, Stefan Ring wrote:
>>>> I have created a metadump that is 1GB in size, xz-compressed. However,
>>>> by running strings on it I find that there are many identifiable
>>>> remains inside, and I cannot legally pass this on.
>>>
>>> newer metadump should correct that problem, and is a read-only tool,
>>> so should be (tm) perfectly safe (tm).  You could run it out of a built
>>> git repo, via the xfs_db commands.
>>>
>>>> The question is: can I import this metadata image in a VM and recreate
>>>> the metadata image from there, using modern xfsprogs? Will this
>>>> preserve most of the relevant information?
>>>
>>> yes, that would work too. (mdrestore followed by or piped through metadump)
>>> If you find significant strings in that result please let me know :)
>>
>> There is still quite a lot of stuff that should not be there (pasted
>> selectively while scrolling over it via less):
> 
> [snip]
> 
> This stuff could all be in the journal. This is why I said "metadump
> will first need to be modified to avoid dumping the journal".
> Shouldn't be too hard - we already avoid dumping the journal when it
> is external....

We already (in effect) avoid dumping it when it's clean:

        dirty = xlog_is_dirty(mp, &log, &x, 0);

        switch (dirty) {
        case 0:
                /* clear out a clean log */
                if (show_progress)
                        print_progress("Zeroing clean log");
		...
		libxfs_log_clear( ...

and if it /wasn't/ clean, he'd have gotten the warning:

_("Warning: log recovery of an obfuscated metadata image can leak "
"unobfuscated metadata and/or cause image corruption.  If possible, "
"please mount the filesystem to clean the log, or disable obfuscation."));

-Eric

> Cheers,
> 
> Dave.
> 


* Re: Collecting aged XFS profiles
  2017-07-20  3:55             ` Eric Sandeen
@ 2017-07-20  4:38               ` Dave Chinner
  2017-07-20 14:24                 ` Eric Sandeen
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2017-07-20  4:38 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Stefan Ring, linux-xfs

On Wed, Jul 19, 2017 at 10:55:23PM -0500, Eric Sandeen wrote:
> 
> 
> On 07/19/2017 10:02 PM, Dave Chinner wrote:
> > On Wed, Jul 19, 2017 at 11:08:41PM +0200, Stefan Ring wrote:
> >> On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> >>> On 07/19/2017 02:59 AM, Stefan Ring wrote:
> >>>> I have created a metadump that is 1GB in size, xz-compressed. However,
> >>>> by running strings on it I find that there are many identifiable
> >>>> remains inside, and I cannot legally pass this on.
> >>>
> >>> newer metadump should correct that problem, and is a read-only tool,
> >>> so should be (tm) perfectly safe (tm).  You could run it out of a built
> >>> git repo, via the xfs_db commands.
> >>>
> >>>> The question is: can I import this metadata image in a VM and recreate
> >>>> the metadata image from there, using modern xfsprogs? Will this
> >>>> preserve most of the relevant information?
> >>>
> >>> yes, that would work too. (mdrestore followed by or piped through metadump)
> >>> If you find significant strings in that result please let me know :)
> >>
> >> There is still quite a lot of stuff that should not be there (pasted
> >> selectively while scrolling over it via less):
> > 
> > [snip]
> > 
> > This stuff could all be in the journal. This is why I said "metadump
> > will first need to be modified to avoid dumping the journal".
> > Shouldn't be too hard - we already avoid dumping the journal when it
> > is external....
> 
> We already (in effect) avoid dumping it when it's clean:
> 
>         dirty = xlog_is_dirty(mp, &log, &x, 0);
> 
>         switch (dirty) {
>         case 0:
>                 /* clear out a clean log */
>                 if (show_progress)
>                         print_progress("Zeroing clean log");
> 		...
> 		libxfs_log_clear( ...
> 
> and if it /wasn't/ clean, he'd have gotten the warning:
> 
> _("Warning: log recovery of an obfuscated metadata image can leak "
> "unobfuscated metadata and/or cause image corruption.  If possible, "
> "please mount the filesystem to clean the log, or disable obfuscation."));

Oh, that's in the for-next branch, not the master branch. No wonder
I couldn't find this at first. :/

FWIW, people building from the git tree are probably using the
master branch (v4.11.0), not the for-next branch. I used to stage
all the -rcX releases in the master branch, which was why I was
confused by this at first. Never mind, I'll update my scripts that
haven't been pulling the xfsprogs for-next branch from kernel.org...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Collecting aged XFS profiles
  2017-07-19 22:00           ` Eric Sandeen
@ 2017-07-20  7:52             ` Stefan Ring
  2017-07-20 14:27               ` Eric Sandeen
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Ring @ 2017-07-20  7:52 UTC (permalink / raw)
  To: linux-xfs

On Thu, Jul 20, 2017 at 12:00 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
> ok, that shoulda worked... I wonder how to debug this, if you can't
> legally share the problematic image with me to investigate...
>
> I put a lot of effort into selectively zeroing out unused portions
> of metablocks a couple years back, I am surprised that this much remains.
>
> I wonder if something regressed ...

I will see if I can find out which code paths lead to the content
being dumped out.


* Re: Collecting aged XFS profiles
  2017-07-20  4:38               ` Dave Chinner
@ 2017-07-20 14:24                 ` Eric Sandeen
  2017-07-20 22:27                   ` Dave Chinner
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2017-07-20 14:24 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Stefan Ring, linux-xfs



On 07/19/2017 11:38 PM, Dave Chinner wrote:
> On Wed, Jul 19, 2017 at 10:55:23PM -0500, Eric Sandeen wrote:
>>
>>
>> On 07/19/2017 10:02 PM, Dave Chinner wrote:
>>> On Wed, Jul 19, 2017 at 11:08:41PM +0200, Stefan Ring wrote:
>>>> On Wed, Jul 19, 2017 at 5:20 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>>>> On 07/19/2017 02:59 AM, Stefan Ring wrote:
>>>>>> I have created a metadump that is 1GB in size, xz-compressed. However,
>>>>>> by running strings on it I find that there are many identifiable
>>>>>> remains inside, and I cannot legally pass this on.
>>>>>
>>>>> newer metadump should correct that problem, and is a read-only tool,
>>>>> so should be (tm) perfectly safe (tm).  You could run it out of a built
>>>>> git repo, via the xfs_db commands.
>>>>>
>>>>>> The question is: can I import this metadata image in a VM and recreate
>>>>>> the metadata image from there, using modern xfsprogs? Will this
>>>>>> preserve most of the relevant information?
>>>>>
>>>>> yes, that would work too. (mdrestore followed by or piped through metadump)
>>>>> If you find significant strings in that result please let me know :)
>>>>
>>>> There is still quite a lot of stuff that should not be there (pasted
>>>> selectively while scrolling over it via less):
>>>
>>> [snip]
>>>
>>> This stuff could all be in the journal. This is why I said "metadump
>>> will first need to be modified to avoid dumping the journal".
>>> Shouldn't be too hard - we already avoid dumping the journal when it
>>> is external....
>>
>> We already (in effect) avoid dumping it when it's clean:
>>
>>         dirty = xlog_is_dirty(mp, &log, &x, 0);
>>
>>         switch (dirty) {
>>         case 0:
>>                 /* clear out a clean log */
>>                 if (show_progress)
>>                         print_progress("Zeroing clean log");
>> 		...
>> 		libxfs_log_clear( ...
>>
>> and if it /wasn't/ clean, he'd have gotten the warning:
>>
>> _("Warning: log recovery of an obfuscated metadata image can leak "
>> "unobfuscated metadata and/or cause image corruption.  If possible, "
>> "please mount the filesystem to clean the log, or disable obfuscation."));
> 
> Oh, that's in the for-next branch, not the master branch. No wonder
> I couldn't find this at first. :/

Well the clean log handling has been there for a couple of years; the
warning is new, yes.

> FWIW, people building from the git tree are probably using the
> master branch (v4.11.0), not the for-next branch. I used to stage
> all the -rcX releases in the master branch, which was why I was
> confused by this at first. Never mind, I'll update my scripts that
> haven't been pulling the xfsprogs for-next branch from kernel.org...

yeah, I was doing that when I was new to the maintainer game, for-next
was rebasable and more forgiving.  Maybe I should change that back.

-Eric

> Cheers,
> 
> Dave.
> 


* Re: Collecting aged XFS profiles
  2017-07-20  7:52             ` Stefan Ring
@ 2017-07-20 14:27               ` Eric Sandeen
  2017-07-20 20:15                 ` Stefan Ring
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Sandeen @ 2017-07-20 14:27 UTC (permalink / raw)
  To: Stefan Ring, linux-xfs

On 07/20/2017 02:52 AM, Stefan Ring wrote:
> On Thu, Jul 20, 2017 at 12:00 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
>> ok, that shoulda worked... I wonder how to debug this, if you can't
>> legally share the problematic image with me to investigate...
>>
>> I put a lot of effort into selectively zeroing out unused portions
>> of metablocks a couple years back, I am surprised that this much remains.
>>
>> I wonder if something regressed ...
> 
> I will see if I can find out which code paths lead to the content
> being dumped out.

Perhaps you could send xfs_info output and offsets of the "safe" strings,
and maybe look at block or sector boundaries of the blocks containing
those "safe" strings to identify magic numbers which would identify
the metadata types ...
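
Roughly along these lines, against a restored raw image rather than the
metadump-format file itself (the string, image name and sector-size
rounding are examples/assumptions):

  $ off=$(grep -obaF 'MANPATH_MAP' restored.img | head -1 | cut -d: -f1)
  # round down to a sector boundary and dump the magic the object starts with
  $ dd if=restored.img bs=1 skip=$(( off / 512 * 512 )) count=16 2>/dev/null | xxd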

-eric


* Re: Collecting aged XFS profiles
  2017-07-20 14:27               ` Eric Sandeen
@ 2017-07-20 20:15                 ` Stefan Ring
  2017-07-20 20:21                   ` Eric Sandeen
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Ring @ 2017-07-20 20:15 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-xfs

On Thu, Jul 20, 2017 at 4:27 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> On 07/20/2017 02:52 AM, Stefan Ring wrote:
>> On Thu, Jul 20, 2017 at 12:00 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>> ok, that shoulda worked... I wonder how to debug this, if you can't
>>> legally share the problematic image with me to investigate...
>>>
>>> I put a lot of effort into selectively zeroing out unused portions
>>> of metablocks a couple years back, I am surprised that this much remains.
>>>
>>> I wonder if something regressed ...
>>
>> I will try if I can find out which code paths lead to the content
>> being dumped out.
>
> Perhaps you could send xfs_info output and offsets of the "safe" strings,
> and maybe look at block or sector boundaries of the blocks containing
> those "safe" strings to identify magic numbers which would identify
> the metadata types ...
>

Ok,

I'm using the for-next branch now (0602fbe880) and inserted something
like this so I can set a breakpoint on my_break.

--- a/db/metadump.c
+++ b/db/metadump.c
@@ -169,6 +169,10 @@ write_index(void)
        return 0;
 }

+void my_break()
+{
+}
+
 /*
  * Return 0 for success, -errno for failure.
  */
@@ -184,6 +188,8 @@ write_buf_segment(
        for (i = 0; i < len; i++, off++, data += BBSIZE) {
                block_index[cur_index] = cpu_to_be64(off);
                memcpy(&block_buffer[cur_index << BBSHIFT], data, BBSIZE);
+               if (memmem(data, BBSIZE, "_mkstemp_inner", 14))
+                       my_break();
                if (++cur_index == num_indices) {
                        ret = write_index();
                        if (ret)

The backtraces are quite similar except for the "stray filenames"
case. I've repeated the exercise for every snippet I've highlighted in
the previous e-mail (in the same order):

"MasterShellCommand"
#0  my_break () at metadump.c:174
#1  0x000000000041e2a6 in write_buf_segment (data=0xe26c00 "XD2F",
off=134667936, len=8) at metadump.c:192
#2  0x000000000041e3d3 in write_buf (buf=0x7fd7b0) at metadump.c:238
#3  0x000000000042163d in process_single_fsb_objects (o=16777216,
s=16833492, c=1, btype=TYP_DIR2, last=16777217)
    at metadump.c:1829
#4  0x0000000000421cdd in process_bmbt_reclist (rp=0xdf5628,
numrecs=34, btype=TYP_DIR2) at metadump.c:1997
#5  0x0000000000421e22 in scanfunc_bmap (block=0xdf5400, agno=0,
agbno=57273, level=0, btype=TYP_BMAPBTD,
    arg=0x7fffffffd994) at metadump.c:2031
#6  0x000000000041eb6c in scan_btree (agno=0, agbno=57273, level=1,
btype=TYP_BMAPBTD, arg=0x7fffffffd994,
    func=0x421d40 <scanfunc_bmap>) at metadump.c:427
#7  0x000000000042236b in process_btinode (dip=0xdd3700,
itype=TYP_DIR2) at metadump.c:2125
#8  0x000000000042276f in process_inode_data (dip=0xdd3700,
itype=TYP_DIR2) at metadump.c:2186
#9  0x00000000004228b0 in process_inode (agno=0, agino=900491,
dip=0xdd3700, free_inode=false) at metadump.c:2231
#10 0x0000000000422e15 in copy_inode_chunk (agno=0, rp=0xafee40) at
metadump.c:2373
#11 0x0000000000422fd5 in scanfunc_ino (block=0xafea00, agno=0,
agbno=25, level=0, btype=TYP_INOBT, arg=0x7fffffffdc44)
    at metadump.c:2433
#12 0x000000000041eb6c in scan_btree (agno=0, agbno=25, level=1,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#13 0x000000000042317e in scanfunc_ino (block=0x78d200, agno=0,
agbno=63095, level=1, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2456
#14 0x000000000041eb6c in scan_btree (agno=0, agbno=63095, level=2,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#15 0x0000000000423274 in copy_inodes (agno=0, agi=0x6cd400) at metadump.c:2489
#16 0x00000000004238b3 in scan_ag (agno=0) at metadump.c:2613
#17 0x00000000004244be in metadump_f (argc=5, argv=0x6cd310) at metadump.c:2895
#18 0x000000000041a544 in main (argc=<value optimized out>,
argv=<value optimized out>) at init.c:207


"MANPATH_MAP"
#0  my_break () at metadump.c:174
#1  0x000000000041e2a6 in write_buf_segment (
    data=0xe38000 " nearby files and
directories.\n#\nMANPATH_MAP\t/bin\t\t\t/usr/share/man\nMANPATH_MAP\t/sbin\t\t\t/usr/share/man\nMANPATH_MAP\t/usr/bin\t\t/usr/share/man\nMANPATH_MAP\t/usr/sbin\t\t/usr/share/man\nMANPATH_MAP\t/usr/local/"...,
off=67559158, len=8) at metadump.c:192
#2  0x000000000041e3d3 in write_buf (buf=0x7fd750) at metadump.c:238
#3  0x000000000042163d in process_single_fsb_objects (o=8388608,
s=8444894, c=1, btype=TYP_DIR2, last=8388609)
    at metadump.c:1829
#4  0x0000000000421cdd in process_bmbt_reclist (rp=0xe2ff84,
numrecs=3, btype=TYP_DIR2) at metadump.c:1997
#5  0x00000000004226db in process_exinode (dip=0xe2ff00,
itype=TYP_DIR2) at metadump.c:2157
#6  0x000000000042275c in process_inode_data (dip=0xe2ff00,
itype=TYP_DIR2) at metadump.c:2183
#7  0x00000000004228b0 in process_inode (agno=0, agino=900611,
dip=0xe2ff00, free_inode=false) at metadump.c:2231
#8  0x0000000000422e15 in copy_inode_chunk (agno=0, rp=0xafee50) at
metadump.c:2373
#9  0x0000000000422fd5 in scanfunc_ino (block=0xafea00, agno=0,
agbno=25, level=0, btype=TYP_INOBT, arg=0x7fffffffdc44)
    at metadump.c:2433
#10 0x000000000041eb6c in scan_btree (agno=0, agbno=25, level=1,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#11 0x000000000042317e in scanfunc_ino (block=0x78d200, agno=0,
agbno=63095, level=1, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2456
#12 0x000000000041eb6c in scan_btree (agno=0, agbno=63095, level=2,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#13 0x0000000000423274 in copy_inodes (agno=0, agi=0x6cd400) at metadump.c:2489
#14 0x00000000004238b3 in scan_ag (agno=0) at metadump.c:2613
#15 0x00000000004244be in metadump_f (argc=5, argv=0x6cd310) at metadump.c:2895
#16 0x000000000041a544 in main (argc=<value optimized out>,
argv=<value optimized out>) at init.c:207


"RPM-GPG-KEY-beta"
#0  my_break () at metadump.c:174
#1  0x000000000041e2a6 in write_buf_segment (data=0xf1b600
"INA\375\002\002", off=451728, len=32) at metadump.c:192
#2  0x000000000041e3d3 in write_buf (buf=0x7fd6f0) at metadump.c:238
#3  0x0000000000422e4b in copy_inode_chunk (agno=0, rp=0xafef80) at
metadump.c:2380
#4  0x0000000000422fd5 in scanfunc_ino (block=0xafea00, agno=0,
agbno=25, level=0, btype=TYP_INOBT, arg=0x7fffffffdc44)
    at metadump.c:2433
#5  0x000000000041eb6c in scan_btree (agno=0, agbno=25, level=1,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#6  0x000000000042317e in scanfunc_ino (block=0x78d200, agno=0,
agbno=63095, level=1, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2456
#7  0x000000000041eb6c in scan_btree (agno=0, agbno=63095, level=2,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#8  0x0000000000423274 in copy_inodes (agno=0, agi=0x6cd400) at metadump.c:2489
#9  0x00000000004238b3 in scan_ag (agno=0) at metadump.c:2613
#10 0x00000000004244be in metadump_f (argc=5, argv=0x6cd310) at metadump.c:2895
#11 0x000000000041a544 in main (argc=<value optimized out>,
argv=<value optimized out>) at init.c:207


"_mkstemp_inner"
#0  my_break () at metadump.c:174
#1  0x000000000041e2a6 in write_buf_segment (
    data=0x1388e00 "le.py\", line 300, in mkstemp\n2015-04-15 16:25:34
(140466167215872)     return _mkstemp_inner(dir, prefix, suffix,
flags)\n2015-04-15 16:25:34 (140466167215872)   File
\"/opt/vtse/lib/python2.7/tempfile."...,
    off=67568243, len=8) at metadump.c:192
#2  0x000000000041e3d3 in write_buf (buf=0x7fd750) at metadump.c:238
#3  0x000000000042163d in process_single_fsb_objects (o=8388608,
s=8446030, c=1, btype=TYP_DIR2, last=8388609)
    at metadump.c:1829
#4  0x0000000000421cdd in process_bmbt_reclist (rp=0x1382b84,
numrecs=3, btype=TYP_DIR2) at metadump.c:1997
#5  0x00000000004226db in process_exinode (dip=0x1382b00,
itype=TYP_DIR2) at metadump.c:2157
#6  0x000000000042275c in process_inode_data (dip=0x1382b00,
itype=TYP_DIR2) at metadump.c:2183
#7  0x00000000004228b0 in process_inode (agno=0, agino=918921,
dip=0x1382b00, free_inode=false) at metadump.c:2231
#8  0x0000000000422e15 in copy_inode_chunk (agno=0, rp=0x117bbd0) at
metadump.c:2373
#9  0x0000000000422fd5 in scanfunc_ino (block=0x117b800, agno=0,
agbno=57301, level=0, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2433
#10 0x000000000041eb6c in scan_btree (agno=0, agbno=57301, level=1,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#11 0x000000000042317e in scanfunc_ino (block=0x78d200, agno=0,
agbno=63095, level=1, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2456
#12 0x000000000041eb6c in scan_btree (agno=0, agbno=63095, level=2,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#13 0x0000000000423274 in copy_inodes (agno=0, agi=0x6cd400) at metadump.c:2489
#14 0x00000000004238b3 in scan_ag (agno=0) at metadump.c:2613
#15 0x00000000004244be in metadump_f (argc=5, argv=0x6cd310) at metadump.c:2895
#16 0x000000000041a544 in main (argc=<value optimized out>,
argv=<value optimized out>) at init.c:207

SVG
#0  my_break () at metadump.c:174
#1  0x000000000041e2a6 in write_buf_segment (data=0x132f400 "",
off=67567736, len=8) at metadump.c:192
#2  0x000000000041e3d3 in write_buf (buf=0x7fd750) at metadump.c:238
#3  0x000000000042163d in process_single_fsb_objects (o=8388608,
s=8445967, c=1, btype=TYP_DIR2, last=8388609)
    at metadump.c:1829
#4  0x0000000000421cdd in process_bmbt_reclist (rp=0x132b874,
numrecs=2, btype=TYP_DIR2) at metadump.c:1997
#5  0x00000000004226db in process_exinode (dip=0x132b800,
itype=TYP_DIR2) at metadump.c:2157
#6  0x000000000042275c in process_inode_data (dip=0x132b800,
itype=TYP_DIR2) at metadump.c:2183
#7  0x00000000004228b0 in process_inode (agno=0, agino=917784,
dip=0x132b800, free_inode=false) at metadump.c:2231
#8  0x0000000000422e15 in copy_inode_chunk (agno=0, rp=0x117bb50) at
metadump.c:2373
#9  0x0000000000422fd5 in scanfunc_ino (block=0x117b800, agno=0,
agbno=57301, level=0, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2433
#10 0x000000000041eb6c in scan_btree (agno=0, agbno=57301, level=1,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#11 0x000000000042317e in scanfunc_ino (block=0x78d200, agno=0,
agbno=63095, level=1, btype=TYP_INOBT,
    arg=0x7fffffffdc44) at metadump.c:2456
#12 0x000000000041eb6c in scan_btree (agno=0, agbno=63095, level=2,
btype=TYP_INOBT, arg=0x7fffffffdc44,
    func=0x422ecd <scanfunc_ino>) at metadump.c:427
#13 0x0000000000423274 in copy_inodes (agno=0, agi=0x6cd400) at metadump.c:2489
#14 0x00000000004238b3 in scan_ag (agno=0) at metadump.c:2613
#15 0x00000000004244be in metadump_f (argc=5, argv=0x6cd310) at metadump.c:2895
#16 0x000000000041a544 in main (argc=<value optimized out>,
argv=<value optimized out>) at init.c:207


Is this helpful?

Now I also know that my visual inspection had only reached inode 21440 of
11857664, so it is a good thing that I stopped early ;).


* Re: Collecting aged XFS profiles
  2017-07-20 20:15                 ` Stefan Ring
@ 2017-07-20 20:21                   ` Eric Sandeen
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Sandeen @ 2017-07-20 20:21 UTC (permalink / raw)
  To: Stefan Ring; +Cc: linux-xfs



On 07/20/2017 03:15 PM, Stefan Ring wrote:
> On Thu, Jul 20, 2017 at 4:27 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
>> On 07/20/2017 02:52 AM, Stefan Ring wrote:
>>> On Thu, Jul 20, 2017 at 12:00 AM, Eric Sandeen <sandeen@sandeen.net> wrote:
>>>> ok, that shoulda worked... I wonder how to debug this, if you can't
>>>> legally share the problematic image with me to investigate...
>>>>
>>>> I put a lot of effort into selectively zeroing out unused portions
>>>> of metablocks a couple years back, I am surprised that this much remains.
>>>>
>>>> I wonder if something regressed ...
>>>
>>> I will see if I can find out which code paths lead to the content
>>> being dumped out.
>>
>> Perhaps you could send xfs_info output and offsets of the "safe" strings,
>> and maybe look at block or sector boundaries of the blocks containing
>> those "safe" strings to identify magic numbers which would identify
>> the metadata types ...
>>
> 
> Ok,
> 
> I'm using the for-next branch now (0602fbe880) and inserted something
> like this so I can set a breakpoint on my_break.

Thanks, I'll dig into this a bit.  I thought we had an xfstest that looked
for this sort of leaked string problem, but now I'm not finding it... something
to fix, I guess.
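
Presumably the shape of such a test would be something like this sketch
(the xfstests scratch variables and canary name are placeholders here, not
an actual test in the suite):

  $ touch "$SCRATCH_MNT/obfuscation_canary_0f3a9c"
  $ umount "$SCRATCH_MNT"
  $ xfs_metadump "$SCRATCH_DEV" md.img
  $ xfs_mdrestore md.img md.raw
  $ strings md.raw | grep -c obfuscation_canary_0f3a9c    # expect 0 if obfuscation worked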

-Eric


* Re: Collecting aged XFS profiles
  2017-07-20 14:24                 ` Eric Sandeen
@ 2017-07-20 22:27                   ` Dave Chinner
  2017-07-20 22:48                     ` Eric Sandeen
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2017-07-20 22:27 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Stefan Ring, linux-xfs

On Thu, Jul 20, 2017 at 09:24:58AM -0500, Eric Sandeen wrote:
> On 07/19/2017 11:38 PM, Dave Chinner wrote:
> > On Wed, Jul 19, 2017 at 10:55:23PM -0500, Eric Sandeen wrote:
> > FWIW, people building from the git tree are probably using the
> > master branch (v4.11.0), not the for-next branch. I used to stage
> > all the -rcX releases in the master branch, which was why I was
> > confused by this at first. Never mind, I'll update my scripts that
> > haven't been pulling the xfsprogs for-next branch from kernel.org...
> 
> yeah, I was doing that when I was new to the maintainer game, for-next
> was rebasable and more forgiving.  Maybe I should change that back.

Yeah, for-next is more forgiving, but if you are tagging stuff for
releases (even -rc) then it probably should have been merged back
into the master branch and then tagged. I've always worked under the
assumption that -rc releases are "stable" release points because you
are asking the wider public to use and test the release....

i.e. If stuff needs to be undone after a -rc release then we should
use reverts that explain why something was undone - rebasing the
entire dev branch to remove the problem from recorded history means
we can't easily find out why that thing caused problems years down
the track.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Collecting aged XFS profiles
  2017-07-20 22:27                   ` Dave Chinner
@ 2017-07-20 22:48                     ` Eric Sandeen
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Sandeen @ 2017-07-20 22:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Stefan Ring, linux-xfs



On 07/20/2017 05:27 PM, Dave Chinner wrote:
> On Thu, Jul 20, 2017 at 09:24:58AM -0500, Eric Sandeen wrote:
>> On 07/19/2017 11:38 PM, Dave Chinner wrote:
>>> On Wed, Jul 19, 2017 at 10:55:23PM -0500, Eric Sandeen wrote:
>>> FWIW, people building from the git tree are probably using the
>>> master branch (v4.11.0), not the for-next branch. I used to stage
>>> all the -rcX releases in the master branch, which was why I was
>>> confused by this at first. Never mind, I'll update my scripts that
>>> haven't been pulling the xfsprogs for-next branch from kernel.org...
>>
>> yeah, I was doing that when I was new to the maintainer game, for-next
>> was rebasable and more forgiving.  Maybe I should change that back.
> 
> Yeah, for-next is more forgiving, but if you are tagging stuff for
> releases (even -rc) then it probably should have been merged back
> into the master branch and then tagged. I've always worked under the
> assumption that -rc releases are "stable" release points because you
> are asking the wider public to use and test the release....
> 
> i.e. If stuff needs to be undone after a -rc release then we should
> use reverts that explain why something was undone - rebasing the
> entire dev branch to remove the problem from recorded history means
> we can't easily find out why that thing caused problems years down
> the track.

Fair enough, I'll switch back to that method, pushing things to
master more often, at least by the -rc stage.

-Eric


