From: "Valdis Klētnieks" <valdis.kletnieks@vt.edu>
To: Sahibzada Irfanullah <irfan.gomalian@gmail.com>
Cc: kernelnewbies@kernelnewbies.org
Subject: Re: Generating Log of Guest Physical Addresses from a Kernel Function and Perform Analysis at Runtime
Date: Tue, 24 Sep 2019 14:55:21 -0400
Message-ID: <264319.1569351321@turing-police>
In-Reply-To: <CAGaWEbrZHsZ_EyyfZQF7Wui_v8X5uG0MMe1ZFnXR-xFAaqVNrw@mail.gmail.com>


On Tue, 24 Sep 2019 20:26:36 +0900, Sahibzada Irfanullah said:

> After having a reasonable amount  of log data,

If you're trying to figure out how the kernel memory manager is working, you're
probably better off using 'perf' or one of the other tracing tools already in
the kernel. For starters, those tools can give you things like stack
tracebacks, so you know who is asking for a page, who is *releasing* a page,
and so on.
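
For example, here's a rough starting point (tracepoint names are current as of
recent kernels - check 'perf list' on your system first):

    # count page-level alloc/free events system-wide for 10 seconds
    perf stat -e kmem:mm_page_alloc -e kmem:mm_page_free -a sleep 10

    # record the same events with call graphs, so you can see who did it
    perf record -e kmem:mm_page_alloc -g -a sleep 10
    perf report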

Of course, which of these tools to use depends on what data you need to answer
the question - but simply knowing what physical address was involved in a page
fault is almost certainly not going to be sufficient.

> I want to perform some type of analysis at run time, e.g., no. of unique
> addresses, total no. of addresses, frequency of occurrence of each address,
> etc.

So what "some type of analysis" are you trying to do? What question(s)
are you trying to answer? 

The number of unique physical addresses in your system is dictated by how much
RAM you have installed. Similarly for total number of addresses, although I'm
not sure why you list both - that would mean that there is some number of
non-unique addresses.  What would that even mean?
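
(For example: a machine with 16 gigabytes of RAM and 4K pages has exactly
16*2^30 / 4096 = 4,194,304 physical page frames - that's fixed by the
hardware, not by anything your workload does.)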

The number of pages actually available for paging and caching depends on other
things as well - the architecture of the system, how much RAM (if any) is
reserved for use by your video card, the size of the kernel, the size of loaded
modules, space taken up by kmalloc allocations, page tables, whether any
processes have called mlock() on a large chunk of space, whether the pages are
locked by the kernel because there's I/O going on, and then there's things like
mmap(), and so on.

The kernel provides /proc/meminfo and /proc/slabinfo - you're going to want
to understand all that stuff before you can make sense of anything.
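
For instance, as a first look (field names vary a bit between kernel versions):

    # headline totals, in kB
    grep -E 'MemTotal|MemFree|MemAvailable|Cached|Slab' /proc/meminfo

    # per-cache slab usage (usually needs root)
    sudo head -20 /proc/slabinfo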

Simply looking at the frequency of occurrences of each address is probably not
going to tell you much of anything, because you need to know things like
the total working and resident set sizes for the process and other context.
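
(The counting itself is the easy part - assuming a hypothetical log file with
one address per line, something like

    sort addrs.log | uniq -c | sort -rn | head

gets you the per-address frequencies. The hard part is that those numbers
mean little without the context above.)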

For example - you do the analysis, and find that there are 8 gigabytes of pages
that are constantly being re-used.  But that doesn't tell you if there are two
processes that are thrashing against each other because each is doing heavy
repeated referencing of 6 gigabytes of data, or if one process is wildly referencing
many pages because some programmer has a multi-dimensional array and is
walking across the array with the indices in the wrong order:

i_max = 4095; j_max = 4095;
for (i = 0; i < i_max; i++)
    for (j = 0; j < j_max; j++)
        sum += foo[i][j];   /* row-major: the inner index walks adjacent memory */

If somebody is doing foo[j][i] instead, things can get ugly.  And if you're
mixing with Fortran code, where the semantics of array references are reversed
and you *want* to use 'foo[j][i]' for efficient memory access, it's a bullet loaded
in the chamber and waiting for somebody to pull the trigger.
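
To make that concrete - same loops as above, but with the indices swapped on
the access:

for (i = 0; i < i_max; i++)
    for (j = 0; j < j_max; j++)
        sum += foo[j][i];   /* inner index now strides a full row per step */

Nearly every access lands on a different cache line (and, for a big enough
array, a different page), and the hardware prefetcher can't save you.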

Not that I've ever seen *that* particular error happen with a programmer
processing 2 terabytes of arrays on a machine that only had 1.5 terabytes of
RAM.  But I did tease the person involved about it, because they *really*
should have known better. :)

So again:  What question(s) are you trying to get answers to?


