* [fuse] Getting visibility into reads from page cache
@ 2020-04-25 17:06 Nikolaus Rath
  2020-04-27  9:26 ` Miklos Szeredi
  0 siblings, 1 reply; 5+ messages in thread
From: Nikolaus Rath @ 2020-04-25 17:06 UTC
  To: linux-fsdevel, fuse-devel

Hello,

For debugging purposes, I would like to get information about read
requests for FUSE filesystems that are answered from the page cache
(i.e., that never make it to the FUSE userspace daemon).

What would be the easiest way to accomplish that?

For now I'd be happy with seeing regular reads and knowing when an
application uses mmap (so that I know that I might be missing reads).


Not having done any real kernel-level work, I would start by looking
into using some tracing framework to hook into the relevant kernel
function. However, I thought I'd ask here first to make sure that I'm
not heading in completely the wrong direction.


Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«


* Re: [fuse] Getting visibility into reads from page cache
  2020-04-25 17:06 [fuse] Getting visibility into reads from page cache Nikolaus Rath
@ 2020-04-27  9:26 ` Miklos Szeredi
  2020-05-08 15:28   ` [fuse-devel] " Nikolaus Rath
  0 siblings, 1 reply; 5+ messages in thread
From: Miklos Szeredi @ 2020-04-27  9:26 UTC
  To: linux-fsdevel, fuse-devel

On Sat, Apr 25, 2020 at 7:07 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Hello,
>
> For debugging purposes, I would like to get information about read
> requests for FUSE filesystems that are answered from the page cache
> (i.e., that never make it to the FUSE userspace daemon).
>
> What would be the easiest way to accomplish that?
>
> For now I'd be happy with seeing regular reads and knowing when an
> application uses mmap (so that I know that I might be missing reads).
>
>
> Not having done any real kernel-level work, I would start by looking
> into using some tracing framework to hook into the relevant kernel
> function. However, I thought I'd ask here first to make sure that I'm
> not heading in completely the wrong direction.

bpftrace is a nice high-level tracing tool.

E.g.

  sudo bpftrace -e 'kretprobe:fuse_file_read_iter { printf("fuse read: %d\n", retval); }'

Thanks,
Miklos


* Re: [fuse-devel] [fuse] Getting visibility into reads from page cache
  2020-04-27  9:26 ` Miklos Szeredi
@ 2020-05-08 15:28   ` Nikolaus Rath
  2020-05-08 17:04     ` Nikolaus Rath
  2020-05-11 12:12     ` Miklos Szeredi
  0 siblings, 2 replies; 5+ messages in thread
From: Nikolaus Rath @ 2020-05-08 15:28 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On Apr 27 2020, Miklos Szeredi <miklos@szeredi.hu> wrote:
> bpftrace is a nice high-level tracing tool.
>
> E.g.
>
>   sudo bpftrace -e 'kretprobe:fuse_file_read_iter { printf("fuse read: %d\n", retval); }'

Thanks, this looks great! I had to do some reading about bpftrace first,
but I think this is exactly what I'm looking for. A few more questions:


- If I attach a probe to fuse_file_mmap, will this tell me whenever an
  application attempts to mmap() a FUSE file?

- I believe that ((struct kiocb*)arg0)->ki_pos will give me the offset
  within the file, but where can I see how much data is being read?

- What is the best way to connect read requests to a specific FUSE
  filesystem (if more than one is mounted)? I found the superblock in
  ((struct kiocb*)arg0)->ki_filp->f_mapping->host->i_sb->s_fs_info, but I
  do not see anything in this structure that I could map to a similar
  value that FUSE userspace has access to...

- I assume fuse_file_read_iter is called for every read request for FUSE
  filesystems unless it's an mmap'ed access. Is that right?

- Is there any similar way to catch access to an mmap'ed file? I think
  there is probably a way to make sure that every memory read triggers a
  page fault and then hook into the fault handler, but I am not sure how
  difficult this is to do and how much performance this would cost....

- If my BPF program contains e.g. a printf statement, will execution of
  the kernel function block until the printf has completed, or is there
  some queuing mechanism?

Thanks for your help!


Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Getting visibility into reads from page cache
  2020-05-08 15:28   ` [fuse-devel] " Nikolaus Rath
@ 2020-05-08 17:04     ` Nikolaus Rath
  2020-05-11 12:12     ` Miklos Szeredi
  1 sibling, 0 replies; 5+ messages in thread
From: Nikolaus Rath @ 2020-05-08 17:04 UTC
  To: Miklos Szeredi; +Cc: linux-fsdevel, fuse-devel

On May 08 2020, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>
>>   sudo bpftrace -e 'kretprobe:fuse_file_read_iter { printf("fuse read: %d\n", retval); }'
>
>
> - I believe that ((struct kiocb*)arg0)->ki_pos will give me the offset
>   within the file, but where can I see how much data is being read?

Looking at the code in fuse_file_read_iter, it seems the length is in
((struct iov_iter*)arg1)->count, but I do not really understand why.

The definition of this parameter is:

struct iov_iter {
	int type;
	const struct iovec *iov;
	unsigned long nr_segs;
	size_t iov_offset;
	size_t count;
};

..so I would think that *count* is the number of `iovec` elements hiding
behind the `iov` pointer, not some total number of bytes.

Furthermore, there is a function iov_length() that is documented to
return the "total number of bytes covered by an iovec" and doesn't look
at `count` at all.

Can someone elucidate why the number of bytes to be read from the file
is in iov_iter.count?


Best,
-Nikolaus



* Re: [fuse-devel] [fuse] Getting visibility into reads from page cache
  2020-05-08 15:28   ` [fuse-devel] " Nikolaus Rath
  2020-05-08 17:04     ` Nikolaus Rath
@ 2020-05-11 12:12     ` Miklos Szeredi
  1 sibling, 0 replies; 5+ messages in thread
From: Miklos Szeredi @ 2020-05-11 12:12 UTC
  To: Miklos Szeredi, linux-fsdevel, fuse-devel

On Fri, May 8, 2020 at 5:29 PM Nikolaus Rath <Nikolaus@rath.org> wrote:
>
> Thanks, this looks great! I had to do some reading about bpftrace first,
> but I think this is exactly what I'm looking for. A few more questions:
>
>
> - If I attach a probe to fuse_file_mmap, will this tell me whenever an
>   application attempts to mmap() a FUSE file?

Yes.

> - I believe that ((struct kiocb*)arg0)->ki_pos will give me the offset
>   within the file, but where can I see how much data is being read?
>
> Looking at the code in fuse_file_read_iter, it seems the length is in
> ((struct iov_iter*)arg1)->count, but I do not really understand why.

That's correct.

> The definition of this parameter is:
>
> struct iov_iter {
>         int type;
>         const struct iovec *iov;
>         unsigned long nr_segs;
>         size_t iov_offset;
>         size_t count;
> };
>
> ..so I would think that *count* is the number of `iovec` elements hiding
> behind the `iov` pointer, not some total number of bytes.

That's nr_segs.

> Furthermore, there is a function iov_length() that is documented to
> return the "total number of bytes covered by an iovec" and doesn't look
> at `count` at all.

iov_iter_count() is the accessor function that does this.
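
To illustrate the difference between nr_segs and count, here is a
minimal userspace sketch in C. It is not kernel code: struct
iov_iter_model and the model_* helpers are hypothetical stand-ins that
keep only the fields quoted above. The sketch assumes the iterator is
set up with the total byte count passed in by the caller, which is how
the kernel's iov_iter_init() receives it.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-in for the kernel's struct iov_iter,
 * reduced to the fields quoted in this thread. */
struct iov_iter_model {
	const struct iovec *iov;	/* array of segments */
	unsigned long nr_segs;		/* number of elements in iov[] */
	size_t iov_offset;		/* bytes already consumed from iov[0] */
	size_t count;			/* total bytes remaining, over all segments */
};

/* Sum iov_len over all segments, in the spirit of the kernel's iov_length(). */
size_t model_iov_length(const struct iovec *iov, unsigned long nr_segs)
{
	size_t total = 0;

	for (unsigned long seg = 0; seg < nr_segs; seg++)
		total += iov[seg].iov_len;
	return total;
}

/* Initialise the model: the caller supplies the total byte count. */
void model_iov_iter_init(struct iov_iter_model *i, const struct iovec *iov,
			 unsigned long nr_segs, size_t count)
{
	i->iov = iov;
	i->nr_segs = nr_segs;
	i->iov_offset = 0;
	i->count = count;
}

/* Accessor in the spirit of the kernel's iov_iter_count(). */
size_t model_iov_iter_count(const struct iov_iter_model *i)
{
	return i->count;
}
```

With two segments of 100 and 28 bytes, nr_segs is 2 while the count
accessor reports 128 bytes, which is the value a read probe would want.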

> - What is the best way to connect read requests to a specific FUSE
>   filesystem (if more than one is mounted)? I found the superblock in
>   ((struct kiocb*)arg0)->ki_filp->f_mapping->host->i_sb->s_fs_info, but I
>   do not see anything in this structure that I could map to a similar
>   value that FUSE userspace has access to...

You can match up ki_filp->f_inode->i_sb->s_dev with st_dev on any
file.  I think the kernel encodes the device value differently, but
the bits should be there.
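
For matching the two values, a small C helper can convert a userspace
st_dev into the kernel-internal layout. This is a sketch under the
assumption that s_dev uses the kernel's MKDEV() layout with 20 minor
bits (MINORBITS in include/linux/kdev_t.h). The name
kernel_dev_from_st_dev is hypothetical and not an existing API.

```c
#include <stdint.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Assumed kernel-internal layout: major in the upper bits, minor in the
 * lower 20 bits (MINORBITS in include/linux/kdev_t.h). */
#define KERNEL_MINORBITS 20

/* Re-encode a userspace st_dev into the assumed kernel-internal form so
 * that it can be compared with the s_dev value seen from a probe. */
uint32_t kernel_dev_from_st_dev(dev_t st_dev)
{
	uint32_t ma = (uint32_t)major(st_dev);	/* decode glibc's encoding */
	uint32_t mi = (uint32_t)minor(st_dev);

	return (ma << KERNEL_MINORBITS) | mi;
}
```

Calling stat() on any file below the mount point and passing st.st_dev
to this helper should give a number that can be compared with the s_dev
value printed from the probe.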

> - I assume fuse_file_read_iter is called for every read request for FUSE
>   filesystems unless it's an mmap'ed access. Is that right?

Correct.

> - Is there any similar way to catch access to an mmap'ed file? I think
>   there is probably a way to make sure that every memory read triggers a
>   page fault and then hook into the fault handler, but I am not sure how
>   difficult this is to do and how much performance this would cost....

Not sure if that's implementable, but it would surely be grossly
inefficient.  Flushing page tables e.g. every second would probably
work, but then you'd only get the read pattern on a one second
granularity.

> - If my BPF program contains e.g. a printf statement, will execution of
>   the kernel function block until the printf has completed, or is there
>   some queuing mechanism?

AFAIK there's some queuing.

Thanks,
Miklos
