* RE: zero copy issue while receiving the data (counter part of sendfile)
@ 2004-12-21 16:35 ` Rajat  Jain, Noida
From: Rajat Jain, Noida @ 2004-12-21 16:35 UTC
  To: dima, Rajat  Jain, Noida
  Cc: linux-newbie, linux-net, linux-kernel, kernelnewbies,
	Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida


Okay,

As per my understanding ....

a) pre-fill and use "struct iovec" with sock_recvmsg()

Using this option, data will first be copied from the NIC's buffer to
sk_buffs (which are allocated in the NIC's device driver via
dev_alloc_skb()). Then, during tcp_recvmsg(), the SAME data will be copied
from the sk_buffs to the iovecs that I pass to sock_recvmsg(). It is
this very copying that I'm trying to avoid.


b) intercept socket's receive callback with tcp_read_sock() and use
skb_copy_bits() to copy data from skb to your destination buffer.

With this option too, data will first be copied from the NIC's
buffer to an sk_buff, and this is something that cannot be avoided. However,
if I use skb_copy_bits(), the data (as you already said) will AGAIN be
copied from the sk_buff to my destination buffer.
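
To make option (b) concrete, here is a rough userspace model of the callback
contract that tcp_read_sock() appears to follow in the code quoted further
down: the loop hands each buffer to an actor, the actor copies what it wants
into the destination carried by the descriptor, and the loop stops once the
descriptor's count reaches zero. All mock_* names and copy_actor are made up
for illustration. This is not kernel code, and it leaves out sequence
numbers, urgent data, FIN handling and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_skb {
    const char *data;   /* payload bytes */
    size_t len;         /* payload length */
};

struct mock_desc {
    size_t written;     /* bytes delivered so far */
    size_t count;       /* bytes the caller still wants */
    void *data;         /* destination buffer (plays the role of arg.data) */
    int error;
};

typedef size_t (*mock_actor_t)(struct mock_desc *desc,
                               const struct mock_skb *skb,
                               size_t offset, size_t len);

/* Actor that copies up to desc->count bytes into the destination,
 * roughly what an skb_copy_bits()-based actor would do. */
static size_t copy_actor(struct mock_desc *desc, const struct mock_skb *skb,
                         size_t offset, size_t len)
{
    size_t n = len < desc->count ? len : desc->count;

    memcpy((char *)desc->data + desc->written, skb->data + offset, n);
    desc->written += n;
    desc->count -= n;
    return n;
}

/* Simplified read loop: walk the queue, let the actor consume each buffer,
 * stop when the actor stops early or the descriptor is satisfied. */
static size_t mock_read_sock(const struct mock_skb *queue, size_t nskb,
                             struct mock_desc *desc, mock_actor_t actor)
{
    size_t copied = 0;

    for (size_t i = 0; i < nskb; i++) {
        size_t used = actor(desc, &queue[i], 0, queue[i].len);

        assert(used <= queue[i].len);
        copied += used;
        if (used != queue[i].len)
            break;      /* actor did not take the whole buffer */
        /* The quoted kernel code releases the skb at this point. */
        if (!desc->count)
            break;      /* caller has everything it asked for */
    }
    return copied;
}
```

The point of the model is that the destination never travels as a "pointer to
pointer": it rides inside the descriptor, and the actor decides what to do
with each (buffer, offset, len) triple it is given.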

My question is: if I'm developing a module (i.e. executing in
kernel space), can't I use the buffers from the sk_buff directly, instead of
copying them to a destination buffer? That way, we could implement
functionality similar to sendpage.
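
One catch with using the sk_buff data directly is visible in the
tcp_read_sock() code quoted below: the skb is released with sk_eat_skb()
right after the actor returns, so a bare pointer saved by the actor would not
stay valid. A sketch of the alternative, again as a userspace model with
hypothetical names (mock_buf, mock_ref, ref_actor): the actor keeps a counted
reference to the buffer instead of copying out of it. Whether the 2.6.8
receive path permits an equivalent, and which kernel helper would be the
right one, is an assumption this sketch does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted buffer standing in for an sk_buff. */
struct mock_buf {
    int users;          /* number of holders */
    size_t len;
    char data[32];
};

static struct mock_buf *mock_buf_new(const char *s)
{
    struct mock_buf *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->users = 1;       /* the receive queue's reference */
    b->len = strlen(s);
    assert(b->len < sizeof(b->data));
    memcpy(b->data, s, b->len + 1);
    return b;
}

static void mock_buf_get(struct mock_buf *b)
{
    b->users++;
}

/* Drop one reference; returns 1 if the buffer was actually freed. */
static int mock_buf_put(struct mock_buf *b)
{
    if (--b->users == 0) {
        free(b);
        return 1;
    }
    return 0;
}

/* What the actor hands back instead of copied bytes. */
struct mock_ref {
    struct mock_buf *buf;
    size_t offset;
    size_t len;
};

/* Actor that records where the data lives and pins the buffer. */
static size_t ref_actor(struct mock_ref *out, struct mock_buf *b,
                        size_t offset, size_t len)
{
    mock_buf_get(b);    /* keep the buffer alive past the read loop */
    out->buf = b;
    out->offset = offset;
    out->len = len;
    return len;
}
```

In this model the consumer must call mock_buf_put() when it is done with the
data; forgetting to do so pins receive memory, which is the usual price of
avoiding the copy.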

Any experience / ideas are welcome.

Thanks & Regards,

Rajat



-----Original Message-----
From: Dmitry Yusupov [mailto:dima@s2io.com] 
Sent: Friday, December 17, 2004 11:01 PM
To: Rajat Jain, Noida
Cc: linux-newbie@vger.kernel.org; linux-net@vger.kernel.org;
linux-kernel@vger.kernel.org; kernelnewbies@nl.linux.org; Sanjay Kumar,
Noida; Deepak Kumar Gupta, Noida
Subject: RE: zero copy issue while receiving the data (counter part of
sendfile)

On Fri, 2004-12-17 at 21:54 +0530, Rajat Jain, Noida wrote:
>  
> Hi,
> 
> Thanks for the reply.
> 
> Actually I am developing a loadable kernel module. I agree that at the 
> bare minimum, I need to copy from the NIC's device buffer to kernel's 
> allocated sk_buff (socket buffer). What I want is to avoid FURTHER 
> copying of data from the sk_buffs to the buffers allocated by the module.

Looks like you have two options:

a) pre-fill and use "struct iovec" with sock_recvmsg()

b) intercept socket's receive callback with tcp_read_sock() and use
skb_copy_bits() to copy data from skb to your destination buffer.

Regards,
Dima

> 
> And hence I expected to pass the address of a buffer pointer to 
> tcp_read_sock(). And I expected this function to set it to socket buffer.
> Any pointers on the functionality of tcp_read_sock()??
> 
> Rajat
> 
> 
> -----Original Message-----
> From: Dmitry Yusupov [mailto:dima@s2io.com]
> Sent: Friday, December 17, 2004 7:07 AM
> To: Rajat Jain, Noida
> Cc: linux-net@vger.kernel.org; Sanjay Kumar, Noida; Deepak Kumar 
> Gupta, Noida
> Subject: Re: zero copy issue while receiving the data (counter part of 
> sendfile)
> 
> Hi Rajat,
> 
> I used this function some time back and it worked just fine for me. 
> The kernel's RPC code (see the xprt* files) also uses it, so you might 
> want to take a look.
> 
> In general, it is not possible to avoid copying entirely. You need to 
> at least copy the data from the NIC's skb to the destination, which 
> might be a user buffer or a kernel buffer (depending on the application).
> 
> Regards,
> Dmitry
> 
> 
> On Thu, 2004-12-16 at 19:38 +0530, Rajat Jain, Noida wrote:
> >  
> > Hi,
> > 
> > I'm experimenting on stock kernel 2.6.8
> > 
> > I was looking for an interface that could directly receive data from 
> > a network socket WITHOUT copying from kernel space to user space 
> > (just as, for sending data, "sendfile" lets you send data to a network 
> > socket without copying it through user space). I came across the 
> > tcp_read_sock() interface in net/ipv4/tcp.c.
> > 
> > Has anybody tried tcp_read_sock()? Are there any known issues with it? 
> > If somebody has some ideas, I would appreciate it if you could share them.
> > 
> > I might be wrong, but my understanding is that I pass a pointer 
> > to this function and, when the function returns, I expect it to be 
> > set to the kernel buffer (corresponding to the socket).
> > 
> > 1) To achieve this, I would expect to pass a pointer to a pointer; 
> > only then can it be done. (If we have to modify a pointer's value, 
> > we have to pass its address, right?) However, this function 
> > expects a char *buf (in the read_descriptor_t argument). Any ideas?
> > 
> > 2) This code also frees the space allocated to the sk_buffs, using 
> > sk_eat_skb(sk, skb) and cleanup_rbuf(sk, copied). But this 
> > function is supposed to return these locations to the calling code, 
> > right?
> > 
> > Any pointers are more than welcome. I have provided the code for 
> > reference. Please cc the reply to me as I'm not on the list.
> > 
> > Thanks & regards,
> > 
> > Rajat Jain
> > 
> > ----------------------------------------------------------------------
> > /* net/ipv4/tcp.c
> >  * This routine provides an alternative to tcp_recvmsg() for routines
> >  * that would like to handle copying from skbuffs directly in 'sendfile'
> >  * fashion.
> >  * Note:
> >  *      - It is assumed that the socket was locked by the caller.
> >  *      - The routine does not block.
> >  *      - At present, there is no support for reading OOB data
> >  *        or for 'peeking' the socket using this routine
> >  *        (although both would be easy to implement).
> >  */
> > int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
> >                   sk_read_actor_t recv_actor) {
> >         struct sk_buff *skb;
> >         struct tcp_opt *tp = tcp_sk(sk);
> >         u32 seq = tp->copied_seq;
> >         u32 offset;
> >         int copied = 0;
> > 
> >         if (sk->sk_state == TCP_LISTEN)
> >                 return -ENOTCONN;
> >         while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
> >                 if (offset < skb->len) {
> >                         size_t used, len;
> > 
> >                         len = skb->len - offset;
> >                         /* Stop reading if we hit a patch of urgent data */
> >                         if (tp->urg_data) {
> >                                 u32 urg_offset = tp->urg_seq - seq;
> >                                 if (urg_offset < len)
> >                                         len = urg_offset;
> >                                 if (!len)
> >                                         break;
> >                         }
> >                         used = recv_actor(desc, skb, offset, len);
> >                         if (used <= len) {
> >                                 seq += used;
> >                                 copied += used;
> >                                 offset += used;
> >                         }
> >                         if (offset != skb->len)
> >                                 break;
> >                 }
> >                 if (skb->h.th->fin) {
> >                         sk_eat_skb(sk, skb);
> >                         ++seq;
> >                         break;
> >                 }
> >                 sk_eat_skb(sk, skb);
> >                 if (!desc->count)
> >                         break;
> >         }
> >         tp->copied_seq = seq;
> > 
> >         tcp_rcv_space_adjust(sk);
> > 
> >         /* Clean up data we have read: This will do ACK frames. */
> >         if (copied)
> >                 cleanup_rbuf(sk, copied);
> >         return copied;
> > }
> > ----------------------------------------------------------------------
> > 
> > read_descriptor_t is defined as:
> > 
> > /*
> >  * include/linux/fs.h
> >  */
> > typedef struct {
> >         size_t written;
> >         size_t count;
> >         union {
> >                 char __user * buf;
> >                 void *data;
> >         } arg;
> >         int error;
> > } read_descriptor_t;
> > ----------------------------------------------------------------------


* RE: zero copy issue while receiving the data (counter part of sendfile)
  2004-12-21 16:35 ` Rajat  Jain, Noida
@ 2004-12-21 17:22   ` Dmitry Yusupov
From: Dmitry Yusupov @ 2004-12-21 17:22 UTC
  To: Rajat  Jain, Noida
  Cc: linux-newbie, linux-net, linux-kernel, kernelnewbies,
	Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida

Rajat,

A small correction: if the NIC supports DMA on receive, then no
extra copy is required. Therefore sock_recvmsg() and
tcp_read_sock()/skb_copy_bits() provide "zero-copy" access to the SKB.
But unfortunately you still have to copy the data to your destination
buffer. This is unavoidable with TCP. RDMA/MPI will resolve this problem.

Regards,
Dima




* RE: zero copy issue while receiving the data (counter part of sen dfil e)
@ 2004-12-21 17:22   ` Dmitry Yusupov
  0 siblings, 0 replies; 17+ messages in thread
From: Dmitry Yusupov @ 2004-12-21 17:22 UTC (permalink / raw)
  To: Rajat  Jain, Noida
  Cc: linux-newbie, linux-net, linux-kernel, kernelnewbies,
	Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida

Rajat,

small correction, if NIC supports DMA operation on receive, than no
extra copy required. Therefore sock_recvmsg() and tcp_read_sock
()/skb_copy_bits() provides "zero-copy" access to SKB. But unfortunately
you still have to copy data to your destination buffer. This is
unavoidable with TCP. RDMA/MPI will resolve this problem.

Regards,
Dima


On Tue, 2004-12-21 at 22:05 +0530, Rajat Jain, Noida wrote:
> Okay,
> 
> As per my understanding ....
> 
> a) pre-fill and use "struct iovec" with sock_recvmsg()
> 
> Using this option, data will first be copied from the NIC's buffer to
> sk_buff (which are allocated in the NIC's device driver via the
> dev_alloc_skb(). And then during tcp_recvmsg(), the SAME data will be copied
> from sk_buff to the iovecs that I pass to sock_recvmsg(). But actually it is
> this very copying that I'm trying to avoid.
> 
> 
> b) intercept socket's receive callback with tcp_read_sock() and use
> skb_copy_bits() to copy data from skb to your destination buffer.
> 
> Again in this option as well, data will first be copied from the NIC's
> buffer to sk_buff. And this is some thing that cannot be avoided. However,
> if I use skb_copy_bits(), the data (as you said already) will be AGAIN
> copied from the sk_buff to my destination buffer. 
> 
> My question is that if I'm developing a module (i.e. if I'm executing in the
> kernel space), can't I directly use the buffers from sk_buff ... Instead of
> copying them to a destination buffer. This way, we can implement a
> functionality similar to send page. 
> 
> Any experience / ideas are welcome.
> 
> Thanks & Regards,
> 
> Rajat
> 
> 
> 
> -----Original Message-----
> From: Dmitry Yusupov [mailto:dima@s2io.com] 
> Sent: Friday, December 17, 2004 11:01 PM
> To: Rajat Jain, Noida
> Cc: linux-newbie@vger.kernel.org; linux-net@vger.kernel.org;
> linux-kernel@vger.kernel.org; kernelnewbies@nl.linux.org; Sanjay Kumar,
> Noida; Deepak Kumar Gupta, Noida
> Subject: RE: zero copy issue while receiving the data (counter part of sen
> dfil e)
> 
> On Fri, 2004-12-17 at 21:54 +0530, Rajat Jain, Noida wrote:
> >  
> > Hi,
> > 
> > Thanks for the reply.
> > 
> > Actually I am developing a loadable kernel module. I agree that at the 
> > bare minimum, I need to copy from the NIC's device buffer to kernel's 
> > allocated sk_buff (socket buffer). What I want is to avoid FURTHER 
> > coying of data from the sk_buffs to the buffers allocated by the module.
> 
> Looks like you have two options:
> 
> a) pre-fill and use "struct iovec" with sock_recvmsg()
> 
> b) intercept socket's receive callback with tcp_read_sock() and use
> skb_copy_bits() to copy data from skb to your destination buffer.
> 
> Regards,
> Dima
> 
> > 
> > And hence I expected to pass the address of a buffer pointer to 
> > tcp_read_sock(). And I expected this function to set it to socket buffer.
> > Any pointers on the functionality of tcp_read_sock()??
> > 
> > Rajat
> > 
> > 
> > -----Original Message-----
> > From: Dmitry Yusupov [mailto:dima@s2io.com]
> > Sent: Friday, December 17, 2004 7:07 AM
> > To: Rajat Jain, Noida
> > Cc: linux-net@vger.kernel.org; Sanjay Kumar, Noida; Deepak Kumar 
> > Gupta, Noida
> > Subject: Re: zero copy issue while receiving the data (counter part of 
> > sendfil e)
> > 
> > Hi Rajat,
> > 
> > I was using this function some times back... It's been working for me 
> > just fine. Also kernel's RPC (see xprt* files) uses it. So you might 
> > want to take a look.
> > 
> > In general, it is not possible to fully avoid copying. You need at 
> > least copy data from NIC's skb to the destination. It might be user 
> > buffer or kernel buffer(depends on application).
> > 
> > Regards,
> > Dmitry
> > 
> > 
> > On Thu, 2004-12-16 at 19:38 +0530, Rajat Jain, Noida wrote:
> > >  
> > > Hi,
> > > 
> > > I'm experimenting on stock kernel 2.6.8
> > > 
> > > I was looking for an interface that could directly receive data from 
> > > a network socket, WITHOUT coying from kernel space to user space. 
> > > (Like for sending data, "sendfile" provides to send data to network 
> > > socket without copying it to kernel space). I came across 
> > > tcp_read_sock() interface in net/ipv4/tcp.c.
> > > 
> > > Has anybody tried tcp_read_sock()?? Is there any known issue with it 
> > > ?? If somebody has some idea, I would appreciate if you can share.
> > > 
> > > I might be wrong, but what I perceive is that I will pass a pointer 
> > > to this function. And when the function returns, I expect it to be 
> > > set to the kernel buffer (corresponding to socket).
> > > 
> > > 1) To fulfill this objective, I expect to pass a pointer to pointer 
> > > & only then it can be done. (If we have to modify a pointer's value, 
> > > we have to pass its address ... Right??). However, this function 
> > > expects a char * buf (in read_descriptor_t argument). Any ideas?
> > > 
> > > 2) This code also frees the space allocated to sk_buffs etc using 
> > > sk_eat_skb(sk, skb) and cleanup_rbuf(sk, copied) etc. But this 
> > > function is supposed to return these locations to the calling code ... Right?
> > > 
> > > Any pointers are more than welcome. I have provided the code for reference.
> > > Please cc the reply to me as I'm not on the list.
> > > 
> > > Thanks & regards,
> > > 
> > > Rajat Jain
> > > 
> > > ----------------------------------------------------------------------
> > > /* net/ipv4/tcp.c
> > >  * This routine provides an alternative to tcp_recvmsg() for routines
> > >  * that would like to handle copying from skbuffs directly in 'sendfile'
> > >  * fashion.
> > >  * Note:
> > >  *      - It is assumed that the socket was locked by the caller.
> > >  *      - The routine does not block.
> > >  *      - At present, there is no support for reading OOB data
> > >  *        or for 'peeking' the socket using this routine
> > >  *        (although both would be easy to implement).
> > >  */
> > > int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
> > >                   sk_read_actor_t recv_actor) {
> > >         struct sk_buff *skb;
> > >         struct tcp_opt *tp = tcp_sk(sk);
> > >         u32 seq = tp->copied_seq;
> > >         u32 offset;
> > >         int copied = 0;
> > > 
> > >         if (sk->sk_state == TCP_LISTEN)
> > >                 return -ENOTCONN;
> > >         while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
> > >                 if (offset < skb->len) {
> > >                         size_t used, len;
> > > 
> > >                         len = skb->len - offset;
> > >                         /* Stop reading if we hit a patch of urgent data */
> > >                         if (tp->urg_data) {
> > >                                 u32 urg_offset = tp->urg_seq - seq;
> > >                                 if (urg_offset < len)
> > >                                         len = urg_offset;
> > >                                 if (!len)
> > >                                         break;
> > >                         }
> > >                         used = recv_actor(desc, skb, offset, len);
> > >                         if (used <= len) {
> > >                                 seq += used;
> > >                                 copied += used;
> > >                                 offset += used;
> > >                         }
> > >                         if (offset != skb->len)
> > >                                 break;
> > >                 }
> > >                 if (skb->h.th->fin) {
> > >                         sk_eat_skb(sk, skb);
> > >                         ++seq;
> > >                         break;
> > >                 }
> > >                 sk_eat_skb(sk, skb);
> > >                 if (!desc->count)
> > >                         break;
> > >         }
> > >         tp->copied_seq = seq;
> > > 
> > >         tcp_rcv_space_adjust(sk);
> > > 
> > >         /* Clean up data we have read: This will do ACK frames. */
> > >         if (copied)
> > >                 cleanup_rbuf(sk, copied);
> > >         return copied;
> > > }
> > > ----------------------------------------------------------------------
> > > 
> > > read_descriptor_t is defined as:
> > > 
> > > /*
> > >  * include/linux/fs.h
> > >  */
> > > typedef struct {
> > >         size_t written;
> > >         size_t count;
> > >         union {
> > >                 char __user * buf;
> > >                 void *data;
> > >         } arg;
> > >         int error;
> > > } read_descriptor_t;
> > > ----------------------------------------------------------------------
> > > 
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-net" 
> > > in the body of a message to majordomo@vger.kernel.org More majordomo 
> > > info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: zero copy issue while receiving the data (counter part of sen dfil e)
  2004-12-21 17:22   ` Dmitry Yusupov
@ 2004-12-21 19:30     ` Jeff Garzik
  -1 siblings, 0 replies; 17+ messages in thread
From: Jeff Garzik @ 2004-12-21 19:30 UTC (permalink / raw)
  To: dima
  Cc: Rajat Jain, Noida, linux-newbie, linux-net, linux-kernel,
	kernelnewbies, Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida

Dmitry Yusupov wrote:
> Rajat,
> 
> small correction: if the NIC supports DMA on receive, then no
> extra copy is required. Therefore sock_recvmsg() and tcp_read_sock

large correction:  if NIC supports _checksum_ on receive, then no extra 
copy is required.

	Jeff



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: zero copy issue while receiving the data (counter part of sen dfil e)
  2004-12-21 19:30     ` Jeff Garzik
@ 2004-12-21 19:43       ` Dmitry Yusupov
  -1 siblings, 0 replies; 17+ messages in thread
From: Dmitry Yusupov @ 2004-12-21 19:43 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Rajat Jain, Noida, linux-newbie, linux-net, linux-kernel,
	kernelnewbies, Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida

indeed :)
In other words, if you have a modern NIC then you get "zero-copy" (except
copy_to_user()) for free :)

Regards,
Dima

On Tue, 2004-12-21 at 14:30 -0500, Jeff Garzik wrote:
> Dmitry Yusupov wrote:
> > Rajat,
> > 
> > small correction, if NIC supports DMA operation on receive, than no
> > extra copy required. Therefore sock_recvmsg() and tcp_read_sock
> 
> large correction:  if NIC supports _checksum_ on receive, then no extra 
> copy is required.
> 
> 	Jeff
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: zero copy issue while receiving the data (counter part of sendfil e)
  2004-12-21 19:43       ` Dmitry Yusupov
@ 2004-12-22  8:21         ` Mandeep Sandhu
  -1 siblings, 0 replies; 17+ messages in thread
From: Mandeep Sandhu @ 2004-12-22  8:21 UTC (permalink / raw)
  To: dima
  Cc: Jeff Garzik, Rajat Jain, Noida, linux-newbie, linux-net,
	linux-kernel, kernelnewbies, Sanjay Kumar, Noida,
	Deepak Kumar Gupta, Noida

On Wed, 2004-12-22 at 01:13, Dmitry Yusupov wrote:
> indeed :)
> another words if you have modern NIC than you get "zero-copy"(except
> copy_to_user()) for free :)
What does "checksum on rx" mean? Don't most NICs support DMA-ing to
memory when receiving a packet? So what does "zero-copy for free"
mean here?

thanx,
-mandeep
> 
> Regards,
> Dima
> 
> On Tue, 2004-12-21 at 14:30 -0500, Jeff Garzik wrote:
> > Dmitry Yusupov wrote:
> > > Rajat,
> > > 
> > > small correction, if NIC supports DMA operation on receive, than no
> > > extra copy required. Therefore sock_recvmsg() and tcp_read_sock
> > 
> > large correction:  if NIC supports _checksum_ on receive, then no extra 
> > copy is required.
> > 
> > 	Jeff
> > 
> 
> 
> --
> Kernelnewbies: Help each other learn about the Linux kernel.
> Archive:       http://mail.nl.linux.org/kernelnewbies/
> FAQ:           http://kernelnewbies.org/faq/
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: zero copy issue while receiving the data (counter part of sendfil e)
  2004-12-22  8:21         ` Mandeep Sandhu
  (?)
@ 2004-12-22 15:50         ` Martijn van Oosterhout
  2004-12-22 16:06             ` John W. Linville
  -1 siblings, 1 reply; 17+ messages in thread
From: Martijn van Oosterhout @ 2004-12-22 15:50 UTC (permalink / raw)
  To: Mandeep Sandhu
  Cc: dima, Jeff Garzik, linux-newbie, linux-net, linux-kernel, kernelnewbies

On Wed, Dec 22, 2004 at 01:51:58PM +0530, Mandeep Sandhu wrote:
> On Wed, 2004-12-22 at 01:13, Dmitry Yusupov wrote:
> > indeed :)
> > another words if you have modern NIC than you get "zero-copy"(except
> > copy_to_user()) for free :)
> what does "checksum on rx" mean??? Don't most of the NIC's support
> DMA-ing to mem on rx-ing a packet? so what does "zero-copy for free"
> mean here?

It means the network card verifies the checksums of packets as it
receives them. If it doesn't, the main CPU needs to read every byte in the
packet to calculate the checksum itself. If the CPU is doing that
anyway, you can copy the data elsewhere for free. 

Generally, reading from memory takes time because the CPU has to wait;
writing is free since it can be deferred in the cache (in theory
indefinitely) until there's a free cycle.

In other words, if the card isn't checksumming but does DMA you're not
really saving any time over a manual copy.
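
As a rough illustration of the point above, here is a minimal user-space
sketch (not kernel code) of a copy that computes the 16-bit Internet
checksum in the same pass over the data. The name copy_and_checksum() is
made up for this example; the kernel has its own architecture-optimized
checksum-and-copy routines, which are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: copy len bytes from src to
 * dst and return the 16-bit ones'-complement Internet checksum of the
 * data. Every source byte is read exactly once and is used both for the
 * copy and for the running sum.
 */
static uint16_t copy_and_checksum(void *dst, const void *src, size_t len)
{
    const uint8_t *s = src;
    uint8_t *d = dst;
    uint64_t sum = 0;   /* wide accumulator so long buffers cannot overflow */
    size_t i;

    /* Process the data as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2) {
        d[i] = s[i];
        d[i + 1] = s[i + 1];
        sum += ((uint32_t)s[i] << 8) | s[i + 1];
    }

    /* An odd trailing byte is padded with a zero byte on the right. */
    if (i < len) {
        d[i] = s[i];
        sum += (uint32_t)s[i] << 8;
    }

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the eight bytes 00 01 f2 03 f4 f5 f6 f7 this returns 0x220d and leaves
an identical copy in dst. The loads are shared between the copy and the
sum, which is why a CPU that has to checksum the packet anyway pays little
extra for the copy.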

Hope this helps,
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: zero copy issue while receiving the data (counter part of sendfil e)
  2004-12-22 15:50         ` Martijn van Oosterhout
@ 2004-12-22 16:06             ` John W. Linville
  0 siblings, 0 replies; 17+ messages in thread
From: John W. Linville @ 2004-12-22 16:06 UTC (permalink / raw)
  To: Martijn van Oosterhout
  Cc: Mandeep Sandhu, dima, Jeff Garzik, linux-newbie, linux-net,
	linux-kernel, kernelnewbies

On Wed, Dec 22, 2004 at 04:50:05PM +0100, Martijn van Oosterhout wrote:

> Generally, reading from memory takes time because the CPU has to wait,
> writing is free since it can be deferred in the cache (in theory
> indefinitely) until there's a free cycle.

I'm not sure I'd call that "free" -- executing the instructions for
the write has a non-zero cost.

Still, it is significantly cheaper than the read...

John
-- 
John W. Linville
linville@tuxdriver.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: zero copy issue while receiving the data (counter part of sen dfil e)
  2004-12-17 16:24 ` Rajat  Jain, Noida
@ 2004-12-17 17:31   ` Dmitry Yusupov
  -1 siblings, 0 replies; 17+ messages in thread
From: Dmitry Yusupov @ 2004-12-17 17:31 UTC (permalink / raw)
  To: Rajat  Jain, Noida
  Cc: linux-newbie, linux-net, linux-kernel, kernelnewbies,
	Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida

On Fri, 2004-12-17 at 21:54 +0530, Rajat Jain, Noida wrote:
>  
> Hi,
> 
> Thanks for the reply.
> 
> Actually I am developing a loadable kernel module. I agree that at the bare
> minimum, I need to copy from the NIC's device buffer to kernel's allocated
> sk_buff (socket buffer). What I want is to avoid FURTHER copying of data from
> the sk_buffs to the buffers allocated by the module. 

Looks like you have two options:

a) pre-fill and use "struct iovec" with sock_recvmsg()

b) intercept socket's receive callback with tcp_read_sock() and use 
skb_copy_bits() to copy data from skb to your destination buffer.
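
To illustrate option b), here is a minimal user-space model of the
callback contract, under stated assumptions. tcp_read_sock() does not hand
a buffer pointer back to the caller; it calls your actor once per queued
segment with (skb, offset, len), and the actor returns how many bytes it
consumed. The types mock_skb and mock_desc and the functions copy_actor()
and mock_read_sock() are hypothetical stand-ins so that the sketch compiles
outside the kernel; urgent data, FIN handling and sequence numbers are
left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel's types. */
struct mock_skb {
    const char *data;   /* payload of one queued segment */
    size_t len;
};

struct mock_desc {      /* loosely mirrors the quoted read_descriptor_t */
    size_t written;     /* bytes delivered so far */
    size_t count;       /* bytes the caller still wants */
    void *data;         /* destination buffer (arg.data in the kernel) */
    int error;
};

typedef int (*mock_actor_t)(struct mock_desc *, struct mock_skb *,
                            unsigned int, size_t);

/* Actor: copy up to desc->count bytes out of the segment. */
static int copy_actor(struct mock_desc *desc, struct mock_skb *skb,
                      unsigned int offset, size_t len)
{
    size_t n = len < desc->count ? len : desc->count;

    /* In a kernel actor, skb_copy_bits() would do this copy. */
    memcpy((char *)desc->data + desc->written, skb->data + offset, n);
    desc->written += n;
    desc->count -= n;
    return (int)n;
}

/* Simplified model of the tcp_read_sock() loop over queued segments. */
static int mock_read_sock(struct mock_skb *queue, size_t nskbs,
                          struct mock_desc *desc, mock_actor_t actor)
{
    int copied = 0;
    size_t i;

    for (i = 0; i < nskbs; i++) {
        size_t used = (size_t)actor(desc, &queue[i], 0, queue[i].len);

        copied += (int)used;
        if (used != queue[i].len)   /* segment only partly consumed: stop */
            break;
        if (!desc->count)           /* caller's request is satisfied */
            break;
    }
    return copied;
}
```

With three queued segments "hello ", "world" and "!!!" and desc.count set
to 8, the loop delivers "hello wo" and returns 8. The second segment is
only partly consumed, so the loop stops there, much as the quoted
tcp_read_sock() does when offset != skb->len.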

Regards,
Dima

> 
> And hence I expected to pass the address of a buffer pointer to
> tcp_read_sock(). And I expected this function to set it to socket buffer.
> Any pointers on the functionality of tcp_read_sock()??
> 
> Rajat
> 
> 
> -----Original Message-----
> From: Dmitry Yusupov [mailto:dima@s2io.com] 
> Sent: Friday, December 17, 2004 7:07 AM
> To: Rajat Jain, Noida
> Cc: linux-net@vger.kernel.org; Sanjay Kumar, Noida; Deepak Kumar Gupta,
> Noida
> Subject: Re: zero copy issue while receiving the data (counter part of
> sendfile)
> 
> Hi Rajat,
> 
> I was using this function some times back... It's been working for me just
> fine. Also kernel's RPC (see xprt* files) uses it. So you might want to take
> a look.
> 
> In general, it is not possible to fully avoid copying. You need at least
> copy data from NIC's skb to the destination. It might be user buffer or
> kernel buffer(depends on application).
> 
> Regards,
> Dmitry
> 
> 
> On Thu, 2004-12-16 at 19:38 +0530, Rajat Jain, Noida wrote:
> >  
> > Hi,
> > 
> > I'm experimenting on stock kernel 2.6.8
> > 
> > I was looking for an interface that could directly receive data from a 
> > network socket, WITHOUT copying from kernel space to user space. (Like 
> > for sending data, "sendfile" provides to send data to network socket 
> > without copying it to kernel space). I came across tcp_read_sock() 
> > interface in net/ipv4/tcp.c.
> > 
> > Has anybody tried tcp_read_sock()?? Is there any known issue with it 
> > ?? If somebody has some idea, I would appreciate if you can share.
> > 
> > I might be wrong, but what I perceive is that I will pass a pointer to 
> > this function. And when the function returns, I expect it to be set to 
> > the kernel buffer (corresponding to socket).
> > 
> > 1) To fulfill this objective, I expect to pass a pointer to pointer & 
> > only then it can be done. (If we have to modify a pointer's value, we 
> > have to pass its address ... Right??). However, this function expects 
> > a char * buf (in read_descriptor_t argument). Any ideas ?????????
> > 
> > 2) This code also frees the space allocated to sk_buffs etc using 
> > sk_eat_skb(sk, skb) and cleanup_rbuf(sk, copied) etc. But this 
> > function is supposed to return these locations to the calling code ... Right?
> > 
> > Any pointers are more than welcome. I have provided the code for reference.
> > Please cc the reply to me as I'm not on the list.
> > 
> > Thanks & regards,
> > 
> > Rajat Jain
> > 
> > ----------------------------------------------------------------------
> > /* net/ipv4/tcp.c
> >  * This routine provides an alternative to tcp_recvmsg() for routines
> >  * that would like to handle copying from skbuffs directly in 'sendfile'
> >  * fashion.
> >  * Note:
> >  *      - It is assumed that the socket was locked by the caller.
> >  *      - The routine does not block.
> >  *      - At present, there is no support for reading OOB data
> >  *        or for 'peeking' the socket using this routine
> >  *        (although both would be easy to implement).
> >  */
> > int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
> >                   sk_read_actor_t recv_actor) {
> >         struct sk_buff *skb;
> >         struct tcp_opt *tp = tcp_sk(sk);
> >         u32 seq = tp->copied_seq;
> >         u32 offset;
> >         int copied = 0;
> > 
> >         if (sk->sk_state == TCP_LISTEN)
> >                 return -ENOTCONN;
> >         while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
> >                 if (offset < skb->len) {
> >                         size_t used, len;
> > 
> >                         len = skb->len - offset;
> >                         /* Stop reading if we hit a patch of urgent data */
> >                         if (tp->urg_data) {
> >                                 u32 urg_offset = tp->urg_seq - seq;
> >                                 if (urg_offset < len)
> >                                         len = urg_offset;
> >                                 if (!len)
> >                                         break;
> >                         }
> >                         used = recv_actor(desc, skb, offset, len);
> >                         if (used <= len) {
> >                                 seq += used;
> >                                 copied += used;
> >                                 offset += used;
> >                         }
> >                         if (offset != skb->len)
> >                                 break;
> >                 }
> >                 if (skb->h.th->fin) {
> >                         sk_eat_skb(sk, skb);
> >                         ++seq;
> >                         break;
> >                 }
> >                 sk_eat_skb(sk, skb);
> >                 if (!desc->count)
> >                         break;
> >         }
> >         tp->copied_seq = seq;
> > 
> >         tcp_rcv_space_adjust(sk);
> > 
> >         /* Clean up data we have read: This will do ACK frames. */
> >         if (copied)
> >                 cleanup_rbuf(sk, copied);
> >         return copied;
> > }
> > ----------------------------------------------------------------------
> > 
> > read_descriptor_t is defined as:
> > 
> > /*
> >  * include/linux/fs.h
> >  */
> > typedef struct {
> >         size_t written;
> >         size_t count;
> >         union {
> >                 char __user * buf;
> >                 void *data;
> >         } arg;
> >         int error;
> > } read_descriptor_t;
> > ----------------------------------------------------------------------
> > 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: zero copy issue while receiving the data (counter part of sen dfil e)
@ 2004-12-17 16:24 ` Rajat  Jain, Noida
  0 siblings, 0 replies; 17+ messages in thread
From: Rajat  Jain, Noida @ 2004-12-17 16:24 UTC (permalink / raw)
  To: linux-newbie, linux-net, linux-kernel, kernelnewbies
  Cc: dima, Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida, Rajat Jain, Noida

 

Hi,

Thanks for the reply.

Actually I am developing a loadable kernel module. I agree that, at the bare
minimum, I need to copy from the NIC's device buffer to the kernel's allocated
sk_buff (socket buffer). What I want is to avoid FURTHER copying of data from
the sk_buffs to the buffers allocated by the module. 

Hence I expected to pass the address of a buffer pointer to
tcp_read_sock(), and I expected this function to set it to point at the
socket buffer. Any pointers on the functionality of tcp_read_sock()?

Rajat


-----Original Message-----
From: Dmitry Yusupov [mailto:dima@s2io.com] 
Sent: Friday, December 17, 2004 7:07 AM
To: Rajat Jain, Noida
Cc: linux-net@vger.kernel.org; Sanjay Kumar, Noida; Deepak Kumar Gupta,
Noida
Subject: Re: zero copy issue while receiving the data (counter part of
sendfile)

Hi Rajat,

I was using this function some time back... It's been working for me just
fine. Also, the kernel's RPC code (see the xprt* files) uses it, so you might
want to take a look.

In general, it is not possible to fully avoid copying. You need to at least
copy data from the NIC's skb to the destination. That might be a user buffer
or a kernel buffer (depending on the application).

Regards,
Dmitry


On Thu, 2004-12-16 at 19:38 +0530, Rajat Jain, Noida wrote:
>  
> Hi,
> 
> I'm experimenting on stock kernel 2.6.8
> 
> I was looking for an interface that could directly receive data from a 
> network socket, WITHOUT copying from kernel space to user space. (Like 
> for sending data, "sendfile" provides to send data to network socket 
> without copying it to kernel space). I came across tcp_read_sock() 
> interface in net/ipv4/tcp.c.
> 
> Has anybody tried tcp_read_sock()?? Is there any known issue with it 
> ?? If somebody has some idea, I would appreciate if you can share.
> 
> I might be wrong, but what I perceive is that I will pass a pointer to 
> this function. And when the function returns, I expect it to be set to 
> the kernel buffer (corresponding to socket).
> 
> 1) To fulfill this objective, I expect to pass a pointer to pointer & 
> only then it can be done. (If we have to modify a pointer's value, we 
> have to pass its address ... Right??). However, this function expects 
> a char * buf (in read_descriptor_t argument). Any ideas ?????????
> 
> 2) This code also frees the space allocated to sk_buffs etc using 
> sk_eat_skb(sk, skb) and cleanup_rbuf(sk, copied) etc. But this 
> function is supposed to return these locations to the calling code ... Right?
> 
> Any pointers are more than welcome. I have provided the code for reference.
> Please cc the reply to me as I'm not on the list.
> 
> Thanks & regards,
> 
> Rajat Jain
> 
> ----------------------------------------------------------------------
> /* net/ipv4/tcp.c
>  * This routine provides an alternative to tcp_recvmsg() for routines
>  * that would like to handle copying from skbuffs directly in 'sendfile'
>  * fashion.
>  * Note:
>  *      - It is assumed that the socket was locked by the caller.
>  *      - The routine does not block.
>  *      - At present, there is no support for reading OOB data
>  *        or for 'peeking' the socket using this routine
>  *        (although both would be easy to implement).
>  */
> int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
>                   sk_read_actor_t recv_actor) {
>         struct sk_buff *skb;
>         struct tcp_opt *tp = tcp_sk(sk);
>         u32 seq = tp->copied_seq;
>         u32 offset;
>         int copied = 0;
> 
>         if (sk->sk_state == TCP_LISTEN)
>                 return -ENOTCONN;
>         while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
>                 if (offset < skb->len) {
>                         size_t used, len;
> 
>                         len = skb->len - offset;
>                         /* Stop reading if we hit a patch of urgent data */
>                         if (tp->urg_data) {
>                                 u32 urg_offset = tp->urg_seq - seq;
>                                 if (urg_offset < len)
>                                         len = urg_offset;
>                                 if (!len)
>                                         break;
>                         }
>                         used = recv_actor(desc, skb, offset, len);
>                         if (used <= len) {
>                                 seq += used;
>                                 copied += used;
>                                 offset += used;
>                         }
>                         if (offset != skb->len)
>                                 break;
>                 }
>                 if (skb->h.th->fin) {
>                         sk_eat_skb(sk, skb);
>                         ++seq;
>                         break;
>                 }
>                 sk_eat_skb(sk, skb);
>                 if (!desc->count)
>                         break;
>         }
>         tp->copied_seq = seq;
> 
>         tcp_rcv_space_adjust(sk);
> 
>         /* Clean up data we have read: This will do ACK frames. */
>         if (copied)
>                 cleanup_rbuf(sk, copied);
>         return copied;
> }
> -----------------------------------------------------------------------
> 
> read_descriptor_t is defined as:
> 
> /*
>  * include/linux/fs.h
>  */
> typedef struct {
>         size_t written;
>         size_t count;
>         union {
>                 char __user * buf;
>                 void *data;
>         } arg;
>         int error;
> } read_descriptor_t;
> -----------------------------------------------------------------------
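
One detail of the quoted tcp_read_sock() that is easy to miss is the
urgent-data check: "urg_offset = tp->urg_seq - seq" is computed in u32, so it
stays correct when the sequence numbers wrap around. The sketch below
isolates that clamp in plain user-space C. The helper name clamp_to_urgent()
is hypothetical and does not exist in the kernel; it only mirrors the two
lines from the quoted loop.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns how many of `len`
 * bytes starting at sequence number `seq` may be read before the urgent
 * byte at `urg_seq` is reached. Mirrors the clamp in the quoted loop.
 */
static uint32_t clamp_to_urgent(uint32_t seq, uint32_t urg_seq, uint32_t len)
{
    /* Unsigned subtraction is modulo 2^32, so wraparound is harmless. */
    uint32_t urg_offset = urg_seq - seq;

    return urg_offset < len ? urg_offset : len;
}

/* Small self-check of the cases discussed in the text. */
static void demo_clamp(void)
{
    assert(clamp_to_urgent(100, 130, 50) == 30);  /* urgent byte inside range */
    assert(clamp_to_urgent(100, 1000, 50) == 50); /* urgent byte far ahead    */
    assert(clamp_to_urgent(100, 100, 50) == 0);   /* caller must stop reading */
}
```

A result of 0 corresponds to the "if (!len) break;" branch in the quoted
code: the next byte is the urgent one, so the loop stops.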

* RE: zero copy issue while receiving the data (counterpart of sendfile)
@ 2004-12-17 16:24 ` Rajat Jain, Noida
From: Rajat Jain, Noida @ 2004-12-17 16:24 UTC
  To: linux-newbie, linux-net, linux-kernel, kernelnewbies
  Cc: dima, Sanjay Kumar, Noida, Deepak Kumar Gupta, Noida, Rajat  Jain, Noida

 

Hi,

Thanks for the reply.

Actually, I am developing a loadable kernel module. I agree that, at the bare
minimum, the data must be copied from the NIC's device buffer to the
kernel-allocated sk_buff (socket buffer). What I want is to avoid any FURTHER
copying of data from the sk_buffs to buffers allocated by the module.

Hence I expected to pass the address of a buffer pointer to tcp_read_sock()
and have the function set it to point at the socket buffer. Any pointers on
how tcp_read_sock() works?
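
For illustration, here is a user-space model of the calling convention, not
kernel code. In the tcp_read_sock() source quoted in this thread, the data is
not returned through a pointer argument; the function calls
recv_actor(desc, skb, offset, len) for each queued skb, and desc->arg.data can
carry any private context. The names fake_skb, read_desc, seg_list,
ref_actor() and fake_read_sock() are all hypothetical stand-ins. The actor
below records where the payload lives instead of copying it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: just a payload and a length. */
struct fake_skb {
    const char *data;
    size_t len;
};

/* Same shape as the quoted read_descriptor_t, without the __user marker. */
typedef struct {
    size_t written;
    size_t count;
    union {
        char *buf;
        void *data;   /* opaque context handed through to the actor */
    } arg;
    int error;
} read_desc;

typedef size_t (*read_actor)(read_desc *, struct fake_skb *, size_t, size_t);

/* Private context: a list of (pointer, length) references into skb data. */
struct seg_ref  { const char *ptr; size_t len; };
struct seg_list { struct seg_ref seg[8]; size_t n; };

/* Actor that references the payload in place instead of copying it. */
static size_t ref_actor(read_desc *desc, struct fake_skb *skb,
                        size_t offset, size_t len)
{
    struct seg_list *list = desc->arg.data;

    if (list->n == 8)
        return 0;                 /* no room left: consume nothing */
    if (len > desc->count)
        len = desc->count;        /* honour the caller's byte budget */
    list->seg[list->n].ptr = skb->data + offset;
    list->seg[list->n].len = len;
    list->n++;
    desc->written += len;
    desc->count -= len;
    return len;
}

/* Simplified model of the loop: always starts each skb at offset 0. */
static size_t fake_read_sock(struct fake_skb *queue, size_t nskb,
                             read_desc *desc, read_actor actor)
{
    size_t copied = 0;

    for (size_t i = 0; i < nskb; i++) {
        size_t used = actor(desc, &queue[i], 0, queue[i].len);

        copied += used;
        if (used != queue[i].len || !desc->count)
            break;                /* partial read or budget exhausted */
    }
    return copied;
}
```

Note what the model leaves out: the quoted tcp_read_sock() calls
sk_eat_skb() on each skb after the actor returns, so a real actor that only
stores pointers would have to keep the data alive by some other means. The
sketch does not attempt to model that.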

Rajat


-----Original Message-----
From: Dmitry Yusupov [mailto:dima@s2io.com] 
Sent: Friday, December 17, 2004 7:07 AM
To: Rajat Jain, Noida
Cc: linux-net@vger.kernel.org; Sanjay Kumar, Noida; Deepak Kumar Gupta,
Noida
Subject: Re: zero copy issue while receiving the data (counterpart of sendfile)

Hi Rajat,

I used this function some time back, and it worked just fine for me. The
kernel's RPC code (see the xprt* files) also uses it, so you might want to
take a look there.

In general, it is not possible to avoid copying entirely. You need to copy
the data at least once, from the NIC's skb to the destination, which may be a
user buffer or a kernel buffer (depending on the application).
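
As a rough user-space sketch of that one unavoidable copy: the actor below
copies each segment into the caller's destination buffer and stops when the
buffer is full, and a later call resumes from the saved stream position. All
names here (fake_skb, copy_desc, copy_actor(), fake_read_stream()) are
hypothetical, and memcpy() merely stands in for a kernel helper such as
skb_copy_bits(), which is suggested elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff. */
struct fake_skb {
    const char *data;
    size_t len;
};

/* Hypothetical descriptor: bytes written so far, bytes still wanted, target. */
struct copy_desc {
    size_t written;
    size_t count;
    char *dst;
};

/* Copy up to desc->count bytes of the segment into the destination buffer. */
static size_t copy_actor(struct copy_desc *desc, const struct fake_skb *skb,
                         size_t offset, size_t len)
{
    if (len > desc->count)
        len = desc->count;
    memcpy(desc->dst + desc->written, skb->data + offset, len);
    desc->written += len;
    desc->count -= len;
    return len;
}

/*
 * Drain queued segments starting at stream position *seq, which plays the
 * role of copied_seq in the quoted code. Returns the number of bytes copied
 * and advances *seq so that the next call resumes where this one stopped.
 */
static size_t fake_read_stream(const struct fake_skb *q, size_t nskb,
                               size_t *seq, struct copy_desc *desc)
{
    size_t copied = 0;
    size_t base = 0;              /* stream position of the current segment */

    for (size_t i = 0; i < nskb && desc->count; i++) {
        size_t end = base + q[i].len;

        if (*seq < end) {
            size_t offset = *seq - base;
            size_t used = copy_actor(desc, &q[i], offset, q[i].len - offset);

            *seq += used;
            copied += used;
            if (offset + used != q[i].len)
                break;            /* segment only partly consumed */
        }
        base = end;
    }
    return copied;
}
```

With two four-byte segments and a six-byte destination, the first call copies
six bytes and leaves the stream position in the middle of the second segment;
a second call picks up the remaining two bytes.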

Regards,
Dmitry



