* sendfile @ 2003-04-30 14:28 Pål Halvorsen 2003-04-30 16:51 ` sendfile bert hubert 0 siblings, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-04-30 14:28 UTC (permalink / raw) To: linux-kernel; +Cc: paalh Hi! Does sendfile support UDP connections (SOCK_DGRAM)? Does sendfile remove ALL in-memory data copy operations? PS! Please cc me as I'm not currently a member of the list. Best regards, -ph ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-04-30 14:28 sendfile Pål Halvorsen @ 2003-04-30 16:51 ` bert hubert 2003-04-30 19:12 ` sendfile Pål Halvorsen 0 siblings, 1 reply; 22+ messages in thread From: bert hubert @ 2003-04-30 16:51 UTC (permalink / raw) To: Pål Halvorsen; +Cc: linux-kernel On Wed, Apr 30, 2003 at 04:28:32PM +0200, Pål Halvorsen wrote: > Hi! > > Does sendfile support UDP connections (SOCK_DGRAM)? Try it. I bet it doesn't do so, and certainly not usably. Blasting UDP frames is seldom useful without checks, like NFS performs. > Does sendfile remove ALL in-memory data copy operations? Depends. With some network adaptors it might. Definition of 'zero-copy' is somewhat misty. Some variants of zero-copy would actually be called 'one-copy' or 'minus-one-copy' in other contexts. Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-04-30 16:51 ` sendfile bert hubert @ 2003-04-30 19:12 ` Pål Halvorsen 2003-04-30 19:28 ` sendfile bert hubert 0 siblings, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-04-30 19:12 UTC (permalink / raw) To: bert hubert; +Cc: linux-kernel, Pål Halvorsen On Wed, 30 Apr 2003, bert hubert wrote: > > Does sendfile support UDP connections (SOCK_DGRAM)? > > Try it. I bet it doesn't do so, and certainly not usably. Blasting UDP > frames is seldom useful without checks, like NFS performs. It could be useful for applications like streaming video where other protocols on top provide additional functionality or in a multicast session where TCP might not be appropriate. > > Does sendfile remove ALL in-memory data copy operations? > > Depends. With some network adaptors it might. Definition of 'zero-copy' is > somewhat misty. Some variants of zero-copy would actually be called > 'one-copy' or 'minus-one-copy' in other contexts. But should not the 2.4.X kernels have support for chained sk_buffs (like the BSD mbufs) meaning that support for scatter-gather I/O from the NIC should be unnecessary to support zero-copy (i.e., NO in-memory data copy operations)? > Regards, > > bert Cheers, -ph ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-04-30 19:12 ` sendfile Pål Halvorsen @ 2003-04-30 19:28 ` bert hubert 2003-04-30 21:57 ` sendfile Pål Halvorsen 0 siblings, 1 reply; 22+ messages in thread From: bert hubert @ 2003-04-30 19:28 UTC (permalink / raw) To: Pål Halvorsen; +Cc: linux-kernel On Wed, Apr 30, 2003 at 09:12:17PM +0200, Pål Halvorsen wrote: > It could be useful for applications like streaming video where other > protocols on top provide additional functionality or in a multicast > session where TCP might not be appropriate. sendfile on UDP would try to send gigabits per second over ppp0... > But should not the 2.4.X kernels have support for chained sk_buffs (like > the BSD mbufs) meaning that support for scatter-gather I/O from the NIC > should be unnecessary to support zero-copy (i.e., NO in-memory data > copy operations)? No clue what you mean over here. Zero copy means different things to different people. Sendfile eliminates the 'read(to buffer);write(buffer to network);' copy. Some network drivers again may eliminate the 'copy_with_checksum()' step, allowing minus-one-copy, in zerocopy reference frame. Regards, bert -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO ^ permalink raw reply [flat|nested] 22+ messages in thread
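The copy bert describes can be made concrete with a small sketch. It is an illustration only, assuming Linux and its sendfile(2) from <sys/sendfile.h>; the function names copy_with_read_write and copy_with_sendfile are made up for this example and appear nowhere in the thread. The first function stages every chunk in a user-space buffer, the second asks the kernel to move the bytes itself.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Classic path: every chunk goes kernel -> user buffer -> kernel. */
ssize_t copy_with_read_write(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t total = 0, n;

    assert(in_fd >= 0 && out_fd >= 0);
    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {                  /* write() may be partial */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                perror("write");
                return -1;
            }
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* sendfile path: the kernel moves up to 'count' bytes, no user buffer. */
ssize_t copy_with_sendfile(int in_fd, int out_fd, size_t count)
{
    off_t offset = 0;                       /* start of the input file */
    size_t left = count;

    assert(in_fd >= 0 && out_fd >= 0);
    while (left > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, left);
        if (n < 0) {
            perror("sendfile");
            return -1;
        }
        if (n == 0)                         /* end of file reached early */
            break;
        left -= (size_t)n;
    }
    return (ssize_t)(count - left);
}
```

Either function can be pointed at a connected TCP socket as out_fd. How many copies remain inside the kernel on the sendfile path depends on the kernel version and the network driver, which is the point bert makes about 'zero-copy' being a loose term.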
* Re: sendfile 2003-04-30 19:28 ` sendfile bert hubert @ 2003-04-30 21:57 ` Pål Halvorsen 2003-04-30 22:18 ` sendfile Mark Mielke 0 siblings, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-04-30 21:57 UTC (permalink / raw) To: bert hubert; +Cc: linux-kernel On Wed, 30 Apr 2003, bert hubert wrote: > On Wed, Apr 30, 2003 at 09:12:17PM +0200, Pål Halvorsen wrote: > > > It could be useful for applications like streaming video where other > > protocols on top provide additional functionality or in a multicast > > session where TCP might not be appropriate. > > sendfile on UDP would try to send gigabits per second over ppp0... YES, I guess sendfile will send "count" bytes as fast as possible using UDP. However, can't sendfile be called several times, allowing the sender to keep track of the offset and byte count, e.g., sending the data needed for one second of video each second? Or does sendfile close the file/socket after each call (really making it useful for only whole file transfers at a time like retrieving a www-document)? > > But should not the 2.4.X kernels have support for chained sk_buffs (like > > the BSD mbufs) meaning that support for scatter-gather I/O from the NIC > > should be unnecessary to support zero-copy (i.e., NO in-memory data > > copy operations)? > > No clue what you mean over here. Zero copy means different things to > different people. Sendfile eliminates the 'read(to buffer);write(buffer to > network);' copy. First, zero-copy for me is to have no copy operations from one main memory location to another (not counting the transfer from disk to memory and from memory to NIC). Thus, I would like to read data into one memory location and transfer the same data from the same location to the NIC. I would like to be able to have data in several places in memory (like reading data from disk into several non-contiguous pages, e.g. using DMA). Then, I would like to be able to send these data without moving data to another memory location. 
If for example data for a packet is located in two different pages, I'd like to have a sk_buff pointing to each of these data areas and sending these two data chunks to the NIC without having to copy the data into one single, continuous memory region first before sending it to the NIC. The issue about "chained" sk_buffs is something I read in the Linux Journal (january issue I think) about sendfile. Taking a very brief look at the sk_buff code, I think skb->data could be pointing to a struct skb_shared_info { atomic_t dataref; unsigned int nr_frags; struct sk_buff *frag_list; skb_frag_t frags[MAX_SKB_FRAGS]; }; where each "frags" is a pointer to a page, the offset and the size. Thus, the sk_buff could be able to get data from several memory pages for a single packet!?? However, you will have a data transfer to the CPU calculating the checksum, but the data will not be put into another memory region (i.e., no copy operation). > Some network drivers again may eliminate the 'copy_with_checksum()' step, > allowing minus-one-copy, in zerocopy reference frame. Does this mean that if the NIC cannot perform the checksum on-board, the Linux communication system performs a "copy_with_checksum" copying the data to another location when performing the checksum, i.e., always giving a copy operation? -ph > Regards, > > bert ^ permalink raw reply [flat|nested] 22+ messages in thread
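The gather idea in this message has a rough userspace analogue in writev(2), sketched below. This is only an analogy and not kernel code: struct frag, struct packet, MAX_FRAGS and send_gathered are hypothetical names that loosely mirror the frags[] array quoted above, and writev still copies the data into kernel buffers. What the sketch shows is that the caller never has to assemble the fragments into one contiguous region first.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FRAGS 4   /* hypothetical limit, stands in for MAX_SKB_FRAGS */

/* One fragment: where the bytes live and how many there are. */
struct frag {
    const void *base;
    size_t len;
};

/* Hypothetical stand-in for "one packet built from several pages". */
struct packet {
    unsigned int nr_frags;
    struct frag frags[MAX_FRAGS];
};

/* Hand all fragments to the kernel in a single call. Nothing is first
 * copied into one contiguous user buffer. Returns bytes written or -1. */
ssize_t send_gathered(int fd, const struct packet *pkt)
{
    struct iovec iov[MAX_FRAGS];
    unsigned int i;

    assert(pkt->nr_frags <= MAX_FRAGS);
    for (i = 0; i < pkt->nr_frags; i++) {
        iov[i].iov_base = (void *)pkt->frags[i].base; /* writev only reads */
        iov[i].iov_len = pkt->frags[i].len;
    }
    return writev(fd, iov, (int)pkt->nr_frags);
}
```

On a connected socket the same call sends the two regions back to back; whether the NIC then gathers them by DMA or the kernel linearizes them first is a driver matter that this sketch does not address.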
* Re: sendfile 2003-04-30 21:57 ` sendfile Pål Halvorsen @ 2003-04-30 22:18 ` Mark Mielke 2003-04-30 22:34 ` sendfile Pål Halvorsen 0 siblings, 1 reply; 22+ messages in thread From: Mark Mielke @ 2003-04-30 22:18 UTC (permalink / raw) To: Pål Halvorsen; +Cc: bert hubert, linux-kernel On Wed, Apr 30, 2003 at 11:57:59PM +0200, Pål Halvorsen wrote: > On Wed, 30 Apr 2003, bert hubert wrote: > > On Wed, Apr 30, 2003 at 09:12:17PM +0200, Pål Halvorsen wrote: > > > It could be useful for applications like streaming video where other > > > protocols on top provide additional functionality or in a multicast > > > session where TCP might not be appropriate. > > sendfile on UDP would try to send gigabits per second over ppp0... > YES, I guess sendfile will send "count" bytes as fast as possible using > UDP. However, can't sendfile be called several times, allowing the > sender to keep track of the offset and byte count, e.g., sending the > data needed for one second of video each second? Or does sendfile > close the file/socket after each call (really making it useful for only > whole file transfers at a time like retrieving a www-document)? At some point, I would wonder 'why'? I've always considered the real benefit of sendfile() that the system never has to fully swap your process in, in order to do work on your behalf as would be necessary with read() and write(). The zero copy architecture doesn't seem significant to me if you are going to wait between sendfile() requests. > > > But should not the 2.4.X kernels have support for chained sk_buffs (like > > > the BSD mbufs) meaning that support for scatter-gather I/O from the NIC > > > should be unnecessary to support zero-copy (i.e., NO in-memory data > > > copy operations)? > > No clue what you mean over here. Zero copy means different things to > > different people. Sendfile eliminates the 'read(to buffer);write(buffer to > > network);' copy. 
> First, zero-copy for me is to have no copy operations from one main memory > location to another (not counting the transfer from disk to memory and > from memory to NIC). Thus, I would like to read data into one memory > location and transfer the same data from the same location to the NIC. To some degree, couldn't sendto() fit this description? (Assuming the kernel implemented 'zero-copy' on sendto()) The benefit of sendfile() is that data isn't coming from a memory location. It is coming from disk, meaning that your process doesn't have to become active in order for work to be done. In the case of UDP packets, you almost always want a layer on top that either times the UDP packet output, or sends output in response to input, mostly defeating the purpose of sendfile()... mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-04-30 22:18 ` sendfile Mark Mielke @ 2003-04-30 22:34 ` Pål Halvorsen 2003-05-01 4:28 ` sendfile Mark Mielke 0 siblings, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-04-30 22:34 UTC (permalink / raw) To: Mark Mielke; +Cc: bert hubert, linux-kernel, Pål Halvorsen On Wed, 30 Apr 2003, Mark Mielke wrote: > On Wed, Apr 30, 2003 at 11:57:59PM +0200, Pål Halvorsen wrote: > > On Wed, 30 Apr 2003, bert hubert wrote: > > > On Wed, Apr 30, 2003 at 09:12:17PM +0200, Pål Halvorsen wrote: > > > > It could be useful for applications like streaming video where other > > > > protocols on top provide additional functionality or in a multicast > > > > session where TCP might not be appropriate. > > > sendfile on UDP would try to send gigabits per second over ppp0... > > YES, I guess sendfile will send "count" bytes as fast as possible using > > UDP. However, can't sendfile be called several times, allowing the > > sender to keep track of the offset and byte count, e.g., sending the > > data needed for one second of video each second? Or does sendfile > > close the file/socket after each call (really making it useful for only > > whole file transfers at a time like retrieving a www-document)? > > At some point, I would wonder 'why'? I've always considered the real > benefit of sendfile() that the system never has to fully swap your > process in, in order to do work on your behalf as would be necessary > with read() and write(). The zero copy architecture doesn't seem > significant to me if you are going to wait between sendfile() > requests. OK, but what I want to do is to use a sendfile-like ("streamfile") system call for streaming multimedia data like video, where sending the whole file at once would require large buffers at the client (e.g., 4-5 GB for a DVD video). Thus, I would like to have a sending/transfer rate equal to the consumption rate. Sure, I can use read/write, mmap/write, etc. but these include copy operations and several address space switches. 
If I can have a system call saying "send data segment X to client Y" in one system call and no copy operations, I'll save resources on a heavily loaded machine..... > > > > But should not the 2.4.X kernels have support for chained sk_buffs (like > > > > the BSD mbufs) meaning that support for scatter-gatter I/O from the NIC > > > > should be unneccessary to support zero-copy (i.e., NO in-memory data > > > > copy operations)? > > > No clue what you mean over here. Zero copy means different things to > > > different people. Sendfile eliminates the 'read(to buffer);write(buffer to > > > network);' copy. > > First, zero-copy for me is to have no copy operations from one main memory > > location to another (not counting the transfer from disk to memory and > > from memory to NIC). Thus, I would like to read data into one memory > > location and transfer the same data form the same location to the NIC. > > To some degree, couldn't sendto() fit this description? (Assuming the kernel > implemented 'zero-copy' on sendto()) The benefit of sendfile() is that > data isn't coming from a memory location. It is coming from disk, meaning > that your process doesn't have to become active in order for work to be > done. In the case of UDP packets, you almost always want a layer on top > that either times the UDP packet output, or sends output in response to > input, mostly defeating the purpose of sendfile()... Maybe, but then I'll have two system calls... -ph > mark ^ permalink raw reply [flat|nested] 22+ messages in thread
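On the question of calling sendfile repeatedly: sendfile(2) on Linux takes an explicit offset pointer and a byte count, advances the offset itself, and leaves both descriptors open, so a caller can send one segment at a time. A minimal sketch under those assumptions follows; send_segment and stream_file are hypothetical helper names, the one-second sleep is a deliberately crude stand-in for real rate control, and whether out_fd may be a SOCK_DGRAM socket is left open, as in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Send up to 'count' bytes of in_fd, starting at *offset, to out_fd.
 * sendfile() advances *offset; neither descriptor is closed.
 * Returns the bytes sent (short at end of file) or -1 on error. */
ssize_t send_segment(int out_fd, int in_fd, off_t *offset, size_t count)
{
    size_t left = count;

    assert(offset != NULL);
    while (left > 0) {
        ssize_t n = sendfile(out_fd, in_fd, offset, left);
        if (n < 0) {
            perror("sendfile");
            return -1;
        }
        if (n == 0)                 /* end of file */
            break;
        left -= (size_t)n;
    }
    return (ssize_t)(count - left);
}

/* Stream a file at roughly bytes_per_second, one segment per second.
 * Returns 0 once the file is exhausted, -1 on error. */
int stream_file(int out_fd, int in_fd, size_t bytes_per_second)
{
    off_t offset = 0;

    for (;;) {
        ssize_t n = send_segment(out_fd, in_fd, &offset, bytes_per_second);
        if (n < 0)
            return -1;
        if ((size_t)n < bytes_per_second)   /* short segment: file is done */
            return 0;
        sleep(1);                           /* crude pacing */
    }
}
```

This still costs two system calls per segment (sendfile plus the sleep), which is the objection Mark raises in his reply.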
* Re: sendfile 2003-04-30 22:34 ` sendfile Pål Halvorsen @ 2003-05-01 4:28 ` Mark Mielke 2003-05-01 15:25 ` sendfile Joseph Malicki 2003-05-01 21:17 ` sendfile Pål Halvorsen 0 siblings, 2 replies; 22+ messages in thread From: Mark Mielke @ 2003-05-01 4:28 UTC (permalink / raw) To: Pål Halvorsen; +Cc: bert hubert, linux-kernel On Thu, May 01, 2003 at 12:34:32AM +0200, Pål Halvorsen wrote: > On Wed, 30 Apr 2003, Mark Mielke wrote: > > To some degree, couldn't sendto() fit this description? (Assuming the > > kernel implemented 'zero-copy' on sendto()) The benefit of sendfile() > > is that data isn't coming from a memory location. It is coming from disk, > > meaning that your process doesn't have to become active in order for work > > to be done. In the case of UDP packets, you almost always want a layer on > > top that either times the UDP packet output, or sends output in response > > to input, mostly defeating the purpose of sendfile()... > Maybe, but then I'll have two system calls... As I mentioned before, the real benefit to sendfile(), as I understand it, is that sendfile() makes it unnecessary for the OS to fully activate the calling process in order to do work for the calling process. Unless you can point out some other benefit provided by sendfile(), I fail to see how you will do:

    while (1) {
        send_frame_over_udp();
        sleep();
    }

Without two system calls. Whether send_frame_over_udp() uses sendfile() as you seem to want it to, or whether it just calls sendto(), doesn't make a difference. Because one of your requirements is that you need to provide a smooth feed, the primary benefit of sendfile(), that of not having to activate your process, becomes invalid.

I haven't done timings, or looked deeply at this part of linux-2.5.x, however, I fail to see why the following code should not meet your requirements:

    /* map the whole file read-only; check for MAP_FAILED in real code */
    const char *p = mmap(0, length_of_file, PROT_READ, MAP_SHARED, fd, 0);
    off_t offset = 0;

    while (offset < length_of_file)
    {
        /* at most 512 bytes per packet; the last one may be shorter */
        off_t remaining = length_of_file - offset;
        int packet_size = remaining < 512 ? (int)remaining : 512;
        send(sock, &p[offset], packet_size, 0);
        offset += packet_size;
        /* pace the output: one packet per 1/packets_per_second seconds */
        usleep(1000000 / packets_per_second);
    }

In theory, send() should be able to provide the zero copy benefits you are requesting. In practice, it might be a little harder, but in this case, from my perspective, send() and sendfile() should both provide equivalent performance. Why would sendfile() perform better than send()? mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 4:28 ` sendfile Mark Mielke @ 2003-05-01 15:25 ` Joseph Malicki 2003-05-01 21:17 ` sendfile Pål Halvorsen 1 sibling, 0 replies; 22+ messages in thread From: Joseph Malicki @ 2003-05-01 15:25 UTC (permalink / raw) To: Mark Mielke, Pål Halvorsen; +Cc: linux-kernel One major difference I've noticed is the interaction with the VM subsystem. When you have a large number of processes mmap'ing large files to send(), it really starts to tickle bugs and performance problems. sendfile() avoids this, only needing to use the page cache. -joe ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 4:28 ` sendfile Mark Mielke 2003-05-01 15:25 ` sendfile Joseph Malicki @ 2003-05-01 21:17 ` Pål Halvorsen 2003-05-01 22:31 ` sendfile Chris Friesen 1 sibling, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-05-01 21:17 UTC (permalink / raw) To: Mark Mielke; +Cc: bert hubert, linux-kernel, Pål Halvorsen On Thu, 1 May 2003, Mark Mielke wrote: > On Thu, May 01, 2003 at 12:34:32AM +0200, Pål Halvorsen wrote: > > On Wed, 30 Apr 2003, Mark Mielke wrote: > > > To some degree, couldn't sendto() fit this description? (Assuming the > > > kernel implemented 'zero-copy' on sendto()) The benefit of sendfile() > > > is that data isn't coming from a memory location. It is coming from disk, > > > meaning that your process doesn't have to become active in order for work > > > to be done. In the case of UDP packets, you almost always want a layer on > > > top that either times the UDP packet output, or sends output in response > > > to input, mostly defeating the purpose of sendfile()... > > Maybe, but then I'll have two system calls... > > As I mentioned before, the real benefit to sendfile(), as I understand it, is > that sendfile() makes it unnecessary for the OS to fully activate the calling > process in order to do work for the calling process. Unless you can point out > some other benefit provided by sendfile(), I fail to see how you will do: > > while (1) { > send_frame_over_udp(); > sleep(); > } > > Without two system calls. Whether send_frame_over_udp() uses sendfile() as > you seem to want it to, or whether it just calls sendto(), doesn't make a > difference. Because one of your requirements is that you need to provide a > smooth feed, the primary benefit of sendfile(), that of not having to activate > your process, becomes invalid. 
> > I haven't done timings, or looked deeply at this part of linux-2.5.x, > however, I fail to see why the following code should not meet your > requirements:
>
>     /* map the whole file read-only; check for MAP_FAILED in real code */
>     const char *p = mmap(0, length_of_file, PROT_READ, MAP_SHARED, fd, 0);
>     off_t offset = 0;
>
>     while (offset < length_of_file)
>     {
>         /* at most 512 bytes per packet; the last one may be shorter */
>         off_t remaining = length_of_file - offset;
>         int packet_size = remaining < 512 ? (int)remaining : 512;
>         send(sock, &p[offset], packet_size, 0);
>         offset += packet_size;
>         /* pace the output: one packet per 1/packets_per_second seconds */
>         usleep(1000000 / packets_per_second);
>     }
>
> In theory, send() should be able to provide the zero copy benefits you > are requesting. In practice, it might be a little harder, but in this > case, from my perspective, send() and sendfile() should both provide > equivalent performance. Why would sendfile() perform better than send()? As far as I understand mmap/send, you'll have a copy operation in the kernel here. mmap shares the kernel and user buffer, but when sending, the packet data is copied to the socket buffer!!?? OK, but I understand that my streaming scenario is not the target application for sendfile. Then, I have another question, so that I can maybe implement this myself. Can the network interface support gather operations, i.e., collecting data from several places for a packet ("DMA gather copy" from memory to NIC)? (As described in http://delivery.acm.org/10.1145/610000/603774/6345.html?key1=603774&key2=4582281501&coll=portal&dl=ACM&CFID=10149715&CFTOKEN=89922395, Linux Journal Volume 2003, Issue 105 (January 2003).) If so, does the sk_buff use the struct skb_shared_info to point to the different memory regions, or ...? -ph > mark ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 21:17 ` sendfile Pål Halvorsen @ 2003-05-01 22:31 ` Chris Friesen 2003-05-01 23:32 ` sendfile Ketil Froyn 2003-05-02 2:41 ` sendfile Mark Mielke 0 siblings, 2 replies; 22+ messages in thread From: Chris Friesen @ 2003-05-01 22:31 UTC (permalink / raw) To: Pål Halvorsen; +Cc: Mark Mielke, bert hubert, linux-kernel Pål Halvorsen wrote: > As far as i understand mmap/send, you'll have a copy operation in the > kernel here. mmap shares the kernel and user buffer, but when sending the > packet data is copied to the socket buffer!!?? Yes, there is a copy there. > OK, but I understand that my streaming scenario is not the target > application for sendfile. What stops you from using sendfile (with TCP) to each destination separately, with the client only reading from the pipe as needed (presumably with a number of frames worth of buffer on the client side)? Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 22:31 ` sendfile Chris Friesen @ 2003-05-01 23:32 ` Ketil Froyn 2003-05-02 9:02 ` sendfile Bernd Eckenfels 2003-05-02 2:41 ` sendfile Mark Mielke 1 sibling, 1 reply; 22+ messages in thread From: Ketil Froyn @ 2003-05-01 23:32 UTC (permalink / raw) To: Chris Friesen; +Cc: Pål Halvorsen, Mark Mielke, bert hubert, linux-kernel On Thu, 1 May 2003, Chris Friesen wrote: > Pål Halvorsen wrote: > > > OK, but I understand that my streaming scenario is not the target > > application for sendfile. > > What stops you from using sendfile (with TCP) to each destination > separately, with the client only reading from the pipe as needed > (presumably with a number of frames worth of buffer on the client > side)? I don't think TCP is suitable for streaming multimedia stuff to clients. For instance, if a packet does not arrive on the client, it's better to handle this in the client and skip a frame or show one of worse quality than to have the video stop while waiting for the server to resend. Ketil ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 23:32 ` sendfile Ketil Froyn @ 2003-05-02 9:02 ` Bernd Eckenfels 0 siblings, 0 replies; 22+ messages in thread From: Bernd Eckenfels @ 2003-05-02 9:02 UTC (permalink / raw) To: linux-kernel In article <Pine.LNX.4.40L0.0305020124050.1874-100000@ketil.hb.local> you wrote: > I don't think TCP is suitable for streaming multimedia stuff to clients. > For instance, if a packet does not arrive on the client, it's better to > handle this in the client and skip a frame or show one of worse quality > than to have the video stop while waiting for the server to resend. Yes, this is a problem, but on the other hand, if you want to stream to a large number of clients, you need to consider deployment and firewalling issues. Nearly all streaming applications out there nowadays offer at least a TCP (or HTTP) fallback, or use only TCP. Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-01 22:31 ` sendfile Chris Friesen 2003-05-01 23:32 ` sendfile Ketil Froyn @ 2003-05-02 2:41 ` Mark Mielke 2003-05-02 4:19 ` sendfile Chris Friesen 1 sibling, 1 reply; 22+ messages in thread From: Mark Mielke @ 2003-05-02 2:41 UTC (permalink / raw) To: Chris Friesen; +Cc: Pål Halvorsen, bert hubert, linux-kernel On Thu, May 01, 2003 at 06:31:05PM -0400, Chris Friesen wrote: > Pål Halvorsen wrote: > >As far as i understand mmap/send, you'll have a copy operation in the > >kernel here. mmap shares the kernel and user buffer, but when sending the > >packet data is copied to the socket buffer!!?? > Yes, there is a copy there. As far as I understand, sendfile() still requires the data to get from the disk to a page in memory, similar to how send() referencing an mmap()'d page may cause a page fault, reading the data from disk to a page in memory. One copy each. I don't know of a kernel interface that lets data be copied from disk to ethernet card without involving a temporary copy to be in paged memory at some point in time... perhaps the iSCSI stuff can do this? I dunno. Somebody else pointed out that mmap() may not yet be implemented completely optimally. I will have to look at the code before I continue to make my 'in theory' comments, however the following 'NOTE' in the manpage for sendfile makes me suspect that sendfile() is very similar to mmap()/write(): -- CUT -- Presently the descriptor from which data is read cannot correspond to a socket, it must correspond to a file which supports mmap()-like operations. -- CUT -- > >OK, but I understand that my streaming scenario is not the target > >application for sendfile. > What stops you from using sendfile (with TCP) to each destination > separately, with the client only reading from the pipe as needed > (presumably with a number of frames worth of buffer on the client side)? TCP isn't very well suited for video feeds. 
First, it is streamed, which makes it a little annoying to ensure that only whole frames get through. Second, its acknowledgement scheme prefers reliability over low latency. I'm hoping for good things from SCTP. From what I've read, it looks as if it should provide a compromise between TCP and UDP that is quite optimal. mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 22+ messages in thread
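The 'only whole frames' problem with a TCP byte stream is usually handled by framing on top of the stream. One common approach, sketched here as an assumption and not something proposed in the thread, is a fixed-size length prefix: the sender writes a 4-byte big-endian payload length before each frame, and the receiver waits until header and payload are both complete. FRAME_HDR, frame_put_header and frame_complete are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_HDR 4   /* hypothetical header: 4-byte big-endian payload length */

/* Fill in the length header for a frame with 'len' payload bytes. */
void frame_put_header(uint8_t hdr[FRAME_HDR], uint32_t len)
{
    hdr[0] = (uint8_t)(len >> 24);
    hdr[1] = (uint8_t)(len >> 16);
    hdr[2] = (uint8_t)(len >> 8);
    hdr[3] = (uint8_t)len;
}

/* Inspect the 'avail' bytes received so far. If they hold at least one
 * whole frame, store its payload length in *len and return the total
 * frame size (header plus payload). Otherwise return 0: keep reading. */
size_t frame_complete(const uint8_t *buf, size_t avail, uint32_t *len)
{
    uint32_t n;

    assert(buf != NULL && len != NULL);
    if (avail < FRAME_HDR)
        return 0;                   /* header itself is still incomplete */
    n = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (avail - FRAME_HDR < n)
        return 0;                   /* payload is still incomplete */
    *len = n;
    return (size_t)FRAME_HDR + n;
}
```

A receiver would append each recv() result to a buffer and call frame_complete() in a loop, consuming the returned number of bytes per frame. Framing does nothing about the latency concern in the second point, since a lost segment still stalls everything queued behind it.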
* Re: sendfile 2003-05-02 2:41 ` sendfile Mark Mielke @ 2003-05-02 4:19 ` Chris Friesen 2003-05-02 21:06 ` sendfile Mark Mielke 0 siblings, 1 reply; 22+ messages in thread From: Chris Friesen @ 2003-05-02 4:19 UTC (permalink / raw) To: Mark Mielke; +Cc: Pål Halvorsen, bert hubert, linux-kernel Mark Mielke wrote: > As far as I understand, sendfile() still requires the data to get from the > disk to a page in memory, similar to how send() referencing an mmap()'d page > may cause a page fault, reading the data from disk to a page in memory. One > copy each. I don't know of a kernel interface that lets data be copied from > disk to ethernet card without involving a temporary copy to be in paged > memory at some point in time... perhaps the iSCSI stuff can do this? I dunno. According to this: http://asia.cnet.com/builder/program/dev/0,39009360,39062783,00.htm using sendfile() is easier on the CPU due to less thrashing of the TLB. I do get your point about protocol limitations though. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-02 4:19 ` sendfile Chris Friesen @ 2003-05-02 21:06 ` Mark Mielke 2003-05-03 0:42 ` sendfile Miquel van Smoorenburg ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Mark Mielke @ 2003-05-02 21:06 UTC (permalink / raw) To: Chris Friesen; +Cc: Pål Halvorsen, bert hubert, linux-kernel On Fri, May 02, 2003 at 12:19:25AM -0400, Chris Friesen wrote: > According to this: > http://asia.cnet.com/builder/program/dev/0,39009360,39062783,00.htm > using sendfile() is easier on the CPU due to less thrashing of the TLB. Thanks for the link. It looks quite accurate. One question it raises in my mind is whether there would be value in improving write()/send() such that they detect that the userspace pointer refers entirely to mmap()'d file pages, and therefore no copy of data from userspace -> kernelspace should be performed. The pages could be loaded and accessed directly (as they are with sendfile()) rather than generating a page fault to load the pages. The TLB thrashing does not need to occur. I guess the first response to this question would be 'why not use sendfile()? it already exists, and people have already begun to use it...' My answer is that I don't like sendfile(). It is not-yet-standard, and is fairly limited. I could just be naive, but I think that: write(fd, mmapped_file_pages, length); could be transparently mapped to the sendfile() code in the kernel, minimizing the benefit of sendfile() having its own system call. It all comes down to optimization. The current implementation of mmap() is not optimal where mmap()'d file pages are passed as data to system calls. mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 22+ messages in thread
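The two approaches compared in the message above, write() on an mmap()'d region versus sendfile(), can be sketched side by side in C. This is only a minimal illustration written against the current Linux sendfile(2) interface, not code from anyone in the thread; the function names send_with_mmap_write, send_with_sendfile and send_path are hypothetical. Whether the sendfile() path really avoids every in-memory copy depends on the kernel version and the network adaptor, as discussed earlier in the thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Variant 1 (hypothetical helper): map the file and write() the mapping.
 * write() copies the bytes from the user mapping into kernel buffers. */
static int send_with_mmap_write(int out_fd, int in_fd, size_t len)
{
    if (len == 0)
        return 0;                       /* mmap() rejects zero-length maps */

    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in_fd, 0);
    if (map == MAP_FAILED)
        return -1;

    int rc = 0;
    size_t done = 0;
    while (done < len) {                /* write() may be partial */
        ssize_t n = write(out_fd, map + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        done += (size_t)n;
    }
    munmap(map, len);
    return rc;
}

/* Variant 2 (hypothetical helper): let the kernel move the data itself.
 * No user-space buffer or mapping is involved. */
static int send_with_sendfile(int out_fd, int in_fd, size_t len)
{
    off_t offset = 0;                   /* advanced by sendfile() */
    while ((size_t)offset < len) {
        ssize_t n = sendfile(out_fd, in_fd, &offset, len - (size_t)offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;                  /* file ended before len bytes */
    }
    return 0;
}

/* Open path and send the whole file to out_fd with the chosen variant. */
int send_path(int out_fd, const char *path, int use_sendfile)
{
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    int rc = -1;
    if (fstat(in_fd, &st) == 0) {
        size_t len = (size_t)st.st_size;
        rc = use_sendfile ? send_with_sendfile(out_fd, in_fd, len)
                          : send_with_mmap_write(out_fd, in_fd, len);
    }
    close(in_fd);
    return rc;
}
```

Calling send_path(sock, "file.bin", 1) would use sendfile(), and passing 0 as the last argument would use the mmap()/write() variant; both loop because either call may transfer fewer bytes than requested.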
* Re: sendfile 2003-05-02 21:06 ` sendfile Mark Mielke @ 2003-05-03 0:42 ` Miquel van Smoorenburg 2003-05-03 15:04 ` sendfile Mark Mielke 2003-05-03 12:52 ` sendfile Pål Halvorsen 2003-05-03 21:01 ` sendfile Pål Halvorsen 2 siblings, 1 reply; 22+ messages in thread From: Miquel van Smoorenburg @ 2003-05-03 0:42 UTC (permalink / raw) To: linux-kernel In article <20030502210648.GA5322@mark.mielke.cc>, Mark Mielke <mark@mark.mielke.cc> wrote: >One question it raises in my mind, is whether there would be value in >improving write()/send() such that they detect that the userspace >pointer refers entirely to mmap()'d file pages, and therefore no copy >of data from userspace -> kernelspace should be performed. You mean like http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week00/0056.html Mike. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-03 0:42 ` sendfile Miquel van Smoorenburg @ 2003-05-03 15:04 ` Mark Mielke 0 siblings, 0 replies; 22+ messages in thread From: Mark Mielke @ 2003-05-03 15:04 UTC (permalink / raw) To: Miquel van Smoorenburg; +Cc: linux-kernel On Sat, May 03, 2003 at 12:42:59AM +0000, Miquel van Smoorenburg wrote: > In article <20030502210648.GA5322@mark.mielke.cc>, > Mark Mielke <mark@mark.mielke.cc> wrote: > >One question it raises in my mind, is whether there would be value in > >improving write()/send() such that they detect that the userspace > >pointer refers entirely to mmap()'d file pages, and therefore no copy > >of data from userspace -> kernelspace should be performed. > You mean like > http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week00/0056.html Yes, definitely, and thank you for referring us to work that has already been done. mark -- mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com | Neighbourhood Coder | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-02 21:06 ` sendfile Mark Mielke 2003-05-03 0:42 ` sendfile Miquel van Smoorenburg @ 2003-05-03 12:52 ` Pål Halvorsen 2003-05-03 21:01 ` sendfile Pål Halvorsen 2 siblings, 0 replies; 22+ messages in thread From: Pål Halvorsen @ 2003-05-03 12:52 UTC (permalink / raw) To: Mark Mielke; +Cc: Chris Friesen, bert hubert, linux-kernel On Fri, 2 May 2003, Mark Mielke wrote: > On Fri, May 02, 2003 at 12:19:25AM -0400, Chris Friesen wrote: > > According to this: > > http://asia.cnet.com/builder/program/dev/0,39009360,39062783,00.htm > > using sendfile() is easier on the CPU due to less thrashing of the TLB. > > Thanks for the link. It looks quite accurate. > > One question it raises in my mind, is whether there would be value in > improving write()/send() such that they detect that the userspace > pointer refers entirely to mmap()'d file pages, and therefore no copy > of data from userspace -> kernelspace should be performed. The pages > could be loaded and accessed directly (as they are with sendfile()) > rather than generating a page fault to load the pages. The TLB thrashing > does not need to occur. > > I guess the first response to this question would be 'why not use > sendfile()? it already exists, and people have already begun to use > it...' > > My answer is that I don't like sendfile(). It is not-yet-standard, and > is fairly limited. I could just be naive, but I think that: > > write(fd, mmapped_file_pages, length); > > Could be transparently mapped to the sendfile() code in the kernel, > minimizing the benefit of sendfile() having its own system call. It all > comes down to optimization. The current implementation of mmap() is not > optimal where mmap()'d file pages are passed as data to system calls. This is somewhat similar to what I want to do as well. As long as sendfile can have this, why can't we make write/send/... behave similarly, thus removing the copy operation? Then one could more easily support streaming applications (or applications needing more control than sendfile)! -ph > mark ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-02 21:06 ` sendfile Mark Mielke 2003-05-03 0:42 ` sendfile Miquel van Smoorenburg 2003-05-03 12:52 ` sendfile Pål Halvorsen @ 2003-05-03 21:01 ` Pål Halvorsen 2003-05-04 0:53 ` sendfile Miquel van Smoorenburg 2 siblings, 1 reply; 22+ messages in thread From: Pål Halvorsen @ 2003-05-03 21:01 UTC (permalink / raw) To: Mark Mielke; +Cc: linux-kernel, Pål Halvorsen, miquels > Sat, May 03, 2003 at 12:42:59AM +0000, Miquel van Smoorenburg wrote: > > In article <20030502210648.GA5322@mark.mielke.cc>, > > Mark Mielke <mark@mark.mielke.cc> wrote: > > >One question it raises in my mind, is whether there would be value in > > >improving write()/send() such that they detect that the userspace > > >pointer refers entirely to mmap()'d file pages, and therefore no copy > > >of data from userspace -> kernelspace should be performed. > > You mean like > > http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week00/0056.html > > Yes, definitely, and thank you for referring us to work that has already > been done. > > mark Does this mean that if you memory map a file and send it through TCP, you'll have no copy operations transferring data from disk to NIC (except the DMA transfers disk->memory and memory->NIC)? Does there exist work implementing this also for UDP? -ph ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: sendfile 2003-05-03 21:01 ` sendfile Pål Halvorsen @ 2003-05-04 0:53 ` Miquel van Smoorenburg 0 siblings, 0 replies; 22+ messages in thread From: Miquel van Smoorenburg @ 2003-05-04 0:53 UTC (permalink / raw) To: Pål Halvorsen; +Cc: Mark Mielke, linux-kernel, Pål Halvorsen, miquels On Sat, 03 May 2003 23:01:21, Pål Halvorsen wrote: > > > Sat, May 03, 2003 at 12:42:59AM +0000, Miquel van Smoorenburg wrote: > > > In article <20030502210648.GA5322@mark.mielke.cc>, > > > Mark Mielke <mark@mark.mielke.cc> wrote: > > > >One question it raises in my mind, is whether there would be value in > > > >improving write()/send() such that they detect that the userspace > > > >pointer refers entirely to mmap()'d file pages, and therefore no copy > > > >of data from userspace -> kernelspace should be performed. > > > You mean like > > > http://hypermail.idiosynkrasia.net/linux-kernel/archived/2003/week00/0056.html > > > > Yes, definitely, and thank you for referring us to work that has already > > been done. > > > > mark > > Does this mean that if you memory map a file and send it through TCP, > you'll have no copy operations transferring data from disk to NIC (except > the DMA transfers disk->memory and memory->NIC)? No. I just referred to an earlier discussion about this topic. That doesn't mean it has been implemented. In fact if you actually read that discussion you'll see that it probably won't be implemented at all. Mike. -- | Miquel van Smoorenburg | "I know one million ways, to always pick | | miquels@{drinkel.,}cistron.nl | the wrong fantasy" - the Black Crowes. | ^ permalink raw reply [flat|nested] 22+ messages in thread
* sendfile @ 2001-05-24 8:44 Pål Halvorsen 0 siblings, 0 replies; 22+ messages in thread From: Pål Halvorsen @ 2001-05-24 8:44 UTC (permalink / raw) To: linux-kernel, torvalds; +Cc: paalh Hi! I'm a Norwegian PhD student looking at zero-copy data paths through the OS kernel and found sendfile to be interesting. Does this system call remove all in-memory copy operations, i.e., by sharing data buffers between the file system and the communication system? (I'm sending data from disk to the network.) Is there any documentation about sendfile? PS! I'm not a member of the mailing list so please cc the answers to my mailing address. Thank you in advance, -ph --- Paal Halvorsen | UniK - Center for technology at Kjeller, University of Oslo | PB. 70, N - 2027 KJELLER, Norway | Phone: +47 64844731 | Phone: +47 64844700 (switchboard) | Fax: +47 63818146 | E-mail: paalh@unik.no -- http://www.unik.no/~paalh ^ permalink raw reply [flat|nested] 22+ messages in thread