Hi!

> From: Matteo Croce
>
> Write a C version of memcpy() which uses the biggest data size allowed,
> without generating unaligned accesses.
>
> The procedure is made of three steps:
> First copy data one byte at time until the destination buffer is aligned
> to a long boundary.
> Then copy the data one long at time shifting the current and the next u8
> to compose a long at every cycle.
> Finally, copy the remainder one byte at time.
>
> On a BeagleV, the TCP RX throughput increased by 45%:
>
> before:
>
> $ iperf3 -c beaglev
> Connecting to host beaglev, port 5201
> [  5] local 192.168.85.6 port 44840 connected to 192.168.85.48 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  76.4 MBytes   641 Mbits/sec   27    624 KBytes
> [  5]   1.00-2.00   sec  72.5 MBytes   608 Mbits/sec    0    708 KBytes
>
> after:
>
> $ iperf3 -c beaglev
> Connecting to host beaglev, port 5201
> [  5] local 192.168.85.6 port 44864 connected to 192.168.85.48 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec   109 MBytes   912 Mbits/sec   48    559 KBytes
> [  5]   1.00-2.00   sec   108 MBytes   902 Mbits/sec    0    690 KBytes

That's really quite cool. Could you check whether it is your "optimized
unaligned" copy that makes the difference?

> +/* convenience union to avoid cast between different pointer types */
> +union types {
> +	u8 *as_u8;
> +	unsigned long *as_ulong;
> +	uintptr_t as_uptr;
> +};
> +
> +union const_types {
> +	const u8 *as_u8;
> +	unsigned long *as_ulong;
> +	uintptr_t as_uptr;
> +};

Missing consts here?

Plus... this is really "interesting" coding style. I'd just use casts in
the kernel.

Regards,
								Pavel
-- 
http://www.livejournal.com/~pavelmachek
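
For reference, the three-step procedure quoted above can be sketched in
plain userspace C, using casts rather than the unions. This is only an
illustration under stated assumptions, not the code from the patch: the
names sketch_memcpy and word_t are hypothetical, the may_alias attribute
and the __BYTE_ORDER__ macros are GCC/Clang extensions, and the first
partial word is assembled from bytes so that nothing outside the source
buffer is ever read.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: GCC/Clang. may_alias lets byte buffers be accessed as words
 * without strict-aliasing trouble (the kernel uses -fno-strict-aliasing).
 */
typedef unsigned long __attribute__((__may_alias__)) word_t;

#define WORD_SIZE sizeof(word_t)
#define WORD_MASK (WORD_SIZE - 1)

/* Hypothetical name; an illustration, not the function from the patch. */
void *sketch_memcpy(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	/* Step 1: copy bytes until the destination is word aligned. */
	for (; count && ((uintptr_t)d & WORD_MASK); count--)
		*d++ = *s++;

	/* How far the source sits above the previous aligned address. */
	size_t distance = (uintptr_t)s & WORD_MASK;

	if (distance == 0) {
		/* Source is aligned as well: plain word-by-word copy. */
		for (; count >= WORD_SIZE; count -= WORD_SIZE) {
			*(word_t *)d = *(const word_t *)s;
			d += WORD_SIZE;
			s += WORD_SIZE;
		}
	} else if (count >= 2 * WORD_SIZE - distance) {
		/* Step 2: aligned loads only, merging two words per store. */
		const unsigned char *sn = s + (WORD_SIZE - distance);
		word_t last, next = 0;
		unsigned char *p = (unsigned char *)&next;
		size_t i;

		/* Build the first partial word from bytes inside the buffer. */
		for (i = distance; i < WORD_SIZE; i++)
			p[i] = s[i - distance];

		/* Loop only while the next aligned word lies fully in the source. */
		while (count >= 2 * WORD_SIZE - distance) {
			last = next;
			next = *(const word_t *)sn;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			*(word_t *)d = last >> (distance * 8) |
				       next << ((WORD_SIZE - distance) * 8);
#else
			*(word_t *)d = last << (distance * 8) |
				       next >> ((WORD_SIZE - distance) * 8);
#endif
			d += WORD_SIZE;
			s += WORD_SIZE;
			sn += WORD_SIZE;
			count -= WORD_SIZE;
		}
	}

	/* Step 3: copy whatever is left one byte at a time. */
	while (count--)
		*d++ = *s++;

	return dest;
}
```

The loop bound is deliberately conservative: it stops while up to two
words minus two bytes remain, and leaves those to the byte loop, so the
aligned loads never run past the end of the source.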