linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] 2.5.73 zlib #1 memmove
@ 2003-07-01 15:45 Jörn Engel
  2003-07-01 16:16 ` [PATCH RFC] 2.5.73 zlib #2 codefold Jörn Engel
  0 siblings, 1 reply; 4+ messages in thread
From: Jörn Engel @ 2003-07-01 15:45 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linux-kernel

[ I currently have some mail problems that might take a while to
  resolve.  wohnheim.fh-wedel.de has a new dns-entry and the old one
  has a max lifetime on one week. :(

  I'm trying to resolve that problem but please keep linux-kernel on
  CC:. That way I can at least read the archives until the problems
  are history. ]

Joakim found a performance issue in the zlib and after some back and
forth, it appears that memmove() should be the correct solution.

The code in question copies a string from the LZ77 sliding window into
the output buffer.  The string is 3-258 bytes long with a tendency
towards small strings.  The zlib uses this code to do the copy:

*q++ = *r++;  c--;
*q++ = *r++;  c--;
do {
	*q++ = *r++;
} while (--c);

The first two lines are loop unrolling.  Apart from being ugly, this
should also be slower than memmove(), so I propose this patch.

Jörn

-- 
Fantasy is more important than knowlegde. Knowlegde is limited,
while fantasy embraces the whole world.
-- Albert Einstein

--- linux-2.5.73/lib/zlib_inflate/inffast.c~zlib_memcpy	2003-06-30 03:51:54.000000000 +0200
+++ linux-2.5.73/lib/zlib_inflate/inffast.c	2003-06-30 04:22:32.000000000 +0200
@@ -20,6 +20,14 @@
 #define GRABBITS(j) {while(k<(j)){b|=((uLong)NEXTBYTE)<<k;k+=8;}}
 #define UNGRAB {c=z->avail_in-n;c=(k>>3)<c?k>>3:c;n+=c;p-=c;k-=c<<3;}
 
+static inline void memmove_update(Byte **dest, Byte **src, size_t *n)
+{
+	memmove(*dest, *src, *n);
+	*dest += *n;
+	*src += *n;
+	*n = 0;
+}
+
 /* Called with number of bytes left to write in window at least 258
    (the maximum string length) and number of input bytes available
    at least ten.  The ten bytes are six bytes for the longest length/
@@ -100,30 +108,18 @@
               if (c > e)
               {
                 c -= e;                         /* wrapped copy */
-                do {
-                    *q++ = *r++;
-                } while (--e);
+		memmove_update(&q, &r, &e);
                 r = s->window;
-                do {
-                    *q++ = *r++;
-                } while (--c);
+		memmove_update(&q, &r, &c);
               }
               else                              /* normal copy */
               {
-                *q++ = *r++;  c--;
-                *q++ = *r++;  c--;
-                do {
-                    *q++ = *r++;
-                } while (--c);
+		memmove_update(&q, &r, &c);
               }
             }
             else                                /* normal copy */
             {
-              *q++ = *r++;  c--;
-              *q++ = *r++;  c--;
-              do {
-                *q++ = *r++;
-              } while (--c);
+              memmove_update(&q, &r, &c);
             }
             break;
           }

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH RFC] 2.5.73 zlib #2 codefold
  2003-07-01 15:45 [PATCH RFC] 2.5.73 zlib #1 memmove Jörn Engel
@ 2003-07-01 16:16 ` Jörn Engel
  2003-07-02  8:46   ` Joakim Tjernlund
  0 siblings, 1 reply; 4+ messages in thread
From: Jörn Engel @ 2003-07-01 16:16 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linux-kernel

This patch folds three calls to memmove_update into one.  This is the
same structure that was in the 1.1.3 version of the zlib as well.  The
change towards 1.1.4 was mixed with a real bugfix, so it slipped
through my brain.

Jörn

-- 
When you close your hand, you own nothing. When you open it up, you
own the whole world.
-- Li Mu Bai in Tiger & Dragon

--- linux-2.5.73/lib/zlib_inflate/inffast.c~zlib_codefold	2003-06-30 03:57:31.000000000 +0200
+++ linux-2.5.73/lib/zlib_inflate/inffast.c	2003-06-30 04:11:37.000000000 +0200
@@ -99,28 +99,18 @@
             /* do the copy */
             m -= c;
             r = q - d;
-            if (r < s->window)                  /* wrap if needed */
-            {
-              do {
+            if (r < s->window) {                /* wrap if needed */
+	      while (r < s->window) {
                 r += s->end - s->window;        /* force pointer in window */
-              } while (r < s->window);          /* covers invalid distances */
+              }                                 /* covers invalid distances */
               e = s->end - r;
-              if (c > e)
-              {
+              if (c > e) {
                 c -= e;                         /* wrapped copy */
 		memmove_update(&q, &r, &e);
                 r = s->window;
-		memmove_update(&q, &r, &c);
-              }
-              else                              /* normal copy */
-              {
-		memmove_update(&q, &r, &c);
               }
             }
-            else                                /* normal copy */
-            {
-              memmove_update(&q, &r, &c);
-            }
+            memmove_update(&q, &r, &c);
             break;
           }
           else if ((e & 64) == 0)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH RFC] 2.5.73 zlib #2 codefold
  2003-07-01 16:16 ` [PATCH RFC] 2.5.73 zlib #2 codefold Jörn Engel
@ 2003-07-02  8:46   ` Joakim Tjernlund
  2003-07-02 16:59     ` Jörn Engel
  0 siblings, 1 reply; 4+ messages in thread
From: Joakim Tjernlund @ 2003-07-02  8:46 UTC (permalink / raw)
  To: =?UNKNOWN?Q?J=F6rn?= Engel; +Cc: linux-kernel

> This patch folds three calls to memmove_update into one.  This is the
> same structure that was in the 1.1.3 version of the zlib as well.  The
> change towards 1.1.4 was mixed with a real bugfix, so it slipped
> through my brain.
>
> Jörn
[SNIP]

Looks fine to me.

Here is another one in gen_bitlen():
Replace:
  for (bits = 0; bits <= MAX_BITS; bits++) s->bl_count[bits] = 0;
with:
  memset(&s->bl_count[0], 0, MAX_BITS * sizeof(s->bl_count[0]));

Also the following could should be replaced(in defutil.h):
/* ===========================================================================
 * Reverse the first len bits of a code, using straightforward code (a faster
 * method would use a table)
 * IN assertion: 1 <= len <= 15
 */
static inline unsigned bi_reverse(unsigned code, /* the value to invert */
				  int len)       /* its bit length */
{
    register unsigned res = 0;
    do {
        res |= code & 1;
        code >>= 1, res <<= 1;
    } while (--len > 0);
    return res >> 1;
}

Anybody have a table version handy?

 Joakim



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH RFC] 2.5.73 zlib #2 codefold
  2003-07-02  8:46   ` Joakim Tjernlund
@ 2003-07-02 16:59     ` Jörn Engel
  0 siblings, 0 replies; 4+ messages in thread
From: Jörn Engel @ 2003-07-02 16:59 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linux-kernel, Etienne Lorrain

On Wed, 2 July 2003 10:46:05 +0200, Joakim Tjernlund wrote:
> 
> > This patch folds three calls to memmove_update into one.  This is the
> > same structure that was in the 1.1.3 version of the zlib as well.  The
> > change towards 1.1.4 was mixed with a real bugfix, so it slipped
> > through my brain.
> >
> [SNIP]
> 
> Looks fine to me.
> 
> Here is another one in gen_bitlen():
> Replace:
>   for (bits = 0; bits <= MAX_BITS; bits++) s->bl_count[bits] = 0;
> with:
>   memset(&s->bl_count[0], 0, MAX_BITS * sizeof(s->bl_count[0]));
> 
> Also the following could should be replaced(in defutil.h):
> /* ===========================================================================
>  * Reverse the first len bits of a code, using straightforward code (a faster
>  * method would use a table)
>  * IN assertion: 1 <= len <= 15
>  */
> static inline unsigned bi_reverse(unsigned code, /* the value to invert */
> 				  int len)       /* its bit length */
> {
>     register unsigned res = 0;
>     do {
>         res |= code & 1;
>         code >>= 1, res <<= 1;
>     } while (--len > 0);
>     return res >> 1;
> }
> 
> Anybody have a table version handy?

Onto my unwritten todo list with them.  The next lazy afternoon will
come for sure.

Etiennes code sounds promising as well.  Will have a closer look one
of those afternoons.  If it does fit the description, it might be a
good alternative for those platforms it happens to run on.

On a whole, I think it is better to leave most changes out of
mainline until 2.7 is opened.  At least, unless someone comes up with
an extensive test suite for correctness, throughput and interactivity.
Volunteers? ;)

Jörn

-- 
Data dominates. If you've chosen the right data structures and organized
things well, the algorithms will almost always be self-evident. Data
structures, not algorithms, are central to programming.
-- Rob Pike

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2003-07-02 16:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-01 15:45 [PATCH RFC] 2.5.73 zlib #1 memmove Jörn Engel
2003-07-01 16:16 ` [PATCH RFC] 2.5.73 zlib #2 codefold Jörn Engel
2003-07-02  8:46   ` Joakim Tjernlund
2003-07-02 16:59     ` Jörn Engel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).