linux-kernel.vger.kernel.org archive mirror
* linux-firmware binary corruption with gitweb
@ 2009-02-28 19:24 Dave
  2009-03-01  5:47 ` [Orinoco-users] " Pavel Roskin
  0 siblings, 1 reply; 7+ messages in thread
From: Dave @ 2009-02-28 19:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: dwmw2, orinoco-users

I'm aware of at least a couple users of orinoco who have picked up
corrupt firmware# from the linux-firmware tree*.

I've verified that the firmware in the repository itself is correct.

It appears that downloading the file using the blob/raw links from
gitweb causes the corruption (0xc3 everywhere). At least it does with
firefox.

Is this known/expected behaviour?


Thanks,

Dave.

#<http://marc.info/?l=orinoco-users&m=123411762524637>
*<http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=shortlog>

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-02-28 19:24 linux-firmware binary corruption with gitweb Dave
@ 2009-03-01  5:47 ` Pavel Roskin
  2009-03-03 18:59   ` Dave
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Roskin @ 2009-03-01  5:47 UTC (permalink / raw)
  To: Dave; +Cc: linux-kernel, orinoco-users, dwmw2

On Sat, 2009-02-28 at 19:24 +0000, Dave wrote:
> I'm aware of at least a couple users of orinoco who have picked up
> corrupt firmware# from the linux-firmware tree*.
> 
> I've verified that the firmware in the repository itself is correct.
> 
> It appears that downloading the file using the blob/raw links from
> gitweb causes the corruption (0xc3 everywhere). At least it does with
> firefox.

I can confirm the problem with Firefox 3.0.6.  But it's not "0xc3
everywhere".  The corrupted file is a result of recoding from iso-8859-1
to utf-8.  The correct agere_sta_fw.bin is 65046 bytes long.  The
corrupted agere_sta_fw.bin is 89729 bytes long.
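
The size difference fits that explanation. When ISO-8859-1 is recoded to UTF-8, every byte below 0x80 stays one byte and every byte from 0x80 to 0xFF becomes two bytes with a lead byte of 0xC2 or 0xC3, which would also account for 0xc3 showing up so often. A minimal Python sketch, using made-up sample bytes rather than the real firmware (`corrupt` is a hypothetical helper name):

```python
def corrupt(blob: bytes) -> bytes:
    """Treat each byte as an ISO-8859-1 character and re-encode it as UTF-8."""
    return blob.decode("iso-8859-1").encode("utf-8")

# Made-up sample data, not the real firmware.
sample = bytes([0x00, 0x7F, 0x80, 0xBF, 0xC0, 0xFF])
mangled = corrupt(sample)
print(mangled.hex())

# Every byte >= 0x80 grows by exactly one byte; the rest are unchanged.
high_bytes = sum(1 for b in sample if b >= 0x80)
assert len(mangled) == len(sample) + high_bytes

# Applied to the sizes reported above: the growth equals the number of
# bytes >= 0x80 that the original file would have to contain.
implied_high_bytes = 89729 - 65046
print(implied_high_bytes)
```

If the explanation holds, the correct 65046-byte file should contain exactly 24683 bytes with the high bit set.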

The original binary can be recovered from the corrupted file with GNU recode:
recode utf8..iso8859-1 agere_sta_fw.bin
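
Where recode is not installed, the same reversal can be sketched in Python. This assumes the file was mangled exactly once by the ISO-8859-1 to UTF-8 conversion; `repair` is a hypothetical helper name and the sample bytes are made up.

```python
def repair(mangled: bytes) -> bytes:
    """Undo one ISO-8859-1 -> UTF-8 re-encoding.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    UnicodeEncodeError if it contains characters above U+00FF, i.e. if
    the file was not corrupted in the assumed way.
    """
    return mangled.decode("utf-8").encode("iso-8859-1")

# Round trip on made-up data: corrupt it the assumed way, then repair it.
original = bytes([0x01, 0x80, 0xC3, 0xFF])
mangled = original.decode("iso-8859-1").encode("utf-8")
repaired = repair(mangled)
assert repaired == original
```

For a real download, the file would be read and written in binary mode, and the repaired agere_sta_fw.bin should come out 65046 bytes long.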

wget 1.11.4 also gets a corrupted file 89729 bytes long.

$ wget "http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=blob;f=agere_sta_fw.bin;h=bae000f5a7162f5a5b052a2f5b78016e95f825c5;hb=d4cfa9f14c55e9d62f053a542fac21744f22546b"
--2009-03-01 00:42:38--  http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=blob;f=agere_sta_fw.bin;h=bae000f5a7162f5a5b052a2f5b78016e95f825c5;hb=d4cfa9f14c55e9d62f053a542fac21744f22546b
Resolving git.kernel.org... 204.152.191.40, 149.20.20.136
Connecting to git.kernel.org|204.152.191.40|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]
Saving to: `index.html?p=linux%2Fkernel%2Fgit%2Fdwmw2%2Flinux-firmware.git;a=blob;f=agere_sta_fw.bin;h=bae000f5a7162f5a5b052a2f5b78016e95f825c5;hb=d4cfa9f14c55e9d62f053a542fac21744f22546b'

    [  <=>                                                  ] 89,729       237K/s   in 0.4s    

2009-03-01 00:42:39 (237 KB/s) - `index.html?p=linux%2Fkernel%2Fgit%2Fdwmw2%2Flinux-firmware.git;a=blob;f=agere_sta_fw.bin;h=bae000f5a7162f5a5b052a2f5b78016e95f825c5;hb=d4cfa9f14c55e9d62f053a542fac21744f22546b' saved [89729]

curl 7.18.2 also gets the corrupted file:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 89729    0 89729    0     0   111k      0 --:--:-- --:--:-- --:--:--  191k

My strong impression is that the recoding takes place on the server.  I
think the bug should be reported to the gitweb maintainers unless it is a
local breakage on the kernel.org site.

-- 
Regards,
Pavel Roskin

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-03-01  5:47 ` [Orinoco-users] " Pavel Roskin
@ 2009-03-03 18:59   ` Dave
  2009-03-04  0:26     ` Jakub Narebski
  0 siblings, 1 reply; 7+ messages in thread
From: Dave @ 2009-03-03 18:59 UTC (permalink / raw)
  To: Pavel Roskin, git; +Cc: linux-kernel, orinoco-users, dwmw2

Adding the git mailing list.

Pavel Roskin wrote:
> On Sat, 2009-02-28 at 19:24 +0000, Dave wrote:
>> I'm aware of at least a couple users of orinoco who have picked up
>> corrupt firmware# from the linux-firmware tree*.
>>
>> I've verified that the firmware in the repository itself is correct.
>>
>> It appears that downloading the file using the blob/raw links from
>> gitweb causes the corruption (0xc3 everywhere). At least it does with
>> firefox.
> 
> I can confirm the problem with Firefox 3.0.6.  But it's not "0xc3
> everywhere".  The corrupted file is a result of recoding from iso-8859-1
> to utf-8.  The correct agere_sta_fw.bin is 65046 bytes long.  The
> corrupted agere_sta_fw.bin is 89729 bytes long.
> 
> There is a way to recode the original binary with GNU recode:
> recode utf8..iso8859-1 agere_sta_fw.bin
> 
> wget 1.11.4 also gets a corrupted file 89729 bytes long.
> 
> curl 7.18.2 also gets the corrupted file:
> 
> My strong impression is that the recoding takes place on the server.  I
> think the bug should be reported to the gitweb maintainers unless it is a
> local breakage on the kernel.org site.

Thanks Pavel.

I just did a quick scan of the gitweb README - is this an issue with the
$mimetypes_file or $fallback_encoding configuration variables?


Regards,

Dave.

#<http://marc.info/?l=orinoco-users&m=123411762524637>
*<http://git.kernel.org/?p=linux/kernel/git/dwmw2/linux-firmware.git;a=shortlog>

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-03-03 18:59   ` Dave
@ 2009-03-04  0:26     ` Jakub Narebski
  2009-03-04 23:52       ` Dave
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Narebski @ 2009-03-04  0:26 UTC (permalink / raw)
  To: Dave; +Cc: Pavel Roskin, git, linux-kernel, orinoco-users, dwmw2

Dave <kilroyd@googlemail.com> writes:

> Adding the git mailing list.
> 
> Pavel Roskin wrote:
> > On Sat, 2009-02-28 at 19:24 +0000, Dave wrote:

>>> I'm aware of at least a couple users of orinoco who have picked up
>>> corrupt firmware# from the linux-firmware tree*.
>>>
>>> I've verified that the firmware in the repository itself is correct.
>>>
>>> It appears that downloading the file using the blob/raw links from
>>> gitweb causes the corruption (0xc3 everywhere). At least it does with
>>> firefox.
>> 
>> I can confirm the problem with Firefox 3.0.6.  But it's not "0xc3
>> everywhere".  The corrupted file is a result of recoding from iso-8859-1
>> to utf-8.  The correct agere_sta_fw.bin is 65046 bytes long.  The
>> corrupted agere_sta_fw.bin is 89729 bytes long.

[...]
>> My strong impression is that the recoding takes place on the server.  I
>> think the bug should be reported to the gitweb maintainers unless it is a
>> local breakage on the kernel.org site.
> 
> Thanks Pavel.
> 
> I just did a quick scan of the gitweb README - is this an issue with the
> $mimetypes_file or $fallback_encoding configuration variables?

First, what version of gitweb do you use? It should be in the 'Generator'
meta header, or (in older gitweb) in comments in the HTML source at the
top of the page.

Second, the file is actually sent to the browser 'as is', using binmode :raw
(or at least it should be, according to my understanding of Perl). A *.bin
binary file gets the application/octet-stream mimetype, and gitweb doesn't
send any charset info for it. git.kernel.org should have a modern enough
gitweb to do this. Strange...

-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-03-04  0:26     ` Jakub Narebski
@ 2009-03-04 23:52       ` Dave
  2009-03-05 17:26         ` Pavel Roskin
  2009-03-06  0:03         ` Jakub Narebski
  0 siblings, 2 replies; 7+ messages in thread
From: Dave @ 2009-03-04 23:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Pavel Roskin, git, linux-kernel, orinoco-users, dwmw2

Jakub Narebski wrote:
> Dave <kilroyd@googlemail.com> writes:
>>> My strong impression is that the recoding takes place on the server.  I
>>> think the bug should be reported to the gitweb maintainers unless it is a
>>> local breakage on the kernel.org site.
>> Thanks Pavel.
>>
>> I just did a quick scan of the gitweb README - is this an issue with the
>> $mimetypes_file or $fallback_encoding configuration variables?
> 
> First, what version of gitweb do you use? It should be in 'Generator'
> meta header, or (in older gitweb) in comments in HTML source at the
> top of the page.

Not sure where I'd find the meta header, but at the top of the HTML:

<!-- git web interface version 1.4.5-rc0.GIT-dirty, (C) 2005-2006, Kay
Sievers <kay.sievers@vrfy.org>, Christian Gierke -->
<!-- git core binaries version 1.6.1.1 -->

> Second, the file is actually sent to browser 'as is', using binmode :raw
> (or at least should be according to my understanding of Perl). And *.bin
> binary file gets application/octet-stream mimetype, and doesn't send any
> charset info. git.kernel.org should have modern enough gitweb to use this.
> Strange...

Dug around gitweb.perl in the main git repo. Then looked at the
git/warthog9/gitweb.git repo (after noting the Git Wiki says kernel.org
is running John Hawley's branch).

One notable change to git_blob_plain:

        undef $/;
        binmode STDOUT, ':raw';
-        print <$fd>;
+        #print <$fd>;
+        $output .= <$fd>;
        binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
        $/ = "\n";

        close $fd;
+
+        return $output;

If that's the code that's running, doesn't that mean the binmode change
has no effect on the concatenation to $output? The blob would then get
UTF-8 encoded when $output is actually printed.
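
The suspected effect can be modelled outside Perl. The Python sketch below is an analogy, not the kernel.org code: it assumes the blob bytes are held in a string buffer, one character per byte, and that the buffer is later written through a UTF-8 encoding layer, which is the behaviour expected of a Perl byte string printed to a :utf8 handle. All function names are hypothetical.

```python
import io

def serve_direct(blob: bytes) -> bytes:
    """Model of `binmode STDOUT, ':raw'; print <$fd>;`: bytes go out untouched."""
    out = io.BytesIO()
    out.write(blob)
    return out.getvalue()

def serve_buffered(blob: bytes) -> bytes:
    """Model of `$output .= <$fd>;` followed by a later print through :utf8."""
    output = blob.decode("iso-8859-1")   # buffer holds one character per byte
    raw = io.BytesIO()
    # newline="" disables newline translation so only the encoding differs.
    stdout = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    stdout.write(output)                 # the encoding layer is applied here
    stdout.flush()
    return raw.getvalue()

blob = bytes(range(256))                 # made-up data covering every byte value
direct = serve_direct(blob)
buffered = serve_buffered(blob)
assert direct == blob
assert len(buffered) == 256 + 128        # each of the 128 high bytes doubles
```

In this model the temporary switch to raw mode has no effect, because nothing is written while it is active.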


Regards,

Dave.

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-03-04 23:52       ` Dave
@ 2009-03-05 17:26         ` Pavel Roskin
  2009-03-06  0:03         ` Jakub Narebski
  1 sibling, 0 replies; 7+ messages in thread
From: Pavel Roskin @ 2009-03-05 17:26 UTC (permalink / raw)
  To: Dave
  Cc: Jakub Narebski, git, linux-kernel, orinoco-users, dwmw2,
	John 'Warthog9' Hawley

On Wed, 2009-03-04 at 23:52 +0000, Dave wrote:
>         binmode STDOUT, ':raw';
> -        print <$fd>;
> +        #print <$fd>;
> +        $output .= <$fd>;
>         binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi

Nice catch!

Looking at the gitweb repository from kernel.org, two instances of
circumventing binmode were introduced by this commit:

commit c79ae555fb3c89d91b4cafbfce306e695720507b
Author: John Hawley <warthog9@voot-cruiser.localdomain>
Date:   Thu Dec 28 21:59:43 2006 -0800

    Last of the changes to deal with channeling the text through the caching
    engine.  Wow is this a total hack.

The original behavior was restored in git_snapshot() by the recent
commit c15229acd9bedf165f1eb05d99fa989d3b9f3e32, but git_blob_plain()
remains broken.

I don't see an easy fix.  We cannot manipulate the blob to counteract
the encoding, as it may not be valid utf-8, and therefore won't be
output in the utf-8 mode.

Maybe binmode should be raw everywhere, and adding to $output should
recode data to utf-8 from other encodings where needed, but it would be
a massive patch, I'm afraid.  Or it would be a small patch requiring
massive testing.
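
A rough sketch of that second idea, in Python rather than Perl and with hypothetical names throughout: the output buffer holds bytes only, text is encoded to UTF-8 explicitly at the point where it is appended, and blobs are appended untouched. The sketch also shows why the blob cannot simply be decoded beforehand to cancel out a later encoding step.

```python
class OutputBuffer:
    """Byte-only buffer; callers state whether they append text or raw data."""

    def __init__(self):
        self._chunks = []

    def add_text(self, text: str) -> None:
        self._chunks.append(text.encode("utf-8"))   # explicit, per-call encoding

    def add_raw(self, data: bytes) -> None:
        self._chunks.append(data)                   # no recoding at all

    def getvalue(self) -> bytes:
        return b"".join(self._chunks)

blob = bytes([0xC3, 0x28, 0xFF])   # made-up binary data that is not valid UTF-8

# Pre-decoding the blob to counteract the output encoding is not possible.
try:
    blob.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

out = OutputBuffer()
out.add_text("Content-Type: application/octet-stream\r\n\r\n")
out.add_raw(blob)
result = out.getvalue()
assert not decodable
assert result.endswith(blob)
```

The cost named above shows up here as well: every place that appends to the buffer has to choose between the text and raw paths, which is why such a change would need broad testing.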

Adding John Hawley to cc:

-- 
Regards,
Pavel Roskin

* Re: [Orinoco-users] linux-firmware binary corruption with gitweb
  2009-03-04 23:52       ` Dave
  2009-03-05 17:26         ` Pavel Roskin
@ 2009-03-06  0:03         ` Jakub Narebski
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2009-03-06  0:03 UTC (permalink / raw)
  To: Dave; +Cc: Pavel Roskin, git, linux-kernel, orinoco-users, dwmw2, J.H.

On Thu, 5 March 2009, Dave wrote:
> Jakub Narebski wrote:
>> Dave <kilroyd@googlemail.com> writes:

>>>> My strong impression is that the recoding takes place on the server.  I
>>>> think the bug should be reported to the gitweb maintainers unless it is a
>>>> local breakage on the kernel.org site.

It is on the server, but kernel.org runs a modified version of gitweb, and
the bug is in the modifications.  See below.

CC-ed John 'Warthog9' Hawley, maintainer of gitweb on kernel.org.

>>>>
>>> Thanks Pavel.
>>>
>>> I just did a quick scan of the gitweb README - is this an issue with the
>>> $mimetypes_file or $fallback_encoding configuration variables?
>> 
>> First, what version of gitweb do you use? It should be in 'Generator'
>> meta header, or (in older gitweb) in comments in HTML source at the
>> top of the page.
> 
> Not sure where I'd find the meta header,

<meta name="generator" content="gitweb/1.4.5-rc0.GIT-dirty git/1.6.1.1"/>

> but at the top of the HTML: 
> 
> <!-- git web interface version 1.4.5-rc0.GIT-dirty, (C) 2005-2006, Kay
> Sievers <kay.sievers@vrfy.org>, Christian Gierke -->
> <!-- git core binaries version 1.6.1.1 -->

The question was whether it is an extremely old version of gitweb, without
the fix for raw blob ('blob_plain') output of non-utf8, non-text files. But
the answer is that it is a _modified_ version of gitweb, see below.

> 
>> Second, the file is actually sent to browser 'as is', using binmode :raw
>> (or at least should be according to my understanding of Perl). And *.bin
>> binary file gets application/octet-stream mimetype, and doesn't send any
>> charset info. git.kernel.org should have modern enough gitweb to use this.
>> Strange...
> 
> Dug around gitweb.perl in the main git repo. Then looked at the
> git/warthog9/gitweb.git repo (after noting the Git Wiki says kernel.org
> is running John Hawley's branch).
> 
> One notable change to git_blob_plain:
> 
>         undef $/;
>         binmode STDOUT, ':raw';
> -        print <$fd>;
> +        #print <$fd>;
> +        $output .= <$fd>;
>         binmode STDOUT, ':utf8'; # as set at the beginning of gitweb.cgi
>         $/ = "\n";
> 
>         close $fd;
> +
> +        return $output;
> 
> If that's the code that's running, doesn't that mean the output mode
> change doesn't impact the concatenation to $output? So the blob gets utf
> encoding when actually printed.

That is the culprit. kernel.org runs a modified version of gitweb, with
added caching.  I guess that the above change was made to get 'blob_plain'
output cached... but it loses "rawness", and I guess it also loses
mimetype info (unless "print $cgi->header(...)" is also changed to
append to $output).

One possible solution would be to redirect STDOUT to a scalar and return
that scalar; do that always when caching _output_, and print all cached
_output_ data in :raw mode.
    close STDOUT;
    open STDOUT, '>', \$output or die "Can't open STDOUT: $!";
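
A rough Python analogue of that idea, with hypothetical names and no claim to match the kernel.org caching code: handlers keep writing to whatever stdout currently is, the caching layer swaps in an in-memory byte buffer, and the captured bytes are what gets cached.

```python
import contextlib
import io
import sys

def capture(handler, *args) -> bytes:
    """Run handler with stdout redirected to memory; return the exact bytes written."""
    raw = io.BytesIO()
    fake_stdout = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    with contextlib.redirect_stdout(fake_stdout):
        handler(*args)
        sys.stdout.flush()
    return raw.getvalue()

def blob_plain(blob: bytes) -> None:
    """Hypothetical handler: header through the text layer, body as raw bytes."""
    sys.stdout.write("Content-Type: application/octet-stream\r\n\r\n")
    sys.stdout.flush()               # keep the header ahead of the raw body
    sys.stdout.buffer.write(blob)    # bypass the encoding layer for the blob

payload = bytes([0x00, 0x80, 0xFF])  # made-up binary data
cached = capture(blob_plain, payload)
assert cached.endswith(payload)
```

Serving a cached entry would then be a plain `sys.stdout.buffer.write(cached)`, with no encoding layer in the path, and the header travels with the body in the same cached bytes.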


BTW. f5aa79d (gitweb: safely output binary files for 'blob_plain' action)
was my third patch for git...

-- 
Jakub Narebski
Poland
