* [PATCH 0/2] gitweb: remove invalid http-equiv="content-type"
@ 2022-03-07 3:37 Jason Yundt
2022-03-07 3:37 ` [PATCH 1/2] comment: fix typo Jason Yundt
` (4 more replies)
0 siblings, 5 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-07 3:37 UTC (permalink / raw)
To: git; +Cc: Jeff King, Jason Yundt
See the second commit's message for more details.
Jason Yundt (2):
comment: fix typo
gitweb: remove invalid http-equiv="content-type"
gitweb/gitweb.perl | 4 +---
t/t9502-gitweb-standalone-parse-output.sh | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 4 deletions(-)
--
2.35.1
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] comment: fix typo
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
@ 2022-03-07 3:37 ` Jason Yundt
2022-03-07 3:37 ` [PATCH 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
` (3 subsequent siblings)
4 siblings, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-07 3:37 UTC (permalink / raw)
To: git; +Cc: Jeff King, Jason Yundt
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
t/t9502-gitweb-standalone-parse-output.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index 3167473b30..e7363511dd 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -34,7 +34,7 @@ EOF
#
# This will check that gitweb HTTP header contains proposed filename
# as <basename> with '.tar' suffix added, and that generated tarfile
-# (gitweb message body) has <prefix> as prefix for al files in tarfile
+# (gitweb message body) has <prefix> as prefix for all files in tarfile
#
# <prefix> default to <basename>
check_snapshot () {
--
2.35.1
* [PATCH 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-07 3:37 ` [PATCH 1/2] comment: fix typo Jason Yundt
@ 2022-03-07 3:37 ` Jason Yundt
2022-03-07 12:23 ` Ævar Arnfjörð Bjarmason
2022-03-07 23:24 ` brian m. carlson
2022-03-08 1:07 ` [PATCH v2 0/2] " Jason Yundt
` (2 subsequent siblings)
4 siblings, 2 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-07 3:37 UTC (permalink / raw)
To: git; +Cc: Jeff King, Jason Yundt
Before this change, gitweb would generate pages which included:
<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
A meta element with http-equiv="content-type" is said to be in the
"Encoding declaration state". According to the HTML Standard,
The Encoding declaration state may be used in HTML documents,
but elements with an http-equiv attribute in that state must not
be used in XML documents.
Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
This change removes that meta element since gitweb always generates XML
documents.
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
gitweb/gitweb.perl | 4 +---
t/t9502-gitweb-standalone-parse-output.sh | 13 +++++++++++++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index fbd1c20a23..606b50104c 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4213,8 +4213,7 @@ sub git_header_html {
my %opts = @_;
my $title = get_page_title();
- my $content_type = get_content_type_html();
- print $cgi->header(-type=>$content_type, -charset => 'utf-8',
+ print $cgi->header(-type=>get_content_type_html(), -charset => 'utf-8',
-status=> $status, -expires => $expires)
unless ($opts{'-no_http_header'});
my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
@@ -4225,7 +4224,6 @@ sub git_header_html {
<!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
<!-- git core binaries version $git_version -->
<head>
-<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
<meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
<meta name="robots" content="index, nofollow"/>
<title>$title</title>
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index e7363511dd..25165edacc 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -207,4 +207,17 @@ test_expect_success 'xss checks' '
xss "" "$TAG+"
'
+no_http_equiv_content_type() {
+ gitweb_run "$@" &&
+ ! grep -Ei "http-equiv=['\"]?content-type" gitweb.body
+}
+
+# See: <https://html.spec.whatwg.org/dev/semantics.html#attr-meta-http-equiv-content-type>
+test_expect_success 'no http-equiv="content-type" in XHTML' '
+ no_http_equiv_content_type &&
+ no_http_equiv_content_type "p=.git" &&
+ no_http_equiv_content_type "p=.git;a=log" &&
+ no_http_equiv_content_type "p=.git;a=tree"
+'
+
test_done
--
2.35.1
* Re: [PATCH 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 3:37 ` [PATCH 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
@ 2022-03-07 12:23 ` Ævar Arnfjörð Bjarmason
2022-03-07 22:49 ` Jason Yundt
2022-03-07 23:24 ` brian m. carlson
1 sibling, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-07 12:23 UTC (permalink / raw)
To: Jason Yundt; +Cc: git, Jeff King
On Sun, Mar 06 2022, Jason Yundt wrote:
> Before this change, gitweb would generate pages which included:
>
> <meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
>
> A meta element with http-equiv="content-type" is said to be in the
> "Encoding declaration state". According to the HTML Standard,
>
> The Encoding declaration state may be used in HTML documents,
> but elements with an http-equiv attribute in that state must not
> be used in XML documents.
>
> Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
>
> This change removes that meta element since gitweb always generates XML
> documents.
>
> Signed-off-by: Jason Yundt <jason@jasonyundt.email>
> ---
> gitweb/gitweb.perl | 4 +---
> t/t9502-gitweb-standalone-parse-output.sh | 13 +++++++++++++
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index fbd1c20a23..606b50104c 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -4213,8 +4213,7 @@ sub git_header_html {
> my %opts = @_;
>
> my $title = get_page_title();
> - my $content_type = get_content_type_html();
> - print $cgi->header(-type=>$content_type, -charset => 'utf-8',
> + print $cgi->header(-type=>get_content_type_html(), -charset => 'utf-8',
I think it would be better to just skip this hunk, no behavior will
change if it's left in.
> -status=> $status, -expires => $expires)
> unless ($opts{'-no_http_header'});
> my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
> @@ -4225,7 +4224,6 @@ sub git_header_html {
> <!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
> <!-- git core binaries version $git_version -->
> <head>
> -<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
..with this being the only behavior change (yeah the variable will now
be used only in one place, but that's fine)
I'm not sure I understand this change really. The result is always XML,
so application/xhtml+xml is redundant, text/html, or both?
But aside from that: I have seen browsers get the lack of encoding=""
"wrong" with data at rest, don't some still default to ISO-8859-1?
So won't this result in badly decoded data if you save the web page &
view it locally?
> <meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
> <meta name="robots" content="index, nofollow"/>
> <title>$title</title>
> diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
> index e7363511dd..25165edacc 100755
> --- a/t/t9502-gitweb-standalone-parse-output.sh
> +++ b/t/t9502-gitweb-standalone-parse-output.sh
> @@ -207,4 +207,17 @@ test_expect_success 'xss checks' '
> xss "" "$TAG+"
> '
>
> +no_http_equiv_content_type() {
> + gitweb_run "$@" &&
> + ! grep -Ei "http-equiv=['\"]?content-type" gitweb.body
Nit: Should we skip the "-i" here since we're testing our own output,
and not http standards in general (i.e. we don't have to worry about the
case of http-equiv?)
* Re: [PATCH 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 12:23 ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 22:49 ` Jason Yundt
0 siblings, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-07 22:49 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King
On Monday, March 7, 2022 7:23:49 AM EST Ævar Arnfjörð Bjarmason wrote:
> I'm not sure I understand this change really. The result is always XML,
> so application/xhtml+xml is redundant, text/html, or both?
To be honest, using an http-equiv="content-type" in XHTML is confusing. When
you do use one, your goal shouldn’t really be to specify the document’s MIME
type. After all, the first three lines of each page say
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
Those lines are more than enough to determine that something is using XHTML
and UTF-8. Instead, the idea is to help out a parser that is incorrectly
parsing the document as HTML (instead of as XHTML). Historical W3C documents
(that were applicable when http-equiv="content-type" was allowed in XHTML) [1]
[2][3] indicate that http-equiv="content-type" should be used like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
In other words, to use http-equiv="content-type" properly in XHTML, you had to
lie about the document’s type. The fact that this is confusing is probably
part of why WHATWG disallowed it in the HTML Standard.
> But aside from that: I have seen browsers get the lack of encoding=""
> "wrong" with data at rest, don't some still default to ISO-8859-1?
>
> So won't this result in badly decoded data if you save the web page &
> view it locally?
I tested this idea in ungoogled-chromium, Firefox and Pale Moon. Other than
Pale Moon in one specific circumstance, they all used UTF-8 as the encoding.
Pale Moon used windows-1252, but only when the file ended with .html. When the
file ended with .xhtml, Pale Moon used UTF-8. That being said, we don’t have to
use an http-equiv="content-type" to fix the problem. Instead, we can use a
<meta charset="utf-8"> which is allowed by the HTML Standard [4].
[1]: <https://www.w3.org/TR/xhtml1/#C_9>
[2]: <https://www.w3.org/TR/html-polyglot/#character-encoding>
[3]: <https://www.w3.org/Bugs/Public/show_bug.cgi?id=21818>
[4]: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-charset>
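As a rough illustration of the data-at-rest point, a saved page could be
checked for the XML declaration with a small shell helper. This is only a
sketch: has_utf8_xml_declaration is a hypothetical name, not something in
gitweb or its test suite, and it assumes the declaration sits on the first
line of the file, as in the three lines quoted above.

```shell
# Hypothetical helper (not part of gitweb or its tests): succeed when the
# first line of the file in $1 is an XML declaration naming utf-8.
has_utf8_xml_declaration () {
	head -n 1 "$1" | grep -q '^<?xml [^>]*encoding="utf-8"'
}
```

A file that starts with the declaration passes, while one that starts
directly with <html> does not, which is the case where a browser has to
guess the encoding of a locally saved copy.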
* Re: [PATCH 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 3:37 ` [PATCH 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-07 12:23 ` Ævar Arnfjörð Bjarmason
@ 2022-03-07 23:24 ` brian m. carlson
1 sibling, 0 replies; 17+ messages in thread
From: brian m. carlson @ 2022-03-07 23:24 UTC (permalink / raw)
To: Jason Yundt; +Cc: git, Jeff King
On 2022-03-07 at 03:37:23, Jason Yundt wrote:
> Before this change, gitweb would generate pages which included:
>
> <meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
>
> A meta element with http-equiv="content-type" is said to be in the
> "Encoding declaration state". According to the HTML Standard,
>
> The Encoding declaration state may be used in HTML documents,
> but elements with an http-equiv attribute in that state must not
> be used in XML documents.
>
> Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
>
> This change removes that meta element since gitweb always generates XML
> documents.
This change seems fine. We do specify this in the HTTP header,
including the character set, which is what matters, so this should work
in every browser, and the http-equiv is unneeded.
I also don't think we need a meta header here, since we have an XML
declaration, and that's controlling in this situation. This isn't
regular HTML and we don't declare it as such, so using a meta header to
control this isn't correct: the XML declaration should be used instead
in the event a user downloads this to a local disk and processes it
outside the context of an HTTP request.
Since we control the HTTP headers, I'd actually argue that your test
might well reject all http-equiv headers since they could be done much
better with actual HTTP headers (and would therefore work with
non-browser clients), but I don't think that's worth a reroll, nor do I
think a test is even needed here (but bonus points for adding one).
So I think this looks good as is. Thanks for the patch.
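For reference, a stricter check along those lines might look like the
sketch below. It is an assumption-laden illustration rather than a
proposed test: no_http_equiv is a hypothetical helper name, it takes a
file path instead of calling gitweb_run, and whether every gitweb page is
free of other http-equiv elements has not been verified here.

```shell
# Hypothetical helper (not part of the patch): succeed only when the file
# in $1 contains no http-equiv attribute of any kind.
no_http_equiv () {
	! grep -q "http-equiv=" "$1"
}
```

In the test script this would presumably be pointed at gitweb.body after
a gitweb_run call, in the same way the patch's helper is.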
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
* [PATCH v2 0/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-07 3:37 ` [PATCH 1/2] comment: fix typo Jason Yundt
2022-03-07 3:37 ` [PATCH 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
@ 2022-03-08 1:07 ` Jason Yundt
2022-03-08 2:13 ` Junio C Hamano
2022-03-08 15:56 ` [PATCH v3 " Jason Yundt
2022-03-08 1:07 ` [PATCH v2 1/2] comment: fix typo Jason Yundt
2022-03-08 1:07 ` [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
4 siblings, 2 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 1:07 UTC (permalink / raw)
To: git; +Cc: Jason Yundt, Jeff King, Ævar Arnfjörð Bjarmason
See the second commit's message for more details. Compared to the first
version of this patch, this one
- keeps an extra variable,
- replaces the http-equiv="content-type" tag with a charset= one, and
- removes the -i flag from grep.
Jason Yundt (2):
comment: fix typo
gitweb: remove invalid http-equiv="content-type"
gitweb/gitweb.perl | 2 +-
t/t9502-gitweb-standalone-parse-output.sh | 18 +++++++++++++++++-
2 files changed, 18 insertions(+), 2 deletions(-)
--
2.35.1
* [PATCH v2 1/2] comment: fix typo
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
` (2 preceding siblings ...)
2022-03-08 1:07 ` [PATCH v2 0/2] " Jason Yundt
@ 2022-03-08 1:07 ` Jason Yundt
2022-03-08 1:07 ` [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
4 siblings, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 1:07 UTC (permalink / raw)
To: git; +Cc: Jason Yundt, Jeff King, Ævar Arnfjörð Bjarmason
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
t/t9502-gitweb-standalone-parse-output.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index 3167473b30..e7363511dd 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -34,7 +34,7 @@ EOF
#
# This will check that gitweb HTTP header contains proposed filename
# as <basename> with '.tar' suffix added, and that generated tarfile
-# (gitweb message body) has <prefix> as prefix for al files in tarfile
+# (gitweb message body) has <prefix> as prefix for all files in tarfile
#
# <prefix> default to <basename>
check_snapshot () {
--
2.35.1
* [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
` (3 preceding siblings ...)
2022-03-08 1:07 ` [PATCH v2 1/2] comment: fix typo Jason Yundt
@ 2022-03-08 1:07 ` Jason Yundt
2022-03-08 1:50 ` brian m. carlson
4 siblings, 1 reply; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 1:07 UTC (permalink / raw)
To: git; +Cc: Jason Yundt, Jeff King, Ævar Arnfjörð Bjarmason
Before this change, gitweb would generate pages which included:
<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
A meta element with http-equiv="content-type" is said to be in the
"Encoding declaration state". According to the HTML Standard,
The Encoding declaration state may be used in HTML documents,
but elements with an http-equiv attribute in that state must not
be used in XML documents.
Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
Gitweb always generates XML documents, so its use of
http-equiv="content-type" was invalid. This change replaces that tag
with
<meta charset="utf-8"/>
which is equivalent [1] and allowed in XML documents [2].
[1]: <https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta#attr-http-equiv>
[2]: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-charset>
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
gitweb/gitweb.perl | 2 +-
t/t9502-gitweb-standalone-parse-output.sh | 16 ++++++++++++++++
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index fbd1c20a23..59457c1004 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4225,7 +4225,7 @@ sub git_header_html {
<!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
<!-- git core binaries version $git_version -->
<head>
-<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta charset="utf-8"/>
<meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
<meta name="robots" content="index, nofollow"/>
<title>$title</title>
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index e7363511dd..0b06e2d6b0 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -207,4 +207,20 @@ test_expect_success 'xss checks' '
xss "" "$TAG+"
'
+check_encoding_meta_element() {
+ gitweb_run "$@" &&
+ ! grep -E "http-equiv=['\"]?content-type" gitweb.body &&
+ grep -F '<meta charset="utf-8"/>' gitweb.body
+}
+
+# One of those can be used in XHTML, the other one can't. See:
+# <https://html.spec.whatwg.org/dev/semantics.html#attr-meta-charset>
+# <https://html.spec.whatwg.org/dev/semantics.html#attr-meta-http-equiv-content-type>
+test_expect_success 'no http-equiv="content-type", yes charset="utf-8"' '
+ check_encoding_meta_element &&
+ check_encoding_meta_element "p=.git" &&
+ check_encoding_meta_element "p=.git;a=log" &&
+ check_encoding_meta_element "p=.git;a=tree"
+'
+
test_done
--
2.35.1
* Re: [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 1:07 ` [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
@ 2022-03-08 1:50 ` brian m. carlson
2022-03-08 12:44 ` Ævar Arnfjörð Bjarmason
0 siblings, 1 reply; 17+ messages in thread
From: brian m. carlson @ 2022-03-08 1:50 UTC (permalink / raw)
To: Jason Yundt; +Cc: git, Jeff King, Ævar Arnfjörð Bjarmason
On 2022-03-08 at 01:07:11, Jason Yundt wrote:
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index fbd1c20a23..59457c1004 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -4225,7 +4225,7 @@ sub git_header_html {
> <!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
> <!-- git core binaries version $git_version -->
> <head>
> -<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
> +<meta charset="utf-8"/>
I don't actually think this is an improvement. I don't think it's
necessary, considering we have an XML declaration and the HTTP header,
both of which already say it's UTF-8 and will take precedence over this.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
* Re: [PATCH v2 0/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 1:07 ` [PATCH v2 0/2] " Jason Yundt
@ 2022-03-08 2:13 ` Junio C Hamano
2022-03-08 12:26 ` Jason Yundt
2022-03-08 15:56 ` [PATCH v3 " Jason Yundt
1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2022-03-08 2:13 UTC (permalink / raw)
To: Jason Yundt; +Cc: git, Jeff King, Ævar Arnfjörð Bjarmason
Jason Yundt <jason@jasonyundt.email> writes:
> - keeps an extra variable,
I am not sure if this is an improvement. The original had two
places that used $content_type, but after getting rid of one, there
is only one place that needed the value, which can be used in place;
and it was quite clear that was what was going on in the previous
iteration.
About the <meta> thing, it seems that brian already commented on it.
Thanks.
* Re: [PATCH v2 0/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 2:13 ` Junio C Hamano
@ 2022-03-08 12:26 ` Jason Yundt
0 siblings, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 12:26 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, Jeff King, Ævar Arnfjörð Bjarmason, brian m. carlson
On Monday, March 7, 2022 9:13:52 PM EST Junio C Hamano wrote:
> About the <meta> thing, it seems that brian already commented on it.
Thanks for mentioning that. I now see Brian’s comments on the archive. My mail
server was blocking him (via zen.spamhaus.org), but I’ve added his server to
the allowlist.
* Re: [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 1:50 ` brian m. carlson
@ 2022-03-08 12:44 ` Ævar Arnfjörð Bjarmason
2022-03-08 14:54 ` Jason Yundt
0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-03-08 12:44 UTC (permalink / raw)
To: brian m. carlson; +Cc: Jason Yundt, git, Jeff King
On Tue, Mar 08 2022, brian m. carlson wrote:
> On 2022-03-08 at 01:07:11, Jason Yundt wrote:
>> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
>> index fbd1c20a23..59457c1004 100755
>> --- a/gitweb/gitweb.perl
>> +++ b/gitweb/gitweb.perl
>> @@ -4225,7 +4225,7 @@ sub git_header_html {
>> <!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
>> <!-- git core binaries version $git_version -->
>> <head>
>> -<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
>> +<meta charset="utf-8"/>
>
> I don't actually think this is an improvement. I don't think it's
> necessary, considering we have an XML declaration and the HTTP header,
> both of which already say it's UTF-8 and will take precedence over this.
Agreed. I was a bit surprised per Jason's
https://lore.kernel.org/git/109813056.nniJfEyVGO@jason-desktop-linux/
that the removal wasn't kept.
I.e. he was replying to a question of mine asking whether we didn't need
this data at rest, e.g. if you save the page. I didn't notice the "<?xml
version..." we emit, which seems to be enough.
I.e. this seems to have always been redundant going back to c994d620cc8
(v220, 2005-08-07), or rather, the character set part of it.
Maybe I still don't understand this, but the commit message seems to me
to be conflating whether we send the *right* http-equiv with whether we
send it at all, i.e. if the problem is that XML documents shouldn't be
text/html isn't this correct?:
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index fbd1c20a232..c1c5af0b197 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4049,7 +4049,13 @@ sub get_page_title {
return $title;
}
+sub get_content_type_xml {
+ return 'application/xhtml+xml';
+}
+
sub get_content_type_html {
+ my ($want_xml) = @_;
+
# require explicit support from the UA if we are to send the page as
# 'application/xhtml+xml', otherwise send it as plain old 'text/html'.
# we have to do this because MSIE sometimes globs '*/*', pretending to
@@ -4057,7 +4063,7 @@ sub get_content_type_html {
if (defined $cgi->http('HTTP_ACCEPT') &&
$cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
$cgi->Accept('application/xhtml+xml') != 0) {
- return 'application/xhtml+xml';
+ return get_content_type_html();
} else {
return 'text/html';
}
@@ -4214,6 +4220,7 @@ sub git_header_html {
my $title = get_page_title();
my $content_type = get_content_type_html();
+ my $content_type_xml = get_content_type_html();
print $cgi->header(-type=>$content_type, -charset => 'utf-8',
-status=> $status, -expires => $expires)
unless ($opts{'-no_http_header'});
@@ -4225,7 +4232,7 @@ sub git_header_html {
<!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
<!-- git core binaries version $git_version -->
<head>
-<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
+<meta http-equiv="content-type" content="$content_type_xml; charset=utf-8"/>
<meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
<meta name="robots" content="index, nofollow"/>
<title>$title</title>
Of course we might then *also* decide that <meta http-equiv> in this
case isn't needed at all, but isn't that a separate change?
And won't conforming browsers treat application/xhtml+xml differently
when the page is saved? A long time ago (I did some web development)
using it would enable pedantic strictness in browsers, i.e. unclosed
tags etc. would be a hard error, but I can't reproduce that locally in
either Firefox or Chrome now (with just the gitweb output as-is with
that http-equiv tweaked).
So maybe it does nothing, or maybe it's just those browsers...
* Re: [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 12:44 ` Ævar Arnfjörð Bjarmason
@ 2022-03-08 14:54 ` Jason Yundt
0 siblings, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 14:54 UTC (permalink / raw)
To: brian m. carlson, Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King
On Tuesday, March 8, 2022 7:44:35 AM EST Ævar Arnfjörð Bjarmason wrote:
> Maybe I still don't understand this, but the commit message seems to me
> to be conflating whether we send the *right* http-equiv with whether we
> send it at all,
The intent behind the commit message is to say that <meta
http-equiv="content-type" …> is never correct in XHTML.
> i.e. if the problem is that XML documents shouldn't be
> text/html isn't this correct?:
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index fbd1c20a232..c1c5af0b197 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -4049,7 +4049,13 @@ sub get_page_title {
> return $title;
> }
>
> +sub get_content_type_xml {
> + return 'application/xhtml+xml';
> +}
> +
> sub get_content_type_html {
> + my ($want_xml) = @_;
> +
> # require explicit support from the UA if we are to send the page as
> # 'application/xhtml+xml', otherwise send it as plain old 'text/html'.
> # we have to do this because MSIE sometimes globs '*/*', pretending to
> @@ -4057,7 +4063,7 @@ sub get_content_type_html {
> if (defined $cgi->http('HTTP_ACCEPT') &&
> $cgi->http('HTTP_ACCEPT') =~ m/(,|;|\s|^)application\/xhtml\+xml(,|;|\s|$)/ &&
> $cgi->Accept('application/xhtml+xml') != 0) {
> - return 'application/xhtml+xml';
> + return get_content_type_html();
I’m guessing that you meant to call get_content_type_xml() here.
> } else {
> return 'text/html';
> }
> @@ -4214,6 +4220,7 @@ sub git_header_html {
>
> my $title = get_page_title();
> my $content_type = get_content_type_html();
> + my $content_type_xml = get_content_type_html();
I’m also guessing that you meant to call get_content_type_xml() here.
> print $cgi->header(-type=>$content_type, -charset => 'utf-8',
> -status=> $status, -expires => $expires)
> unless ($opts{'-no_http_header'});
> @@ -4225,7 +4232,7 @@ sub git_header_html {
> <!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
> <!-- git core binaries version $git_version -->
> <head>
> -<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
> +<meta http-equiv="content-type" content="$content_type_xml; charset=utf-8"/>
> <meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
> <meta name="robots" content="index, nofollow"/>
> <title>$title</title>
With those assumptions in mind, I don’t think that your code is correct if
the problem is that XML documents shouldn't be text/html. Here’s why:
1. XML documents shouldn’t contain http-equiv="content-type" [1].
2. When a meta’s http-equiv attribute equals content-type, then its content
attribute should equal “the literal string "text/html;", optionally
followed by any number of ASCII whitespace, followed by the literal
string "charset=utf-8".” [1]
[1]: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
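To make point 2 concrete, the rule for the content attribute can be
approximated with a one-line shell check. This is a sketch under stated
assumptions, not text from the standard: is_valid_encoding_declaration is
a hypothetical name, the match is treated as case-insensitive, and
[[:space:]] is used as a stand-in for ASCII whitespace.

```shell
# Hypothetical helper: succeed when the string in $1 has the shape
# "text/html;" + optional whitespace + "charset=utf-8" and nothing else.
is_valid_encoding_declaration () {
	printf '%s\n' "$1" | grep -Eiq '^text/html;[[:space:]]*charset=utf-8$'
}
```

Under this check "text/html; charset=utf-8" is accepted, while the value
"application/xhtml+xml; charset=utf-8" that the suggested diff would emit
is rejected.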
* [PATCH v3 0/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 1:07 ` [PATCH v2 0/2] " Jason Yundt
2022-03-08 2:13 ` Junio C Hamano
@ 2022-03-08 15:56 ` Jason Yundt
2022-03-08 15:56 ` [PATCH v3 1/2] comment: fix typo Jason Yundt
2022-03-08 15:56 ` [PATCH v3 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
1 sibling, 2 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 15:56 UTC (permalink / raw)
To: git
Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
Junio C Hamano, Jeff King, Jason Yundt
See the second commit's message for more details. Compared to the second
version of this patch, this one
- removes the extra variable again,
- doesn't include a <meta charset="utf-8"/> and
- corrects a technical error in the second commit’s message.
Jason Yundt (2):
comment: fix typo
gitweb: remove invalid http-equiv="content-type"
gitweb/gitweb.perl | 4 +---
t/t9502-gitweb-standalone-parse-output.sh | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 4 deletions(-)
--
2.35.1
* [PATCH v3 1/2] comment: fix typo
2022-03-08 15:56 ` [PATCH v3 " Jason Yundt
@ 2022-03-08 15:56 ` Jason Yundt
2022-03-08 15:56 ` [PATCH v3 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
1 sibling, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 15:56 UTC (permalink / raw)
To: git
Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
Junio C Hamano, Jeff King, Jason Yundt
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
t/t9502-gitweb-standalone-parse-output.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index 3167473b30..e7363511dd 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -34,7 +34,7 @@ EOF
#
# This will check that gitweb HTTP header contains proposed filename
# as <basename> with '.tar' suffix added, and that generated tarfile
-# (gitweb message body) has <prefix> as prefix for al files in tarfile
+# (gitweb message body) has <prefix> as prefix for all files in tarfile
#
# <prefix> default to <basename>
check_snapshot () {
--
2.35.1
* [PATCH v3 2/2] gitweb: remove invalid http-equiv="content-type"
2022-03-08 15:56 ` [PATCH v3 " Jason Yundt
2022-03-08 15:56 ` [PATCH v3 1/2] comment: fix typo Jason Yundt
@ 2022-03-08 15:56 ` Jason Yundt
1 sibling, 0 replies; 17+ messages in thread
From: Jason Yundt @ 2022-03-08 15:56 UTC (permalink / raw)
To: git
Cc: Ævar Arnfjörð Bjarmason, brian m. carlson,
Junio C Hamano, Jeff King, Jason Yundt
Before this change, gitweb would generate pages which included:
<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>
When a meta's http-equiv equals "content-type", the http-equiv is said
to be in the "Encoding declaration state". According to the HTML
Standard,
The Encoding declaration state may be used in HTML documents,
but elements with an http-equiv attribute in that state must not
be used in XML documents.
Source: <https://html.spec.whatwg.org/multipage/semantics.html#attr-meta-http-equiv-content-type>
This change removes that meta element since gitweb always generates XML
documents.
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
---
gitweb/gitweb.perl | 4 +---
t/t9502-gitweb-standalone-parse-output.sh | 13 +++++++++++++
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index fbd1c20a23..606b50104c 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -4213,8 +4213,7 @@ sub git_header_html {
my %opts = @_;
my $title = get_page_title();
- my $content_type = get_content_type_html();
- print $cgi->header(-type=>$content_type, -charset => 'utf-8',
+ print $cgi->header(-type=>get_content_type_html(), -charset => 'utf-8',
-status=> $status, -expires => $expires)
unless ($opts{'-no_http_header'});
my $mod_perl_version = $ENV{'MOD_PERL'} ? " $ENV{'MOD_PERL'}" : '';
@@ -4225,7 +4224,6 @@ sub git_header_html {
<!-- git web interface version $version, (C) 2005-2006, Kay Sievers <kay.sievers\@vrfy.org>, Christian Gierke -->
<!-- git core binaries version $git_version -->
<head>
-<meta http-equiv="content-type" content="$content_type; charset=utf-8"/>
<meta name="generator" content="gitweb/$version git/$git_version$mod_perl_version"/>
<meta name="robots" content="index, nofollow"/>
<title>$title</title>
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index e7363511dd..8cb582f0e6 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -207,4 +207,17 @@ test_expect_success 'xss checks' '
xss "" "$TAG+"
'
+no_http_equiv_content_type() {
+ gitweb_run "$@" &&
+ ! grep -E "http-equiv=['\"]?content-type" gitweb.body
+}
+
+# See: <https://html.spec.whatwg.org/dev/semantics.html#attr-meta-http-equiv-content-type>
+test_expect_success 'no http-equiv="content-type" in XHTML' '
+ no_http_equiv_content_type &&
+ no_http_equiv_content_type "p=.git" &&
+ no_http_equiv_content_type "p=.git;a=log" &&
+ no_http_equiv_content_type "p=.git;a=tree"
+'
+
test_done
--
2.35.1
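
The check added to t9502 above can be tried outside the git test suite. The following is only a minimal sketch: it reuses the grep pattern from the patch, but the helper name has_http_equiv_content_type and the two sample strings are assumptions made for illustration, and it reads standard input rather than the gitweb.body file that gitweb_run produces.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): same pattern as the new
# test, but reading stdin instead of gitweb.body.
has_http_equiv_content_type () {
	grep -E -q "http-equiv=['\"]?content-type"
}

# Sample markup, assumed for illustration. The first line is the element
# the patch removes; the second is an unrelated meta element that stays.
before='<meta http-equiv="content-type" content="application/xhtml+xml; charset=utf-8"/>'
after='<meta name="robots" content="index, nofollow"/>'

for sample in "$before" "$after"
do
	if printf '%s\n' "$sample" | has_http_equiv_content_type
	then
		echo "declaration present: $sample"
	else
		echo "declaration absent:  $sample"
	fi
done
```

The pattern accepts a double-quoted, single-quoted or unquoted attribute value. It is case-sensitive, so it would not catch a spelling such as "Content-Type"; that is enough here because gitweb only ever emitted the lowercase form.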
Thread overview: 17+ messages
2022-03-07 3:37 [PATCH 0/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-07 3:37 ` [PATCH 1/2] comment: fix typo Jason Yundt
2022-03-07 3:37 ` [PATCH 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-07 12:23 ` Ævar Arnfjörð Bjarmason
2022-03-07 22:49 ` Jason Yundt
2022-03-07 23:24 ` brian m. carlson
2022-03-08 1:07 ` [PATCH v2 0/2] " Jason Yundt
2022-03-08 2:13 ` Junio C Hamano
2022-03-08 12:26 ` Jason Yundt
2022-03-08 15:56 ` [PATCH v3 " Jason Yundt
2022-03-08 15:56 ` [PATCH v3 1/2] comment: fix typo Jason Yundt
2022-03-08 15:56 ` [PATCH v3 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-08 1:07 ` [PATCH v2 1/2] comment: fix typo Jason Yundt
2022-03-08 1:07 ` [PATCH v2 2/2] gitweb: remove invalid http-equiv="content-type" Jason Yundt
2022-03-08 1:50 ` brian m. carlson
2022-03-08 12:44 ` Ævar Arnfjörð Bjarmason
2022-03-08 14:54 ` Jason Yundt