From: Junio C Hamano <gitster@pobox.com>
To: "Jakub Narębski" <jnareb@gmail.com>
Cc: Michael Wagner <accounts@mwagner.org>,
	Peter Krefting <peter@softwolves.pp.se>,
	git <git@vger.kernel.org>
Subject: Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names
Date: Thu, 15 May 2014 18:26:35 -0700
Message-ID: <xmqqmweibjjo.fsf@gitster.dls.corp.google.com>
In-Reply-To: <CANQwDwffdbqD96OadyECFs=6WY_t+_0b63L5yAZVQ8aXrMvHHA@mail.gmail.com> ("Jakub Narębski"'s message of "Thu, 15 May 2014 22:45:22 +0200")

Jakub Narębski <jnareb@gmail.com> writes:

> On Thu, May 15, 2014 at 9:38 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Jakub Narębski <jnareb@gmail.com> writes:
>>
>>> Writing test for this would not be easy, and require some HTML
>>> parser (WWW::Mechanize, Web::Scraper, HTML::Query, pQuery,
>>> ... or low level HTML::TreeBuilder, or other low level parser).
>>
>> Hmph.  Is it more than just looking for a specific run of %xx we
>> would expect to see in the output of the tree view for a repository
>> in which there is one tree with non-ASCII name?
>
> There is if we want to check (in non-fragile way) that said
> specific run is in 'href' *attribute* of 'a' element (link target).

Correct, but is "where does it appear" the question we are
primarily interested in, wrt this breakage and its fix?

If gitweb output has some volatile parts that do not depend on the
contents of the Git test repository (e.g. showing contents of
/etc/motd, date/time of when the test was run, or the full pathname
leading to the trash directory), then preparing a tree whose name is
äéìõû and making sure that the properly encoded version of äéìõû
appears anywhere in the output may not be sufficient to validate
that we got the encoding right, as that string may appear in the
parts that are totally unrelated to the contents being shown and not
under our control.  But is that really the case?
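
The simple check described above can be sketched roughly as follows.
This is only an illustration in Python, not the test suite's own
shell, and it assumes that gitweb percent-encodes the UTF-8 bytes of
the tree name.  The helper name encoded_name_appears and the sample
page fragment are hypothetical and are not taken from gitweb.

```python
from urllib.parse import quote


def encoded_name_appears(page: str, name: str) -> bool:
    """Return True if the percent-encoded UTF-8 form of `name`
    occurs anywhere in `page`.

    Hex digits in percent-encoding may be upper or lower case,
    so the comparison ignores case.
    """
    expected = quote(name, safe="")  # encodes the UTF-8 bytes as %XX
    return expected.lower() in page.lower()


# Hypothetical fragment of a tree view; real gitweb output may differ.
sample_page = (
    '<td class="list">'
    '<a href="/repo.git/tree/HEAD:/%C3%A4%C3%A9%C3%AC%C3%B5%C3%BB">dir</a>'
    "</td>"
)

found = encoded_name_appears(sample_page, "äéìõû")
```

Such a check is sufficient only if nothing else in the page can
contain the same run of %xx, which is the question raised above.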

Also, we might introduce a bug that misspells the attr name and
produces an anchor element with an hpef attribute holding the properly
encoded URL, and your "parse HTML properly" approach would catch it, but
is that the kind of breakage under discussion?  You hinted at new
tests for UTF-8 encoding in the other message in the thread earlier,
and I've been assuming that we were talking about the encoding test,
not a test to catch s/href/hpef/ kind of breakage.
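
To make the difference between the two kinds of test concrete, here
is a rough sketch in Python using only the standard library's
html.parser.  It assumes the encoded name ends up inside the href of
an 'a' element; the names HrefCollector and encoded_name_in_link and
both sample fragments are hypothetical, and real gitweb output may
look different.

```python
from html.parser import HTMLParser
from urllib.parse import quote


class HrefCollector(HTMLParser):
    """Collect the href values of all <a> elements in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for key, value in attrs:
            if key == "href" and value is not None:
                self.hrefs.append(value)


def encoded_name_in_link(page: str, name: str) -> bool:
    """True if the percent-encoded `name` occurs in some <a href>."""
    expected = quote(name, safe="")
    collector = HrefCollector()
    collector.feed(page)
    collector.close()
    return any(expected in href for href in collector.hrefs)


ENCODED = quote("äéìõû", safe="")
good_page = f'<a href="/tree/{ENCODED}">dir</a>'  # hypothetical output
bad_page = f'<a hpef="/tree/{ENCODED}">dir</a>'   # s/href/hpef/ breakage

# A plain substring search accepts both pages; only the parser-based
# check rejects the misspelled attribute.
substring_hits = (ENCODED in good_page, ENCODED in bad_page)
link_hits = (
    encoded_name_in_link(good_page, "äéìõû"),
    encoded_name_in_link(bad_page, "äéìõû"),
)
```

Both checks agree on whether the encoding is right; they differ only
on the s/href/hpef/ kind of breakage.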
