From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82378C00454 for ; Wed, 11 Dec 2019 20:00:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 32105222C4 for ; Wed, 11 Dec 2019 20:00:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=pobox.com header.i=@pobox.com header.b="m83StIeK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726686AbfLKUAN (ORCPT ); Wed, 11 Dec 2019 15:00:13 -0500 Received: from pb-smtp2.pobox.com ([64.147.108.71]:62860 "EHLO pb-smtp2.pobox.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726345AbfLKUAN (ORCPT ); Wed, 11 Dec 2019 15:00:13 -0500 Received: from pb-smtp2.pobox.com (unknown [127.0.0.1]) by pb-smtp2.pobox.com (Postfix) with ESMTP id 963A92D3C5; Wed, 11 Dec 2019 15:00:10 -0500 (EST) (envelope-from junio@pobox.com) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=zVn2E6BkXaV26BylPT0Tu3Ejyv4=; b=m83StI eKGPqquVkVfjTRw60Bb0Ibi3xwClW4Hb+ncFrH+Io2IXJ1oEvaE8nMlSJ+GyeSgt +R6qk6yJrvjR/v+BDnL8Mj767eduRFfRvL3Ly50aucsgGXmY2cxqbV8rWxQjoh19 EQsUWgEuyqRAK17f3VowAZUh8giPK5QuLdRcs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=AqSrda+Z7EmG8jMXpXjePil40PAeT6qC JrUIlisN2m30iVzgF1bYT/IPZLR8pOJxDlD/6xbFXCV7sLQqALW18IgxJdXmQH+n hoaTPwjPvp7fHtCIQVHgIs4M6D/5YmAoQ1AvXCILT7syTSZK1t+Yiy6EAInHNWUh csUSi6LIYcQ= Received: from pb-smtp2.nyi.icgroup.com (unknown [127.0.0.1]) by pb-smtp2.pobox.com (Postfix) with ESMTP id 3651F2D3C4; Wed, 11 Dec 2019 15:00:10 -0500 (EST) (envelope-from junio@pobox.com) Received: from pobox.com (unknown [34.76.80.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by pb-smtp2.pobox.com (Postfix) with ESMTPSA id ABDD72D3BD; Wed, 11 Dec 2019 15:00:06 -0500 (EST) (envelope-from junio@pobox.com) From: Junio C Hamano To: Derrick Stolee Cc: Derrick Stolee via GitGitGadget , git@vger.kernel.org, szeder.dev@gmail.com, newren@gmail.com, Derrick Stolee Subject: Re: [PATCH 1/1] sparse-checkout: respect core.ignoreCase in cone mode References: <23705845ce73992bf7ab645d28febebe0a698d49.1575920580.git.gitgitgadget@gmail.com> <9dbf6d43-ac1e-4790-84e5-4829d21f5fdb@gmail.com> Date: Wed, 11 Dec 2019 12:00:03 -0800 In-Reply-To: <9dbf6d43-ac1e-4790-84e5-4829d21f5fdb@gmail.com> (Derrick Stolee's message of "Wed, 11 Dec 2019 14:11:32 -0500") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: D3F545DE-1C50-11EA-8145-D1361DBA3BAF-77302942!pb-smtp2.pobox.com Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Derrick Stolee writes: > I'm trying to find a way around these two ideas: > > 1. The index is case-sensitive, and the sparse-checkout patterns are > case-sensitive. OK. The latter is local to the repository and not shared to the world where people with case sensitive systems would live, right? > 2. If a filesystem is case-insensitive, most index-change operations > already succeed with incorrect case, especially with core.ignoreCase > enabled. I am not sure about "incorrect", though. My understanding is that modern case-insensitive systems are not information-destroying [*1*]. A new file you create as "ReadMe.txt" on your filesystem would be seen in that mixed case spelling via readdir() or equivalent, so adding it to the index as-is would likely be in the "correct" case, no? If, after adding that path to the index, you did "rm ReadMe.txt" and created "README.TXT", surely we won't have both ReadMe.txt and README.TXT in the index with ignoreCase set, and keep using ReadMe.txt that matches what you added to the index. I am not sure which one is the "incorrect" case in such a scenario. > The approach I have is to allow a user to provide a case that does not > match the index, and then we store the pattern in the sparse-checkout > that matches the case in the index. Yes, I understood that from your proposed log message and cover letter. They were very clearly written. But imagine that your user writes ABC in the sparse pattern file, and there is no abc anything in the index in any case permutations. When you check out a branch that has Abc, shouldn't the pattern ABC affect the operation just like a pattern Abc would on a case insensitive system? Or are end users perfectly aware that the thing in that branch is spelled "Abc" and not "ABC" (the data in Git does---it comes from a tree object that is case sensitive)? If so, then the pattern ABC should not affect the subtree whose name is "Abc" even on a case insensitive system. I am not sure what the design of this part of the system expects out of the end user. Perhaps keeping the patterns case insensitive and tell the end users to spell them correctly is the right way to go, I guess, if it is just the filesystem that cannot represente the cases correctly at times and the users are perfectly capable of telling the right and wrong cases apart. But then I am not sure why correcting misspelled names by comparing them with the index entries is a good idea, either. > It sounds like you are preferring this second option, despite the > performance costs. It is possible to use a case-insensitive hashing > algorithm when in the cone mode, but I'm not sure about how to do > a similar concept for the normal mode. I have no strong preference, other than that I prefer to see things being done consistently. Don't we already use case folding hash function to ensure that we won't add duplicate to the index on case-insensitive system? I suspect that we would need to do the same, whether in cone mode or just a normal sparse-checkout mode consistently. Thanks. [Footnote] *1* ... unlike HFS+ where everything is forced to NKD and a bit of information---whether the original was in NKC or NKD---is discarded forever.