## Package: git / 1:2.11.0-3+deb9u4

Package Version Patches format
git 1:2.11.0-3+deb9u4 3.0 (quilt)

### Patch series

view the series file
Patch File delta Description
0001 pre rebase hook capture documentation in a here docum.diff | (download)

templates/hooks--pre-rebase.sample | 6 3 + 3 - 0 !
1 file changed, 3 insertions(+), 3 deletions(-)

 pre-rebase hook: capture documentation in a <<here document

Without this change, the sample hook does not pass a syntax check
(sh -n):

$sh -n hooks--pre-rebase.sample hooks--pre-rebase.sample: line 101: syntax error near unexpected token (' hooks--pre-rebase.sample: line 101:  merged into it again (either directly or indirectly).' Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Improved-by: Junio C Hamano <gitster@pobox.com>  Normalize generated asciidoc timestamps with SOURCE_D.diff | (download) Documentation/Makefile | 7 5 + 2 - 0 ! Documentation/technical/api-index.sh | 4 4 + 0 - 0 ! 2 files changed, 9 insertions(+), 2 deletions(-)  normalize generated asciidoc timestamps with source_date_epoch This is needed to pass the Debian build reproducibility test (https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal). Signed-off-by: Anders Kaseorg <andersk@mit.edu>  git gui Sort entries in optimized tclIndex.diff | (download) git-gui/Makefile | 2 1 + 1 - 0 ! 1 file changed, 1 insertion(+), 1 deletion(-)  git-gui: sort entries in optimized tclindex MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit auto_mkindex expands wildcards in directory order, which depends on the underlying filesystem. To improve build reproducibility, sort the list of *.tcl files in the Makefile. The unoptimized loading case was previously fixed in v2.11.0-rc0~31^2^2~14 “git-gui: sort entries in tclIndex”. Signed-off-by: Anders Kaseorg <andersk@mit.edu>  xdiff Do not enable XDL_FAST_HASH by default.diff | (download) Makefile | 1 0 + 1 - 0 ! config.mak.uname | 5 0 + 5 - 0 ! 2 files changed, 6 deletions(-)  xdiff: do not enable xdl_fast_hash by default MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Although XDL_FAST_HASH computes hashes slightly faster on some architectures, its collision characteristics are much worse, resulting  shell disallow repo names beginning with dash.patch | (download) shell.c | 2 1 + 1 - 0 ! 1 file changed, 1 insertion(+), 1 deletion(-)  [patch] shell: disallow repo names beginning with dash When a remote server uses git-shell, the client side will connect to it like: ssh server "git-upload-pack 'foo.git'" and we literally exec ("git-upload-pack", "foo.git"). In early versions of upload-pack and receive-pack, we took a repository argument and nothing else. But over time they learned to accept dashed options. If the user passes a repository name that starts with a dash, the results are confusing at best (we complain of a bogus option instead of a non-existent repository) and malicious at worst (the user can start an interactive pager via "--help"). We could pass "--" to the sub-process to make sure the user's argument is interpreted as a branch name. I.e.: git-upload-pack -- -foo.git But adding "--" automatically would make us inconsistent with a normal shell (i.e., when git-shell is not in use), where "-foo.git" would still be an error. For that case, the client would have to specify the "--", but they can't do so reliably, as existing versions of git-shell do not allow more than a single argument. The simplest thing is to simply disallow "-" at the start of the repo name argument. This hasn't worked either with or without git-shell since version 1.0.0, and nobody has complained. Note that this patch just applies to do_generic_cmd(), which runs upload-pack, receive-pack, and upload-archive. There are two other types of commands that git-shell runs: - do_cvs_cmd(), but this already restricts the argument to be the literal string "server" - admin-provided commands in the git-shell-commands directory. We'll pass along arbitrary arguments there, so these commands could have similar problems. But these commands might actually understand dashed arguments, so we cannot just block them here. It's up to the writer of the commands to make sure they are safe. With great power comes great responsibility. Reported-by: Timo Schmid <tschmid@ernw.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>  connect reject ssh hostname that begins with a dash.diff | (download) connect.c | 3 3 + 0 - 0 ! 1 file changed, 3 insertions(+)  connect: reject ssh hostname that begins with a dash When commands like "git fetch" talk with ssh://$rest_of_URL/, the
code splits $rest_of_URL into components like host, port, etc., and then spawns the underlying "ssh" program by formulating argv[] array that has: - the path to ssh command taken from GIT_SSH_COMMAND, etc. - dashed options like '-batch' (for Tortoise), '-p <port>' as needed. - ssh_host, which is supposed to be the hostname parsed out of$rest_of_URL.

- then the command to be run on the other side, e.g. git

If the ssh_host ends up getting '-<anything>', the argv[] that is
used to spawn the command becomes something like:

{ "ssh", "-p", "22", "-<anything>", "command", "to", "run", NULL }

which obviously is bogus, but depending on the actual value of
"<anything>", will make "ssh" parse and use it as an option.

Prevent this by forbidding ssh_host that begins with a "-".

Noticed-by: Joern Schneeweisz of Recurity Labs
Reported-by: Brian at GitLab
Signed-off-by: Junio C Hamano <gitster@pobox.com>


t/t5813-proto-disable-ssh.sh | 9 9 + 0 - 0 !
1 file changed, 9 insertions(+)

 t5813: add test for hostname starting with dash

Per the explanation in the previous patch, this should be
(and is) rejected.

Signed-off-by: Jeff King <peff@peff.net>


cache.h | 8 8 + 0 - 0 !
connect.c | 2 1 + 1 - 0 !
path.c | 5 5 + 0 - 0 !
3 files changed, 14 insertions(+), 1 deletion(-)

 connect: factor out "looks like command line option" check

be confused for command-line options. Let's factor out that
notion into a helper function, as we'll use it in more
places. And while it's simple now, it's not clear if some
systems might need more complex logic to handle all cases.

Signed-off-by: Jeff King <peff@peff.net>


connect.c | 5 5 + 0 - 0 !
t/t5532-fetch-proxy.sh | 5 5 + 0 - 0 !
2 files changed, 10 insertions(+)

 connect: reject dashed arguments for proxy commands

If you have a GIT_PROXY_COMMAND configured, we will run it
with the host/port on the command-line. If a URL contains a
mischievous host like "--foo", we don't know how the proxy
command may handle it. It's likely to break, but it may also
do something dangerous and unwanted (technically it could
even do something useful, but that seems unlikely).

We should err on the side of caution and reject this before
we even run the command.

The hostname check matches the one we do in a similar
circumstance for ssh. The port check is not present for ssh,
but there it's not necessary because the syntax is "-p
<port>", and there's no ambiguity on the parsing side.

It's not clear whether you can actually get a negative port
to the proxy here or not. Doing:

git fetch git://remote:-1234/repo.git

keeps the "-1234" as part of the hostname, with the default
port of 9418. But it's a good idea to keep this check close
to the point of running the command to make it clear that
there's no way to circumvent it (and at worst it serves as a
belt-and-suspenders check).

Signed-off-by: Jeff King <peff@peff.net>


connect.c | 3 3 + 0 - 0 !
t/t5810-proto-disable-local.sh | 23 23 + 0 - 0 !
t/t5813-proto-disable-ssh.sh | 14 14 + 0 - 0 !
3 files changed, 40 insertions(+)

 connect: reject paths that look like command line options

If we get a repo path like "-repo.git", we may try to invoke
"git-upload-pack -repo.git". This is going to fail, since
upload-pack will interpret it as a set of bogus options. But
let's reject this before we even run the sub-program, since
we would not want to allow any mischief with repo names that
actually are real command-line options.

You can still ask for such a path via git-daemon, but there's no
security problem there, because git-daemon enters the repo itself
and then passes "."  on the command line.

Signed-off-by: Jeff King <peff@peff.net>


git-cvsserver.perl | 47 22 + 25 - 0 !
1 file changed, 22 insertions(+), 25 deletions(-)

 cvsserver: move safe_pipe_capture() to the main package

As a preparation for replacing command with a call to this
function from outside GITCVS::updater package, move it to the main
package.

Signed-off-by: Junio C Hamano <gitster@pobox.com>



git-cvsserver.perl | 22 11 + 11 - 0 !
1 file changed, 11 insertions(+), 11 deletions(-)

 cvsserver: use safe_pipe_capture instead of backticks

This makes the script pass arguments that are derived from end-user
input in safer way when invoking subcommands.

Reported-by: joernchen <joernchen@phenoelit.de>
Signed-off-by: joernchen <joernchen@phenoelit.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>



git-cvsserver.perl | 8 4 + 4 - 0 !
1 file changed, 4 insertions(+), 4 deletions(-)

 cvsserver: use safe_pipe_capture for constant commands as well

This is not strictly necessary, but it is a good code hygiene.

Signed-off-by: Junio C Hamano <gitster@pobox.com>



Documentation/git-shell.txt | 16 16 + 0 - 0 !
shell.c | 14 0 + 14 - 0 !
t/t9400-git-cvsserver-server.sh | 48 48 + 0 - 0 !
3 files changed, 64 insertions(+), 14 deletions(-)

 shell: drop git-cvsserver support by default

The git-cvsserver script is old and largely unmaintained
these days. But git-shell allows untrusted users to run it
out of the box, significantly increasing its attack surface.

Let's drop it from git-shell's list of internal handlers so
that it cannot be run by default.  This is not backwards
compatible. But given the age and development activity on
CVS-related parts of Git, this is likely to impact very few
users, while helping many more (i.e., anybody who runs
git-shell and had no intention of supporting CVS).

There's no configuration mechanism in git-shell for us to
add a boolean and flip it to "off". But there is a mechanism
fairly trivial. Let's document it to give guidance to
anybody who really is still running cvsserver.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>



git-archimport.perl | 4 2 + 2 - 0 !
1 file changed, 2 insertions(+), 2 deletions(-)

 archimport: use safe_pipe_capture for user input

Refnames can contain shell metacharacters which need to be
passed verbatim to sub-processes. Using safe_pipe_capture
skips the shell entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>



git-cvsimport.perl | 1 1 + 0 - 0 !
1 file changed, 1 insertion(+)

 cvsimport: shell-quote variable used in backticks

We run git rev-parse though the shell, and quote its
argument only with single-quotes. This prevents most
metacharacters from being a problem, but misses the obvious
case when $name itself has single-quotes in it. We can fix this by applying the usual shell-quoting formula. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>  submodule config verify submodule names as paths.diff | (download) builtin/submodule--helper.c | 26 25 + 1 - 0 ! git-submodule.sh | 5 5 + 0 - 0 ! submodule-config.c | 31 31 + 0 - 0 ! submodule-config.h | 7 7 + 0 - 0 ! t/t7415-submodule-names.sh | 77 77 + 0 - 0 ! 5 files changed, 145 insertions(+), 1 deletion(-)  submodule-config: verify submodule names as paths commit 0383bbb9015898cbc79abd7b64316484d7713b44 upstream. Submodule "names" come from the untrusted .gitmodules file, but we blindly append them to$GIT_DIR/modules to create our
on-disk repo paths. This means you can do bad things by
putting "../" into the name (among other things).

Let's sanity-check these names to avoid building a path that
can be exploited. There are two main decisions:

1. What should the allowed syntax be?

It's tempting to reuse verify_path(), since submodule
names typically come from in-repo paths. But there are
two reasons not to:

a. It's technically more strict than what we need, as
we really care only about breaking out of the
$GIT_DIR/modules/ hierarchy. E.g., having a submodule named "foo/.git" isn't actually dangerous, and it's possible that somebody has manually given such a funny name. b. Since we'll eventually use this checking logic in fsck to prevent downstream repositories, it should be consistent across platforms. Because verify_path() relies on is_dir_sep(), it wouldn't block "foo\..\bar" on a non-Windows machine. 2. Where should we enforce it? These days most of the .gitmodules reads go through submodule-config.c, so I've put it there in the reading step. That should cover all of the C code. We also construct the name for "git submodule add" inside the git-submodule.sh script. This is probably not a big deal for security since the name is coming from the user anyway, but it would be polite to remind them if the name they pick is invalid (and we need to expose the name-checker to the shell anyway for our test scripts). This patch issues a warning when reading .gitmodules and just ignores the related config entry completely. This will generally end up producing a sensible error, as it works the same as a .gitmodules file which is missing a submodule entry (so "submodule update" will barf, but "git clone --recurse-submodules" will print an error but not abort the clone. There is one minor oddity, which is that we print the warning once per malformed config key (since that's how the config subsystem gives us the entries). So in the new test, for example, the user would see three warnings. That's OK, since the intent is that this case should never come up outside of malicious repositories (and then it might even benefit the user to see the message multiple times). Credit for finding this vulnerability and the proof of concept from which the test script was adapted goes to Etienne Stalmans. [jn: the original patch expects 'git clone' to succeed in the test because v2.13.0-rc0~10^2~3 (clone: teach --recurse-submodules to optionally take a pathspec, 2017-03-17) makes 'git clone' skip invalid submodules. Updated the test to pass in older Git versions where the submodule name check makes 'git clone' fail.] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  is_ntfs_dotgit use a size_t for traversing string.diff | (download) path.c | 2 1 + 1 - 0 ! 1 file changed, 1 insertion(+), 1 deletion(-)  is_ntfs_dotgit: use a size_t for traversing string commit 11a9f4d807a0d71dc6eff51bb87baf4ca2cccf1d upstream. We walk through the "name" string using an int, which can wrap to a negative value and cause us to read random memory before our array (e.g., by creating a tree with a name >2GB, since "int" is still 32 bits even on most 64-bit platforms). Worse, this is easy to trigger during the fsck_tree() check, which is supposed to be protecting us from malicious garbage. Note one bit of trickiness in the existing code: we sometimes assign -1 to "len" at the end of the loop, and then rely on the "len++" in the for-loop's increment to take it back to 0. This is still legal with a size_t, since assigning -1 will turn into SIZE_MAX, which then wraps around to 0 on increment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  is_hfs_dotgit match other .git files.diff | (download) utf8.c | 58 46 + 12 - 0 ! utf8.h | 5 5 + 0 - 0 ! 2 files changed, 51 insertions(+), 12 deletions(-)  is_hfs_dotgit: match other .git files commit 0fc333ba20b43a8afee5023e92cb3384ff4e59a6 upstream. Both verify_path() and fsck match ".git", ".GIT", and other variants specific to HFS+. Let's allow matching other special files like ".gitmodules", which we'll later use to enforce extra restrictions via verify_path() and fsck. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  is_ntfs_dotgit match other .git files.diff | (download) cache.h | 10 9 + 1 - 0 ! path.c | 84 84 + 0 - 0 ! 2 files changed, 93 insertions(+), 1 deletion(-)  is_ntfs_dotgit: match other .git files commit e7cb0b4455c85b53aeba40f88ffddcf6d4002498 upstream. When we started to catch NTFS short names that clash with .git, we only looked for GIT~1. This is sufficient because we only ever clone into an empty directory, so .git is guaranteed to be the first subdirectory or file in that directory. However, even with a fresh clone, .gitmodules is *not* necessarily the first file to be written that would want the NTFS short name GITMOD~1: a malicious repository can add .gitmodul0000 and friends, which sorts before .gitmodules and is therefore checked out *first*. For that reason, we have to test not only for ~1 short names, but for others, too. It's hard to just adapt the existing checks in is_ntfs_dotgit(): since Windows 2000 (i.e., in all Windows versions still supported by Git), NTFS short names are only generated in the <prefix>~<number> form up to  is_ hfs ntfs _dotgitmodules add tests.diff | (download) t/helper/test-path-utils.c | 20 20 + 0 - 0 ! t/t0060-path-utils.sh | 86 86 + 0 - 0 ! 2 files changed, 106 insertions(+)  is_{hfs,ntfs}_dotgitmodules: add tests commit dc2d9ba3187fcd0ca8eeab9aa9ddef70cf8627a6 upstream. This tests primarily for NTFS issues, but also adds one example of an HFS+ issue. Thanks go to Congyi Wu for coming up with the list of examples where NTFS would possibly equate the filename with .gitmodules. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  skip_prefix add case insensitive variant.diff | (download) git-compat-util.h | 17 17 + 0 - 0 ! 1 file changed, 17 insertions(+)  skip_prefix: add case-insensitive variant commit 41a80924aec0e94309786837b6f954a3b3f19b71 upstream. We have the convenient skip_prefix() helper, but if you want to do case-insensitive matching, you're stuck doing it by hand. We could add an extra parameter to the function to let callers ask for this, but the function is small and somewhat performance-critical. Let's just re-implement it for the case-insensitive version. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  verify_path drop clever fallthrough.diff | (download) read-cache.c | 8 4 + 4 - 0 ! 1 file changed, 4 insertions(+), 4 deletions(-)  verify_path: drop clever fallthrough commit e19e5e66d691bdeeeb5e0ed2ffcecdd7666b0d7b upstream. We check ".git" and ".." in the same switch statement, and fall through the cases to share the end-of-component check. While this saves us a line or two, it makes modifying the function much harder. Let's just write it out. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  verify_dotfile mention case insensitivity in comment.diff | (download) read-cache.c | 5 4 + 1 - 0 ! 1 file changed, 4 insertions(+), 1 deletion(-)  verify_dotfile: mention case-insensitivity in comment commit 641084b618ddbe099f0992161988c3e479ae848b upstream. We're more restrictive than we need to be in matching ".GIT" on case-sensitive filesystems; let's make a note that this is intentional. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  update index stat updated files earlier.diff | (download) builtin/update-index.c | 25 17 + 8 - 0 ! 1 file changed, 17 insertions(+), 8 deletions(-)  update-index: stat updated files earlier commit eb12dd0c764d2b71bebd5ffffb7379a3835253ae upstream. In the update_one(), we check verify_path() on the proposed path before doing anything else. In preparation for having verify_path() look at the file mode, let's stat the file earlier, so we can check the mode accurately. This is made a bit trickier by the fact that this function only does an lstat in a few code paths (the ones that flow down through process_path()). So we can speculatively do the lstat() here and pass the results down, and just use a dummy mode for cases where we won't actually be updating the index from the filesystem. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  verify_path disallow symlinks in .gitmodules.diff | (download) apply.c | 4 2 + 2 - 0 ! builtin/update-index.c | 6 3 + 3 - 0 ! cache.h | 2 1 + 1 - 0 ! read-cache.c | 40 31 + 9 - 0 ! 4 files changed, 37 insertions(+), 15 deletions(-)  verify_path: disallow symlinks in .gitmodules commit 10ecfa76491e4923988337b2e2243b05376b40de upstream. There are a few reasons it's not a good idea to make .gitmodules a symlink, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git may look at this file in the index or a tree without bothering to resolve any symbolic links. We don't do this _yet_, but the config infrastructure is there and it's planned for the future. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow it: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the complications in (1) and (2), we're better off just outlawing it explicitly. Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  sha1_file add read_loose_object function.diff | (download) cache.h | 13 13 + 0 - 0 ! sha1_file.c | 132 130 + 2 - 0 ! 2 files changed, 143 insertions(+), 2 deletions(-)  sha1_file: add read_loose_object() function commit f6371f9210418f1beabc85b097e2a3470aeeb54d upstream. It's surprisingly hard to ask the sha1_file code to open a _specific_ incarnation of a loose object. Most of the functions take a sha1, and loop over the various object types (packed versus loose) and locations (local versus alternates) at a low level. However, some tools like fsck need to look at a specific file. This patch gives them a function they can use to open the loose object at a given path. The implementation unfortunately ends up repeating bits of related functions, but there's not a good way around it without some major refactoring of the whole sha1_file stack. We need to mmap the specific file, then partially read the zlib stream to know whether we're streaming or not, and then finally either stream it or copy the data to a buffer. We can do that by assembling some of the more arcane internal sha1_file functions, but we end up having to essentially reimplement unpack_sha1_file(), along with the streaming bits of check_sha1_signature(). Still, most of the ugliness is contained in the new function, and the interface is clean enough that it may be reusable (though it seems unlikely anything but git-fsck would care about opening a specific file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck parse loose object paths directly.diff | (download) builtin/fsck.c | 46 33 + 13 - 0 ! t/t1450-fsck.sh | 16 16 + 0 - 0 ! 2 files changed, 49 insertions(+), 13 deletions(-)  fsck: parse loose object paths directly commit c68b489e56431cf27f7719913ab09ddc62f95912 upstream. When we iterate over the list of loose objects to check, we get the actual path of each object. But we then throw it away and pass just the sha1 to fsck_sha1(), which will do a fresh lookup. Usually it would find the same object, but it may not if an object exists both as a loose and a packed object. We may end up checking the packed object twice, and never look at the loose one. In practice this isn't too terrible, because if fsck doesn't complain, it means you have at least one good copy. But since the point of fsck is to look for corruption, we should be thorough. The new read_loose_object() interface can help us get the data from disk, and then we replace parse_object() with parse_object_buffer(). As a bonus, our error messages now mention the path to a corrupted object, which should make it easier to track down errors when they do happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  index pack make fsck error message more specific.diff | (download) builtin/index-pack.c | 2 1 + 1 - 0 ! builtin/unpack-objects.c | 2 1 + 1 - 0 ! 2 files changed, 2 insertions(+), 2 deletions(-)  index-pack: make fsck error message more specific commit db5a58c1bda5b20169b9958af1e8b05ddd178b01 upstream. If fsck reports an error, we say only "Error in object". This isn't quite as bad as it might seem, since the fsck code would have dumped some errors to stderr already. But it might help to give a little more context. The earlier output would not have even mentioned "fsck", and that may be a clue that the "fsck.*" or "*.fsckObjects" config may be relevant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck simplify .git check.diff | (download) fsck.c | 4 1 + 3 - 0 ! 1 file changed, 1 insertion(+), 3 deletions(-)  fsck: simplify ".git" check commit ed9c3220621d634d543bc4dd998d12167dfc57d4 upstream. There's no need for us to manually check for ".git"; it's a subset of the other filesystem-specific tests. Dropping it makes our code slightly shorter. More importantly, the existing code may make a reader wonder why ".GIT" is not covered here, and whether that is a bug (it isn't, as it's also covered in the filesystem-specific tests). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck actually fsck blob data.diff | (download) builtin/fsck.c | 51 25 + 26 - 0 ! fsck.c | 8 7 + 1 - 0 ! sha1_file.c | 2 1 + 1 - 0 ! 3 files changed, 33 insertions(+), 28 deletions(-)  fsck: actually fsck blob data commit 7ac4f3a007e2567f9d2492806186aa063f9a08d6 upstream. Because fscking a blob has always been a noop, we didn't bother passing around the blob data. In preparation for content-level checks, let's fix up a few things: 1. The fsck_object() function just returns success for any blob. Let's a noop fsck_blob(), which we can fill in with actual logic later. 2. The fsck_loose() function in builtin/fsck.c just threw away blob content after loading it. Let's hold onto it until after we've called fsck_object(). The easiest way to do this is to just drop the parse_loose_object() helper entirely. Incidentally, this also fixes a memory leak: if we successfully loaded the object data but did not parse it, we would have left the function without freeing it. 3. When fsck_loose() loads the object data, it does so with a custom read_loose_object() helper. This function streams any blobs, regardless of size, under the assumption that we're only checking the sha1. Instead, let's actually load blobs smaller than big_file_threshold, as the normal object-reading code-paths would do. This lets us fsck small files, and a NULL return is an indication that the blob was so big that it needed to be streamed, and we can pass that information along to fsck_blob(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck detect gitmodules files.diff | (download) fsck.c | 90 90 + 0 - 0 ! fsck.h | 7 7 + 0 - 0 ! 2 files changed, 97 insertions(+)  fsck: detect gitmodules files commit 159e7b080bfa5d34559467cacaa79df89a01afc0 upstream. In preparation for performing fsck checks on .gitmodules files, this commit plumbs in the actual detection of the files. Note that unlike most other fsck checks, this cannot be a property of a single object: we must know that the object is found at a ".gitmodules" path at the root tree of a commit. Since the fsck code only sees one object at a time, we have to mark the related objects to fit the puzzle together. When we see a commit we mark its tree as a root tree, and when we see a root tree with a .gitmodules file, we mark the corresponding blob to be checked. In an ideal world, we'd check the objects in topological order: commits followed by trees followed by blobs. In that case we can avoid ever loading an object twice, since all markings would be complete by the time we get to the marked objects. And indeed, if we are checking a single packfile, this is the order in which Git will generally write the objects. But we can't count on that: 1. git-fsck may show us the objects in arbitrary order (loose objects are fed in sha1 order, but we may also have multiple packs, and we process each pack fully in sequence). 2. The type ordering is just what git-pack-objects happens to write now. The pack format does not require a specific order, and it's possible that future versions of Git (or a custom version trying to fool official  fsck check .gitmodules content.diff | (download) fsck.c | 59 58 + 1 - 0 ! 1 file changed, 58 insertions(+), 1 deletion(-)  fsck: check .gitmodules content commit ed8b10f631c9a71df3351d46187bf7f3fa4f9b7e upstream. This patch detects and blocks submodule names which do not match the policy set forth in submodule-config. These should already be caught by the submodule code itself, but putting the check here means that newer versions of Git can protect older ones from malicious entries (e.g., a server with receive.fsckObjects will block the objects, protecting clients which fetch from it). As a side effect, this means fsck will also complain about .gitmodules files that cannot be parsed (or were larger than core.bigFileThreshold). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck call fsck_finish after fscking objects.diff | (download) builtin/fsck.c | 3 3 + 0 - 0 ! t/t7415-submodule-names.sh | 4 4 + 0 - 0 ! 2 files changed, 7 insertions(+)  fsck: call fsck_finish() after fscking objects commit 1995b5e03e1cc97116be58cdc0502d4a23547856 upstream. Now that the internal fsck code is capable of checking .gitmodules files, we just need to teach its callers to use the "finish" function to check any queued objects. With this, we can now catch the malicious case in t7415 with git-fsck. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  unpack objects call fsck_finish after fscking objects.diff | (download) builtin/unpack-objects.c | 5 4 + 1 - 0 ! t/t7415-submodule-names.sh | 7 7 + 0 - 0 ! 2 files changed, 11 insertions(+), 1 deletion(-)  unpack-objects: call fsck_finish() after fscking objects commit 6e328d6caef218db320978e3e251009135d87d0e upstream. As with the previous commit, we must call fsck's "finish" function in order to catch any queued objects for .gitmodules checks. This second pass will be able to access any incoming objects, because we will have exploded them to loose objects by now. This isn't quite ideal, because it means that bad objects may have been written to the object database (and a subsequent operation could then reference them, even if the other side doesn't send the objects again). However, this is sufficient when used with receive.fsckObjects, since those loose objects will all be placed in a temporary quarantine area that will get wiped if we find any problems. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  index pack check .gitmodules files with strict.diff | (download) builtin/index-pack.c | 10 10 + 0 - 0 ! t/lib-pack.sh | 12 12 + 0 - 0 ! t/t7415-submodule-names.sh | 38 38 + 0 - 0 ! 3 files changed, 60 insertions(+)  index-pack: check .gitmodules files with --strict commit 73c3f0f704a91b6792e0199a3f3ab6e3a1971675 upstream. Now that the internal fsck code has all of the plumbing we need, we can start checking incoming .gitmodules files. Naively, it seems like we would just need to add a call to fsck_finish() after we've processed all of the objects. And that would be enough to cover the initial test included here. But there are two extra bits: 1. We currently don't bother calling fsck_object() at all for blobs, since it has traditionally been a noop. We'd actually catch these blobs in fsck_finish() at the end, but it's more efficient to check them when we already have the object loaded in memory. 2. The second pass done by fsck_finish() needs to access the objects, but we're actually indexing the pack in this process. In theory we could give the fsck code a special callback for accessing the in-pack data, but it's actually quite tricky: a. We don't have an internal efficient index mapping oids to packfile offsets. We only generate it on the fly as part of writing out the .idx file. b. We'd still have to reconstruct deltas, which means we'd basically have to replicate all of the reading logic in packfile.c. Instead, let's avoid running fsck_finish() until after we've written out the .idx file, and then just add it to our internal packed_git list. This does mean that the objects are "in the repository" before we finish our fsck checks. But unpack-objects already exhibits this same behavior, and it's an acceptable tradeoff here for the same reason: the quarantine mechanism means that pushes will be fully protected. In addition to a basic push test in t7415, we add a sneaky pack that reverses the usual object order in the pack, requiring that index-pack access the tree and blob during the "finish" step. This already works for unpack-objects (since it will have written out loose objects), but we'll check it with this sneaky pack for good measure. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  fsck complain when .gitmodules is a symlink.diff | (download) fsck.c | 11 9 + 2 - 0 ! t/t7415-submodule-names.sh | 29 29 + 0 - 0 ! 2 files changed, 38 insertions(+), 2 deletions(-)  fsck: complain when .gitmodules is a symlink commit b7b1fca175f1ed7933f361028c631b9ac86d868d upstream. We've recently forbidden .gitmodules to be a symlink in verify_path(). And it's an easy way to circumvent our fsck checks for .gitmodules content. So let's complain when we see it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  submodule_init die cleanly on submodules without url .diff | (download) builtin/submodule--helper.c | 6 3 + 3 - 0 ! t/t7400-submodule-basic.sh | 8 8 + 0 - 0 ! 2 files changed, 11 insertions(+), 3 deletions(-)  submodule_init: die cleanly on submodules without url defined commit 627fde102515a7807dba89acaa88cb053b38a44a upstream. When we init a submodule, we try to die when it has no URL defined: url = xstrdup(sub->url); if (!url) die(...); But that's clearly nonsense. xstrdup() will never return NULL, and if sub->url is NULL, we'll segfault. These two bits of code need to be flipped, so we check sub->url before looking at it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  submodule helper use to signal end of clone options.diff | (download) builtin/submodule--helper.c | 1 1 + 0 - 0 ! 1 file changed, 1 insertion(+)  submodule--helper: use "--" to signal end of clone options commit 98afac7a7cefdca0d2c4917dd8066a59f7088265 upstream. When we clone a submodule, we call "git clone$url $path". But there's nothing to say that those components can't begin with a dash themselves, confusing git-clone into thinking they're options. Let's pass "--" to make it clear what we expect. There's no test here, because it's actually quite hard to make these names work, even with "git clone" parsing them correctly. And we're going to restrict these cases even further in future commits. So we'll leave off testing until then; this is just the minimal fix to prevent us from doing something stupid with a badly formed entry. Reported-by: joernchen <joernchen@phenoelit.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  submodule config ban submodule urls that start with d.diff | (download) submodule-config.c | 8 8 + 0 - 0 ! t/t7416-submodule-dash-url.sh | 34 34 + 0 - 0 ! 2 files changed, 42 insertions(+)  submodule-config: ban submodule urls that start with dash commit f6adec4e329ef0e25e14c63b735a5956dc67b8bc upstream. The previous commit taught the submodule code to invoke our "git clone$url $path" with a "--" separator so that we aren't confused by urls or paths that start with dashes. However, that's just one code path. It's not clear if there are others, and it would be an easy mistake to add one in the future. Moreover, even with the fix in the previous commit, it's quite hard to actually do anything useful with such an entry. Any url starting with a dash must fall into one of three categories: - it's meant as a file url, like "-path". But then any clone is not going to have the matching path, since it's by definition relative inside the newly created clone. If you spell it as "./-path", the submodule code sees the "/" and translates this to an absolute path, so it at least works (assuming the receiver has the same filesystem layout as you). But that trick does not apply for a bare "-path". - it's meant as an ssh url, like "-host:path". But this already doesn't work, as we explicitly disallow ssh hostnames that begin with a dash (to avoid option injection against ssh). - it's a remote-helper scheme, like "-scheme::data". This _could_ work if the receiver bends over backwards and creates a funny-named helper like "git-remote--scheme". But normally there would not be any helper that matches. Since such a url does not work today and is not likely to do anything useful in the future, let's simply disallow them entirely. That protects the existing "git clone" path (in a belt-and-suspenders way), along with any others that might exist. Our tests cover two cases: 1. A file url with "./" continues to work, showing that there's an escape hatch for people with truly silly repo names. 2. A url starting with "-" is rejected. Note that we expect case (2) to fail, but it would have done so even without this commit, for the reasons given above. So instead of just expecting failure, let's also check for the magic word "ignoring" on stderr. That lets us know that we failed for the right reason. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>  submodule config ban submodule paths that start with .diff | (download) submodule-config.c | 2 2 + 0 - 0 ! t/t7417-submodule-path-url.sh | 20 20 + 0 - 0 ! 2 files changed, 22 insertions(+)  submodule-config: ban submodule paths that start with a dash commit 273c61496f88c6495b886acb1041fe57965151da upstream. We recently banned submodule urls that look like command-line options. This is the matching change to ban leading-dash paths. As with the urls, this should not break any use cases that currently work. Even with our "--" separator passed to git-clone, git-submodule.sh gets confused. Without the code portion of this patch, the clone of "-sub" added in t7417 would yield results like: /path/to/git-submodule: 410: cd: Illegal option -s /path/to/git-submodule: 417: cd: Illegal option -s /path/to/git-submodule: 410: cd: Illegal option -s /path/to/git-submodule: 417: cd: Illegal option -s Fetched in submodule path '-sub', but it did not contain b56243f8f4eb91b2f1f8109452e659f14dd3fbe4. Direct fetching of that commit failed. Moreover, naively adding such a submodule doesn't work:$ git submodule add \$url -sub
The following path is ignored by one of your .gitignore files:
-sub

even though there is no such ignore pattern (the test script
hacks around this with a well-placed "git mv").

Unlike leading-dash urls, though, it's possible that such a
path _could_ be useful if we eventually made it work. So
this commit should be seen not as recommending a particular
policy, but rather temporarily closing off a broken and
possibly dangerous code-path. We may revisit this decision
later.



fsck.c | 7 7 + 0 - 0 !
t/t7416-submodule-dash-url.sh | 15 15 + 0 - 0 !
2 files changed, 22 insertions(+)

 fsck: detect submodule urls starting with dash

commit a124133e1e6ab5c7a9fef6d0e6bcb084e3455b46 upstream.

Urls with leading dashes can cause mischief on older
versions of Git. We should detect them so that they can be
rejected by receive.fsckObjects, preventing modern versions
of git from being a vector by which attacks can spread.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>



fsck.c | 7 7 + 0 - 0 !
t/t7417-submodule-path-url.sh | 8 8 + 0 - 0 !
2 files changed, 15 insertions(+)

 fsck: detect submodule paths starting with dash

commit 1a7fd1fb2998002da6e9ff2ee46e1bdd25ee8404 upstream.

As with urls, submodule paths with dashes are ignored by
git, but may end up confusing older versions. Detecting them
via fsck lets us prevent modern versions of git from being a
vector to spread broken .gitmodules to older versions.

Compared to blocking leading-dash urls, though, this
detection may be less of a good idea:

1. While such paths provide confusing and broken results,
they don't seem to actually work as option injections
against anything except "cd". In particular, the
submodule code seems to canonicalize to an absolute
path before running "git clone" (so it passes
/your/clone/-sub).

2. It's more likely that we may one day make such names
actually work correctly. Even after we revert this fsck
check, it will continue to be a hassle until hosting
servers are all updated.

On the other hand, it's not entirely clear that the behavior
in older versions is safe. And if we do want to eventually
allow this, we may end up doing so with a special syntax
anyway (e.g., writing "./-sub" in the .gitmodules file, and
teaching the submodule code to canonicalize it when
comparing).

So on balance, this is probably a good protection.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>



git-cvsimport.perl | 2 1 + 1 - 0 !
1 file changed, 1 insertion(+), 1 deletion(-)

 cvsimport: apply shell-quoting regex globally

commit 8c87bdfb2137c9e9e945df13e2f2e1eb995ddf83 upstream.

Commit 5b4efea666 (cvsimport: shell-quote variable used in
backticks, 2017-09-11) tried to shell-quote a variable, but
forgot to use the "/g" modifier to apply the quoting to the
whole variable. This means we'd miss any embedded
single-quotes after the first one.

Reported-by: <littlelailo@yahoo.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>