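How does Git compute file hashes?

The SHA1 hashes stored in tree objects (as returned by git ls-tree) do not match the SHA1 hashes of the file content (as returned by sha1sum):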

    $ git cat-file blob 4716ca912495c805b94a88ef6dc3fb4aff46bf3c | sha1sum
    de20247992af0f949ae8df4fa9a37e4a03d7063e  -

How does Git compute file hashes? Does it compress the content before computing the hash?

Git prefixes the object with "blob ", followed by the length (as a human-readable integer), followed by a NUL character:

    $ echo -e 'blob 14\0Hello, World!' | shasum
    8ab686eafeb1f44702738c8b0f24f2567c36da6d

Source: http://alblue.bandlem.com/2011/08/git-tip-of-week-objects.html
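On the compression question: as far as I know, Git hashes the uncompressed header plus content and only then zlib-compresses the result for storage as a loose object. A minimal Python sketch of that order of operations follows. The function name `make_loose_object` is made up for illustration, and the compressed bytes need not match Git's byte for byte, since they depend on the compression level.

```python
import hashlib
import zlib


def make_loose_object(content: bytes):
    """Return (object id, compressed bytes) for a blob, hashing before compressing."""
    # Header: object type, a space, the decimal byte length, then a NUL byte.
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    # The object id is the SHA-1 of the uncompressed header + content.
    oid = hashlib.sha1(store).hexdigest()
    # Compression only affects how the object is stored, never its id.
    return oid, zlib.compress(store)


oid, packed = make_loose_object(b"Hello, World!\n")
print(oid)
print(zlib.decompress(packed))
```

Decompressing the stored bytes gives back the header and content, which is why piping `git cat-file blob` output straight into sha1sum does not reproduce the object id: the header is missing.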

I am only expanding on the answer by @Leif Gruenwoldt and detailing what is in the reference he provided.

Do it yourself..

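Step 1. Create an empty text document (the name does not matter) in your repository

Step 2. Stage and commit the document

Step 3. Identify the hash of the blob by executing git ls-tree HEAD

Step 4. Find the blob's hash to be e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

Step 5. Snap out of your surprise and read below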

How Git computes blob hashes

    Blob hash (SHA1) = SHA1("blob " + <size_of_file> + "\0" + <contents_of_file>)

The text blob⎵ is a constant prefix, and \0 is also constant: it is the NUL character. <size_of_file> and <contents_of_file> vary from file to file.
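The empty-file result from step 4 can be checked against this formula without Git. A small Python sketch (the name `blob_sha1` is only illustrative):

```python
import hashlib


def blob_sha1(contents: bytes) -> str:
    """SHA-1 of "blob " + <size_of_file> + "\\0" + <contents_of_file>."""
    header = f"blob {len(contents)}\0".encode("ascii")
    return hashlib.sha1(header + contents).hexdigest()


# An empty file has size 0, so only the header "blob 0\0" is hashed.
print(blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```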

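That's all folks!

But wait! Did you notice that <filename> is not a parameter of the hash computation? Two files have the same hash if their contents are the same, regardless of their names or the date and time they were created. This is one of the reasons Git handles moves and renames better than other version control systems.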

Do it yourself (ext.)

Step 6. Create another empty file with a different filename in the same directory

Step 7. Compare the hashes of both your files.
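Steps 6 and 7 can also be tried outside a repository. The sketch below writes two empty files with different (hypothetical) names into a temporary directory and hashes each one with the blob formula; `file_blob_hash` is an illustrative helper, not a Git API.

```python
import hashlib
import tempfile
from pathlib import Path


def file_blob_hash(path: Path) -> str:
    """Hash a file the way Git hashes a blob: header + raw bytes, no filename."""
    data = path.read_bytes()
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "first.txt"    # hypothetical file names
    second = Path(tmp) / "second.txt"
    first.write_bytes(b"")
    second.write_bytes(b"")
    hashes = (file_blob_hash(first), file_blob_hash(second))

print(hashes[0] == hashes[1])  # True: the name never enters the hash
```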

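Note:

The link does not mention how the tree object is hashed. I am not certain of the algorithm and parameters, but from my observation it probably computes a hash based on all the blobs and trees it contains (most likely on their hashes).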
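For what it's worth, my understanding is that a tree is hashed with the same header scheme, with `tree` as the type and a body made of one entry per child: the mode, a space, the name, a NUL, then the child's raw 20-byte SHA-1. This goes beyond what the linked article covers, so treat the sketch below as an illustration under those assumptions. `tree_sha1` and the sample entry are hypothetical, and the plain sort by name ignores the special ordering Git uses for subdirectories.

```python
import hashlib


def tree_sha1(entries):
    """entries: iterable of (mode, name, hex_sha1) tuples for files in one directory."""
    body = b""
    # Assumption: plain byte-wise sort by name, which is only right for files.
    for mode, name, hex_sha in sorted(entries, key=lambda e: e[1].encode()):
        body += mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(hex_sha)
    header = b"tree " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# A tree with no entries hashes only the header "tree 0\0".
print(tree_sha1([]))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904

# A directory holding one empty file (hypothetical name "empty.txt").
print(tree_sha1([("100644", "empty.txt", "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391")]))
```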

git hash-object is a quick way to verify your test method:

    s='abc'
    printf "$s" | git hash-object --stdin
    printf "blob $(printf "$s" | wc -c)\0$s" | sha1sum

Output:

    f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f
    f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f  -

where sha1sum is part of GNU Coreutils.

Based on Leif Gruenwoldt's answer, here is a shell function that can stand in for git hash-object:

    git-hash-object () { # substitute for when the `git` command is not available
        local type=blob
        [ "$1" = "-t" ] && shift && type=$1 && shift
        # depending on eol/autocrlf settings, you may want to convert CRLFs to LFs
        # by using `perl -pe 's/\r$//g'` instead of `cat` in the next 2 commands
        # (tr strips the padding that some wc implementations put before the count)
        local size=$(cat "$1" | wc -c | tr -d ' ')
        ( echo -en "$type $size\0"; cat "$1" ) | sha1sum | sed 's/ .*$//'
    }

Test:

    $ echo 'Hello, World!' > test.txt
    $ git hash-object test.txt
    8ab686eafeb1f44702738c8b0f24f2567c36da6d
    $ git-hash-object test.txt
    8ab686eafeb1f44702738c8b0f24f2567c36da6d

I needed this for some unit tests in Python 3, so I thought I'd leave it here.

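    import hashlib

    def git_blob_hash(data):
        # Accept text as well as bytes; text is encoded as UTF-8.
        if isinstance(data, str):
            data = data.encode()
        # Prepend the "blob <length>\0" header, using the byte length.
        data = b'blob ' + str(len(data)).encode() + b'\0' + data
        h = hashlib.sha1()
        h.update(data)
        return h.hexdigest()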

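I stick to \n line endings everywhere, but in some circumstances Git may also change your line endings before computing this hash, so you may need a .replace('\r\n', '\n') in there too.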
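A sketch of that normalization, assuming the conversion Git would apply is a plain CRLF-to-LF replacement (whether Git converts at all depends on settings such as `core.autocrlf`). `git_blob_hash_lf` is a hypothetical helper name.

```python
import hashlib


def git_blob_hash_lf(data: bytes) -> str:
    """Blob hash after converting CRLF line endings to LF."""
    data = data.replace(b"\r\n", b"\n")
    # The length in the header is taken after normalization.
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()


print(git_blob_hash_lf(b"Hello, World!\r\n"))  # 8ab686eafeb1f44702738c8b0f24f2567c36da6d
```

The CRLF input hashes to the same value as the LF-terminated test.txt example, because the replacement happens before the length is measured.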
