multipackage.utilities.obj_hash module

Calculate a hex checksum over a list of lines.

Summary

Functions:

dict_hash       Calculate a hash over a json-serializable dictionary.
directory_hash  Hash all files in a given folder.
line_hash       Calculate a hash over a list of strings.

Reference

multipackage.utilities.obj_hash.line_hash(lines, method='md5')

Calculate a hash over a list of strings.

The strings are all joined with '\n' characters to canonicalize them so that line endings are not part of the calculated hash.

Parameters:
  • lines (list of str) – The list of strings to hash
  • method (str) – The name of the hash method. Currently only md5 is supported.
Returns:

The hash digest as a hex string in uppercase.

Return type:

str
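
The behavior described above can be illustrated with a minimal sketch. This is not the module's actual source: the name line_hash_sketch is hypothetical, and the use of utf-8 to encode the joined string is an assumption, since the description does not state an encoding for this function.

```python
import hashlib


def line_hash_sketch(lines, method="md5"):
    """Hypothetical stand-in that mirrors the documented behavior of line_hash."""
    if method != "md5":
        # The documentation says only md5 is currently supported.
        raise ValueError("unsupported hash method: %s" % method)

    # Joining with '\n' means the original line endings never reach the hash.
    joined = "\n".join(lines)

    # Assumption: the joined text is encoded as utf-8 before hashing.
    return hashlib.md5(joined.encode("utf-8")).hexdigest().upper()
```

Under these assumptions, text split with str.splitlines() hashes the same whether it originally used '\r\n' or '\n' line endings, because only the line contents are passed in.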

multipackage.utilities.obj_hash.dict_hash(obj, method='md5')

Calculate a hash over a json-serializable dictionary.

The obj argument is dumped to a JSON string with sorted keys, that string is encoded as UTF-8, and the resulting bytes are hashed to produce the hash value.

Parameters:
  • obj (dict) – The json-serializable dictionary that should be hashed.
  • method (str) – The name of the hash method. Currently only md5 is supported.
Returns:

The hash digest as a hex string in uppercase.

Return type:

str
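
A minimal sketch of the steps described above, using only the standard library. The name dict_hash_sketch is hypothetical, and the exact json.dumps options used by the real function (separators, ensure_ascii, and so on) are not documented here, so the defaults are an assumption.

```python
import hashlib
import json


def dict_hash_sketch(obj, method="md5"):
    """Hypothetical stand-in that mirrors the documented behavior of dict_hash."""
    if method != "md5":
        # The documentation says only md5 is currently supported.
        raise ValueError("unsupported hash method: %s" % method)

    # Sorting keys makes the serialized form independent of insertion order.
    serialized = json.dumps(obj, sort_keys=True)

    # Encode as utf-8, hash the bytes, and return the digest in uppercase hex.
    return hashlib.md5(serialized.encode("utf-8")).hexdigest().upper()
```

Because the keys are sorted before hashing, two dictionaries with the same contents produce the same digest regardless of the order in which their keys were inserted.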

multipackage.utilities.obj_hash.directory_hash(path, glob='*')

Hash all files in a given folder.

This returns a hash value that tells you whether any file in the given directory has changed. To hash only a subset of the files, set glob, which is passed to fnmatch to select files.

The hash value changes when:
  • a file is added
  • a file is removed
  • a file is renamed
  • a file's contents are changed in any way

The hash value is stable on multiple operating systems and across line endings for text files. The name of the parent directory is not part of the hash value so it is useful for ensuring that a given folder has the same contents.

This function assumes all files are text files and calculates a line-ending independent hash.

Parameters:
  • path (str) – The path to the directory that we want to hash.
  • glob (str) – Optional wildcard specifier for selecting which files should be hashed.
Returns:

The hash digest as a hex string in uppercase.

Return type:

str
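
One way the properties listed above could be achieved is sketched below. This is an illustration under stated assumptions, not the module's implementation: the name directory_hash_sketch is hypothetical, the sketch only looks at files directly inside path (whether the real function recurses into subfolders is not stated), it assumes utf-8 text files, and it uses md5 to match the other functions in this module.

```python
import fnmatch
import hashlib
import os


def directory_hash_sketch(path, glob="*"):
    """Hypothetical stand-in that mirrors the documented behavior of directory_hash."""
    digest = hashlib.md5()

    # Sort the names so the result does not depend on directory listing order.
    for name in sorted(os.listdir(path)):
        full_path = os.path.join(path, name)

        # Assumption: only regular files directly inside path are considered.
        if not os.path.isfile(full_path) or not fnmatch.fnmatch(name, glob):
            continue

        # Text mode plus splitlines() drops the original line endings.
        with open(full_path, "r", encoding="utf-8") as handle:
            lines = handle.read().splitlines()

        # Hash the bare file name (never the parent directory) and the contents,
        # with separator bytes between the pieces.
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update("\n".join(lines).encode("utf-8"))
        digest.update(b"\0")

    return digest.hexdigest().upper()
```

Since only file names relative to path are fed into the digest, two folders in different locations with the same contents give the same value in this sketch, and a call such as directory_hash_sketch(path, glob="*.txt") restricts the hash to matching files.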