Tools related to filesystem paths

This module provides utilities for filesystem paths and files.

Note

Most of the original rationale for this module has been made moot by recent changes in the standard library, which made pathlib.Path the default format for function arguments that represent paths.

clldutils.path.ensure_cmd(cmd, **kw)

Make sure an executable is installed and return its full path.

A thin wrapper around shutil.which that raises a useful exception when the command is not installed.

Return type:

str
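The wrapper idea described above can be illustrated with a minimal sketch built only on the standard library. The name ensure_cmd_sketch and the choice of ValueError are assumptions made for illustration; the documentation only promises "a useful exception", so the real function may raise something else.

```python
import shutil


def ensure_cmd_sketch(cmd, **kw):
    """Return the full path of `cmd`, or raise if it cannot be found."""
    # Keyword arguments are forwarded to shutil.which (e.g. `mode` or `path`).
    path = shutil.which(cmd, **kw)
    if path is None:
        # Assumption: the exception type is chosen for this sketch only.
        raise ValueError(f"The command {cmd!r} is not installed or not on the PATH")
    return path
```

Failing early like this gives callers a clear message instead of an obscure error from a later subprocess call.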

clldutils.path.sys_path(p)

Context manager providing a context with path p appended to sys.path.

See also

import_module()
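As a rough illustration of what such a context manager might look like, the sketch below appends a path to sys.path and removes it again on exit. The name sys_path_sketch and the cleanup strategy (dropping the last matching entry) are assumptions, not the module's actual code.

```python
import contextlib
import sys
from pathlib import Path


@contextlib.contextmanager
def sys_path_sketch(p):
    """Temporarily append `p` to sys.path."""
    entry = str(Path(p))
    sys.path.append(entry)
    try:
        yield
    finally:
        # Remove the last matching entry, so an identical entry that was
        # already present before entering the context is left untouched.
        for i in range(len(sys.path) - 1, -1, -1):
            if sys.path[i] == entry:
                del sys.path[i]
                break
```

Inside the with block, import statements can resolve modules located in the given directory; afterwards sys.path is back to its previous state.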

clldutils.path.memorymapped(filename, access=1)

Context manager to access a memory mapped file.

Parameters:

filename (typing.Union[str, pathlib.Path]) –

Return type:

mmap.mmap
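A context manager of this kind could be sketched with the standard mmap module as follows. The name memorymapped_sketch is hypothetical; the default access value of 1 in the signature above matches mmap.ACCESS_READ, which the sketch uses as its default.

```python
import contextlib
import mmap


@contextlib.contextmanager
def memorymapped_sketch(filename, access=mmap.ACCESS_READ):
    """Yield an mmap.mmap object for the whole (non-empty) file."""
    with open(str(filename), "rb") as f:
        # Length 0 maps the entire file.
        m = mmap.mmap(f.fileno(), 0, access=access)
        try:
            yield m
        finally:
            m.close()
```

The mapped object supports slicing and methods such as find, so large files can be searched without reading them into memory first. Note that empty files cannot be memory mapped.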

clldutils.path.import_module(p)

Import a python module from anywhere in the filesystem.

Parameters:

p (pathlib.Path) –

Return type:

types.ModuleType
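One way to import a module from an arbitrary location is importlib's spec machinery, sketched below. This is a swapped-in technique for illustration and not necessarily how import_module works internally (the cross-reference from sys_path() suggests it may rely on sys.path instead). The name import_module_sketch is hypothetical, and the sketch handles single .py files only, not packages.

```python
import importlib.util
from pathlib import Path


def import_module_sketch(p):
    """Import the Python source file at path `p` and return the module."""
    p = Path(p)
    # The module name is derived from the file name without its suffix.
    spec = importlib.util.spec_from_file_location(p.stem, str(p))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```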

clldutils.path.readlines(p, encoding=None, strip=False, comment=None, normalize=None, linenumbers=False)

Read a list of lines from a text file (or iterable of lines).

Parameters:
  • p (typing.Union[pathlib.Path, str, list, tuple]) – File path (or list or tuple of text)

  • encoding (typing.Optional[str]) – Registered codec.

  • strip (bool) – If True, strip leading and trailing whitespace.

  • comment (typing.Optional[str]) – String used as syntax to mark comment lines. When not None, commented lines will be stripped. This implies strip=True.

  • normalize (typing.Optional[str]) – Apply Unicode normalization using one of the forms ‘NFC’, ‘NFKC’, ‘NFD’ or ‘NFKD’.

  • linenumbers (bool) – If True, also return line numbers.

Return type:

typing.List[typing.Union[typing.Tuple[int, str], str]]

Returns:

List of text lines or, when linenumbers is True, pairs (int, text or None).
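How the parameters interact can be sketched with a simplified re-implementation. The name readlines_sketch is hypothetical, and several details are assumptions: line numbers start at 1, the default encoding is UTF-8, and commented lines are dropped entirely (the documented return value mentions pairs containing None, so the real function may keep a placeholder instead).

```python
import unicodedata
from pathlib import Path


def readlines_sketch(p, encoding="utf-8", strip=False, comment=None,
                     normalize=None, linenumbers=False):
    """Simplified illustration of the documented parameters."""
    if isinstance(p, (list, tuple)):
        lines = list(p)
    else:
        with Path(p).open(encoding=encoding) as f:
            lines = f.readlines()
    if comment:
        strip = True  # comment handling implies strip=True
    result = []
    for n, line in enumerate(lines, start=1):
        if normalize:
            line = unicodedata.normalize(normalize, line)
        if strip:
            line = line.strip()
        if comment and line.startswith(comment):
            continue  # assumption: commented lines are skipped
        result.append((n, line) if linenumbers else line)
    return result
```

Because the comment check runs after stripping, indented comment lines are recognized as comments too.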

clldutils.path.move(src, dst)

Provides the functionality of shutil.move while accepting pathlib.Path objects as input.

clldutils.path.walk(p, mode='all', **kw)

Wrapper for os.walk, yielding Path objects.

Parameters:
  • p – root of the directory tree to walk.

  • mode – One of ‘all’, ‘dirs’ or ‘files’; defaults to ‘all’.

  • kw – Keyword arguments are passed to os.walk.

Return type:

typing.Generator[pathlib.Path, None, None]

Returns:

Generator for the requested Path objects.
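The mode parameter can be understood through a small sketch built on os.walk. The name walk_sketch is hypothetical, and the order in which directories and files are yielded is an assumption of this sketch.

```python
import os
from pathlib import Path


def walk_sketch(p, mode="all", **kw):
    """Yield Path objects for directories and/or files below `p`."""
    # Extra keyword arguments (e.g. topdown, followlinks) go to os.walk.
    for dirpath, dirnames, filenames in os.walk(str(p), **kw):
        if mode in ("all", "dirs"):
            for name in dirnames:
                yield Path(dirpath) / name
        if mode in ("all", "files"):
            for name in filenames:
                yield Path(dirpath) / name
```

For example, {x.name for x in walk_sketch(root, mode="files")} collects the names of all files in the tree, regardless of nesting depth.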

clldutils.path.md5(p, bufsize=32768)

Compute the MD5 sum of the content of a file.

Parameters:
  • p (typing.Union[pathlib.Path, str]) –

  • bufsize (int) –

Return type:

str
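The bufsize parameter suggests that the file is read in chunks rather than loaded at once. A minimal sketch of that technique with hashlib follows; the name md5_sketch is hypothetical and the code is not taken from the module itself.

```python
import hashlib
from pathlib import Path


def md5_sketch(p, bufsize=32768):
    """Return the hex MD5 digest of a file, reading `bufsize` bytes at a time."""
    digest = hashlib.md5()
    with Path(p).open("rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(bufsize), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use bounded by bufsize, which matters for files too large to hold in memory.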

class clldutils.path.Manifest

A dict mapping relative path names to md5 sums of file contents.

str(Manifest.from_dir(d, relative_to=d.parent)) is equivalent to the content of the manifest-md5.txt file defined by the BagIt specification.
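To make the data structure concrete, the sketch below builds such a mapping with the standard library and renders it as BagIt-style "checksum path" lines. The helper names manifest_sketch and manifest_text are hypothetical, and the exact whitespace and ordering used by the real class are assumptions.

```python
import hashlib
from pathlib import Path


def manifest_sketch(d, relative_to=None):
    """Map relative POSIX path names to MD5 sums for all files below `d`."""
    d = Path(d)
    base = Path(relative_to) if relative_to else d
    result = {}
    for f in sorted(d.rglob("*")):
        if f.is_file():
            result[f.relative_to(base).as_posix()] = hashlib.md5(f.read_bytes()).hexdigest()
    return result


def manifest_text(manifest):
    """Render one 'checksum  path' line per file (separator is an assumption)."""
    return "\n".join(f"{digest}  {name}" for name, digest in sorted(manifest.items()))
```

Passing relative_to=d.parent keeps the directory name itself as the first path component of every key.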

clldutils.path.git_describe(dir_, git_command='git')

Run git describe --always --tags on a directory.

Note

git_command must be in the PATH and is called in a subprocess.
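The subprocess call described here might look roughly like the following sketch. The names describe_command and git_describe_sketch are hypothetical, and error handling (for example, for directories that are not git repositories) is omitted; the real function may behave differently in those cases.

```python
import subprocess
from pathlib import Path


def describe_command(git_command="git"):
    """Build the argument list for `git describe --always --tags`."""
    return [git_command, "describe", "--always", "--tags"]


def git_describe_sketch(dir_, git_command="git"):
    """Run the describe command with `dir_` as working directory."""
    # check_output raises CalledProcessError if git exits with an error.
    output = subprocess.check_output(describe_command(git_command), cwd=str(Path(dir_)))
    return output.decode("utf-8").strip()
```

Passing the command as a list avoids shell quoting issues, and cwd selects the repository without changing the caller's working directory.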

class clldutils.path.TemporaryDirectory(suffix=None, prefix=None, dir=None, ignore_cleanup_errors=False)

Like tempfile.TemporaryDirectory, but yielding a pathlib.Path.
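The difference from the standard library class can be sketched with a small subclass. The class name PathTemporaryDirectory is hypothetical and this is only one possible way to get the described behavior.

```python
import tempfile
from pathlib import Path


class PathTemporaryDirectory(tempfile.TemporaryDirectory):
    """Temporary directory whose context manager yields a pathlib.Path."""

    def __enter__(self):
        # The parent class returns the directory name as a str.
        return Path(super().__enter__())
```

With this, code inside the with block can use the / operator and Path methods directly, for example (tmp / "out.txt").write_text("data"), and the directory is removed on exit as usual.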