Tools to build OAI-PMH data consumers

A minimalistic implementation of an OAI-PMH harvester.

clldutils.oaipmh.qname(lname, prefix='oai')

Returns a qualified name suitable for use with ElementTree’s namespace-aware functionality; see https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces

Parameters:
  • lname (str) – the local name of the element or attribute.

  • prefix (str) – the namespace prefix used to qualify the local name.

Return type:

str
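
As a rough illustration of what a qualified name looks like, the sketch below builds names in ElementTree's `{namespace-uri}localname` notation and uses them to look up elements. The `NAMESPACES` mapping, the `qualified_name` helper, and the sample record are hypothetical stand-ins written for this example, not the library's code; only the namespace URIs are the standard OAI-PMH and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix-to-URI mapping for this sketch (the URIs are the standard ones).
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def qualified_name(lname, prefix="oai"):
    """Stand-in helper: return lname in ElementTree's '{uri}lname' notation."""
    return "{%s}%s" % (NAMESPACES[prefix], lname)


# Made-up record that uses the OAI-PMH namespace as its default namespace.
XML = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:1</identifier>
  </header>
</record>"""

record = ET.fromstring(XML)
# Unqualified names would not match here, because every element is namespaced.
header = record.find(qualified_name("header"))
identifier = header.find(qualified_name("identifier")).text
print(identifier)
```

Passing the qualified name directly avoids having to hand a prefix map to every `find` call.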

class clldutils.oaipmh.Record(identifier, datestamp, metadata=None, about=_Nothing.NOTHING, sets=_Nothing.NOTHING, status=None, oai_dc_metadata=None)
Variables:
  • identifier – the unique identifier of an item in a repository.

  • oai_dc_metadata – None if no oai_dc metadata is available, otherwise a dict mapping Dublin Core terms (specified as local names) to lists of values.
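
The shape of such a mapping can be sketched as follows. This is only an illustration under assumptions: `dc_terms` is a hypothetical helper and the XML is a made-up `oai_dc` payload; how the library actually builds `oai_dc_metadata` is not shown here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"

# Made-up oai_dc payload with a repeated Dublin Core term.
XML = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example article</dc:title>
  <dc:identifier>https://example.org/article/1</dc:identifier>
  <dc:identifier>10.1234/example.1</dc:identifier>
</oai_dc:dc>"""


def dc_terms(element):
    """Hypothetical helper: collect Dublin Core children as {local name: [values]}."""
    terms = defaultdict(list)
    prefix = "{%s}" % DC
    for child in element:
        if child.tag.startswith(prefix):
            # Strip the '{uri}' part so that keys are plain local names.
            terms[child.tag[len(prefix):]].append(child.text)
    return dict(terms)


metadata = dc_terms(ET.fromstring(XML))
print(metadata["identifier"])
```

Values are kept in lists because a Dublin Core term may occur more than once in a record, as `identifier` does here.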

clldutils.oaipmh.iter_records(baseURL, metadataPrefix='oai_dc', from_=None, until=None, set_=None)

Runs a ListRecords request on the specified OAI-PMH repository (using resumption tokens as necessary).

>>> from clldutils.oaipmh import iter_records
>>> recs = iter_records('https://account.lddjournal.org/index.php/uv1-j-ldd/oai')
>>> next(recs).identifier
'oai:ojs.pkp.sfu.ca:article/2'
>>> next(recs).oai_dc_metadata['identifier']
['https://account.lddjournal.org/index.php/uv1-j-ldd/article/view/12', '10.25894/ldd12']

Parameters:
  • baseURL (str) – the base URL of the repository

  • metadataPrefix (str) – specifies the metadataPrefix of the format that should be included in the metadata part of the returned records.

  • from_ (typing.Union[str, datetime.date, datetime.datetime, None]) – an optional argument with a UTCdatetime value, which specifies a lower bound for datestamp-based selective harvesting.

  • until (typing.Union[str, datetime.date, datetime.datetime, None]) – an optional argument with a UTCdatetime value, which specifies an upper bound for datestamp-based selective harvesting.

  • set_ (typing.Optional[str]) – an optional argument with a setSpec value, which specifies set criteria for selective harvesting.
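
OAI-PMH expresses UTCdatetime values either at day granularity (YYYY-MM-DD) or at second granularity (YYYY-MM-DDThh:mm:ssZ). The sketch below shows one way the accepted argument types could be rendered as such strings. `to_utcdatetime` is a hypothetical helper, not part of the library, and it assumes that naive datetime values are already in UTC and that strings are already well-formed.

```python
import datetime


def to_utcdatetime(value):
    """Hypothetical helper: render a harvesting bound as a UTCdatetime string."""
    if value is None or isinstance(value, str):
        return value  # assumption: strings are already formatted correctly
    if isinstance(value, datetime.datetime):
        # Checked before date, because datetime is a subclass of date.
        if value.tzinfo is not None:
            value = value.astimezone(datetime.timezone.utc)
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")
    return value.isoformat()  # datetime.date gives YYYY-MM-DD


print(to_utcdatetime(datetime.date(2023, 1, 31)))
print(to_utcdatetime(datetime.datetime(2023, 1, 31, 12, 30, 0)))
```

Repositories are not required to support second granularity, so a day-level bound is the more portable choice.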

Return type:

typing.Generator[clldutils.oaipmh.Record, None, None]
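
The resumption-token handling mentioned above follows a simple loop in the OAI-PMH protocol: issue a ListRecords request, yield the records, and repeat with the returned resumptionToken until the token is missing or empty. The sketch below is a minimal illustration of that loop, not the library's implementation. `harvest`, `fake_fetch`, and the canned `PAGES` are hypothetical; a real harvester would replace `fake_fetch` with an HTTP request to the repository's base URL.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadataPrefix="oai_dc", from_=None, until=None, set_=None):
    """Yield <record> elements, following resumption tokens.

    `fetch` is a hypothetical callable that takes a dict of request
    parameters and returns the response body as a string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadataPrefix}
    for key, value in (("from", from_), ("until", until), ("set", set_)):
        if value is not None:
            params[key] = value
    while True:
        root = ET.fromstring(fetch(params))
        yield from root.iter(OAI + "record")
        token = root.find(".//" + OAI + "resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # missing or empty token: the list is complete
        # resumptionToken is an exclusive argument: only the verb accompanies it.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


def page(identifier, token_xml):
    """Build a made-up ListRecords response holding a single record."""
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        "<record><header><identifier>%s</identifier></header></record>"
        "%s</ListRecords></OAI-PMH>" % (identifier, token_xml)
    )


# Canned responses keyed by the resumption token that requests them.
PAGES = {
    None: page("oai:example.org:1", "<resumptionToken>page2</resumptionToken>"),
    "page2": page("oai:example.org:2", "<resumptionToken/>"),
}
requests = []


def fake_fetch(params):
    requests.append(dict(params))  # remember what was asked, for inspection
    return PAGES[params.get("resumptionToken")]


identifiers = [
    rec.find(OAI + "header").find(OAI + "identifier").text
    for rec in harvest(fake_fetch, set_="articles")
]
print(identifiers)
```

Because `harvest` is a generator, follow-up requests are only issued when the consumer asks for more records, which keeps memory use flat for large repositories.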