Tools to build OAI-PMH data consumers
A minimalistic implementation of an OAI-PMH harvester.
- clldutils.oaipmh.qname(lname, prefix='oai')
Returns a qualified name suitable for use with ElementTree’s namespace-aware functionality; see https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces
- Parameters:
lname (str)
prefix (str)
- Return type:
str
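The following sketch illustrates what a qualified name in ElementTree's `{namespace-uri}local-name` notation looks like and how it is used to look up elements. It does not use clldutils itself: `clark_name` and the `NAMESPACES` mapping are hypothetical stand-ins written for this example, and the sample XML is made up. Only the namespace URIs are taken from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Assumed prefix-to-namespace mapping (not taken from clldutils); the URIs
# are the ones defined by the OAI-PMH 2.0 and Dublin Core specifications.
NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def clark_name(lname, prefix="oai"):
    """Hypothetical stand-in for qname: build '{namespace-uri}local-name'."""
    return "{%s}%s" % (NAMESPACES[prefix], lname)


# Made-up record fragment in the OAI-PMH default namespace.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:example.org:item/1</identifier>
    <datestamp>2024-01-01</datestamp>
  </header>
</record>"""

record = ET.fromstring(SAMPLE)
# Namespaced elements are only found when the tag is fully qualified.
header = record.find(clark_name("header"))
identifier = header.find(clark_name("identifier")).text
```

Looking up the bare tag `header` would return `None` here, because every element of the fragment lives in the OAI-PMH namespace.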
- class clldutils.oaipmh.Record(identifier, datestamp, metadata=None, about=<factory>, sets=<factory>, status=None, oai_dc_metadata=None)
- Variables:
identifier – the unique identifier of an item in a repository.
oai_dc_metadata – None if no oai_dc metadata is available, otherwise a dict mapping Dublin Core terms (specified as local names) to lists of values.
- Parameters:
identifier (str)
datestamp (typing.Union[datetime.datetime,str])
metadata (typing.Optional[xml.etree.ElementTree.Element])
about (list)
sets (list)
status (typing.Optional[str])
oai_dc_metadata (typing.Optional[dict])
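To make the shape of `oai_dc_metadata` concrete, here is a minimal sketch of how such a dict could be derived from a record's metadata element. `SketchRecord` and `dc_terms` are hypothetical names for this example only; the dataclass merely mirrors the documented fields and is an assumption about the layout, not the library's implementation. The sample XML is made up.

```python
import dataclasses
import typing
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


@dataclasses.dataclass
class SketchRecord:
    """Hypothetical stand-in mirroring the documented fields of Record."""
    identifier: str
    datestamp: str
    metadata: typing.Optional[ET.Element] = None
    about: list = dataclasses.field(default_factory=list)
    sets: list = dataclasses.field(default_factory=list)
    status: typing.Optional[str] = None
    oai_dc_metadata: typing.Optional[dict] = None


def dc_terms(metadata):
    """Map Dublin Core local names to lists of values."""
    terms = defaultdict(list)
    for element in metadata.iter():
        if element.tag.startswith("{%s}" % DC_NS):
            # Strip the '{namespace-uri}' part to get the local name.
            terms[element.tag.split("}", 1)[1]].append(element.text)
    return dict(terms)


SAMPLE = """<metadata xmlns="http://www.openarchives.org/OAI/2.0/">
  <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>An example title</dc:title>
    <dc:identifier>https://example.org/item/1</dc:identifier>
    <dc:identifier>10.0000/example.1</dc:identifier>
  </oai_dc:dc>
</metadata>"""

metadata = ET.fromstring(SAMPLE)
record = SketchRecord(
    identifier="oai:example.org:item/1",
    datestamp="2024-01-01",
    metadata=metadata,
    oai_dc_metadata=dc_terms(metadata),
)
```

Because a Dublin Core term may be repeated within one record, each term maps to a list of values rather than to a single value.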
- clldutils.oaipmh.iter_records(baseURL, metadataPrefix='oai_dc', from_=None, until=None, set_=None)
Runs a ListRecords request on the specified OAI-PMH repository (using resumption tokens as necessary).
>>> from clldutils.oaipmh import iter_records
>>> recs = iter_records('https://account.lddjournal.org/index.php/uv1-j-ldd/oai')
>>> next(recs).identifier
'oai:ojs.pkp.sfu.ca:article/2'
>>> next(recs).oai_dc_metadata['identifier']
['https://account.lddjournal.org/index.php/uv1-j-ldd/article/view/12', '10.25894/ldd12']
- Parameters:
baseURL (str) – the base URL of the repository.
metadataPrefix (str) – specifies the metadataPrefix of the format that should be included in the metadata part of the returned records.
from_ (typing.Union[str,datetime.date,datetime.datetime,None]) – an optional argument with a UTCdatetime value, which specifies a lower bound for datestamp-based selective harvesting.
until (typing.Union[str,datetime.date,datetime.datetime,None]) – an optional argument with a UTCdatetime value, which specifies an upper bound for datestamp-based selective harvesting.
set_ (typing.Optional[str]) – an optional argument with a setSpec value, which specifies set criteria for selective harvesting.
- Return type:
collections.abc.Generator[clldutils.oaipmh.Record,None,None]
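The resumption-token handling and the selective-harvesting arguments described above can be sketched roughly as follows. This is an illustration of the protocol flow under stated assumptions, not the clldutils implementation: `list_record_identifiers`, the injected `fetch` callable and the canned responses in `fake_fetch` are all hypothetical, and no network access takes place.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH namespace, Clark notation


def list_record_identifiers(fetch, metadataPrefix="oai_dc",
                            from_=None, until=None, set_=None):
    """Yield record identifiers, following resumption tokens.

    `fetch` is a hypothetical callable that takes a dict of query
    parameters and returns the response body as a string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadataPrefix}
    # The protocol arguments are called "from", "until" and "set"; the
    # trailing underscores only avoid clashes with Python names.
    for name, value in (("from", from_), ("until", until), ("set", set_)):
        if value is not None:
            params[name] = value
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(".//" + OAI + "resumptionToken")
        if not token:  # missing or empty token: the list is complete
            break
        # resumptionToken is an exclusive argument in OAI-PMH.
        params = {"verb": "ListRecords", "resumptionToken": token}


# Made-up response template with one record per page.
PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>{id}</identifier>
      <datestamp>2024-01-01</datestamp></header></record>
    <resumptionToken>{token}</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

requests_seen = []


def fake_fetch(params):
    """Canned two-page repository used instead of real HTTP requests."""
    requests_seen.append(dict(params))
    if "resumptionToken" in params:
        return PAGE.format(id="oai:example.org:item/2", token="")
    return PAGE.format(id="oai:example.org:item/1", token="page-2")


identifiers = list(list_record_identifiers(
    fake_fetch, from_="2024-01-01", set_="articles"))
```

Injecting `fetch` keeps the paging logic testable without a live repository; in real use it would wrap an HTTP GET against the repository's base URL.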