Handling SIL Standard Format (SFM) files
SIL’s Standard Format (SFM) files are the native file format of Toolbox. Applications that can export SFM include ELAN and Flex. In the absence of other export formats, the need to read or write SFM files persists.
The (somewhat simplistic) SFM implementation provided in this module supports:
- multiline values
- custom entry separators
Usage:
>>> from clldutils.sfm import SFM
>>> sfm = SFM.from_string('''\ex Yax bo’on ta sna Antonio.
... \exEn I’m going to Antonio’s house.
... \ex Ban yax ba’at?
... \exEn Where are you going?
... \exFr Ou allez-vous?''')
>>> sfm[0].markers()
Counter({'ex': 2, 'exEn': 2, 'exFr': 1})
>>> sfm[0].get('exFr')
'Ou allez-vous?'
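The two supported features, multiline values and a custom entry separator, can be illustrated with a stand-alone sketch that uses only the standard library. It is not the clldutils implementation: the names `MARKER`, `split_markers` and `parse_sfm` are hypothetical, and the marker pattern is a simplifying assumption.

```python
import re

# Hypothetical pattern: a backslash, a marker name, then whitespace or end of line.
MARKER = re.compile(r'\\(?P<marker>\w+)(\s+|$)')


def split_markers(block):
    """Yield (marker, value) pairs from the text block of one entry."""
    marker, lines = None, []
    for line in block.split('\n'):
        line = line.strip()
        match = MARKER.match(line)
        if match:
            if marker is not None:
                yield marker, '\n'.join(lines).strip()
            marker = match.group('marker')
            lines = [line[match.end():]]
        elif marker is not None:
            # A line without a leading marker continues the previous value.
            lines.append(line)
    if marker is not None:
        yield marker, '\n'.join(lines).strip()


def parse_sfm(text, entry_sep='\n\n'):
    """Split text into entries on entry_sep, then each entry into pairs."""
    entries = []
    for block in text.split(entry_sep):
        pairs = list(split_markers(block))
        if pairs:
            entries.append(pairs)
    return entries


text = '\\lx house\n\\ge a building\nfor living in\n\n\\lx tree\n\\ge a plant'
entries = parse_sfm(text)
```

In this sketch, the second line of the `ge` value is folded into that value, and passing a different `entry_sep` (for example `'\n----\n'`) changes where one entry ends and the next begins.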
- class clldutils.sfm.Entry(iterable=(), /)
We store entries in SFM files as lists of (marker, value) pairs.
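A list of pairs, unlike a dict, keeps the order of fields and allows the same marker to occur more than once. The following is a minimal sketch of that idea and not the actual `Entry` class. `PairList` is a hypothetical name, and returning the first match from `get` is an assumption of this sketch.

```python
from collections import Counter


class PairList(list):
    """Hypothetical stand-in for an entry: a list of (marker, value) pairs."""

    def markers(self):
        # Count how often each marker occurs in this entry.
        return Counter(marker for marker, _ in self)

    def get(self, marker, default=None):
        # Return the first value stored under the marker, else the default.
        for key, value in self:
            if key == marker:
                return value
        return default


entry = PairList([
    ('ex', 'Ban yax ba’at?'),
    ('exEn', 'Where are you going?'),
    ('ex', 'Yax bo’on ta sna Antonio.'),
])
first_example = entry.get('ex')
counts = entry.markers()
```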
- class clldutils.sfm.SFM(iterable=(), /)
A list of Entries
Simple usage to normalize an SFM file:
>>> sfm = SFM.from_file(fname, marker_map={'lexeme': 'lx'})
>>> sfm.write(fname)
- classmethod from_string(text, marker_map=None, entry_impl=<class 'clldutils.sfm.Entry'>, **kw)
Initialize an SFM object from an SFM-formatted string.
- Parameters:
text (str)
marker_map (typing.Optional[typing.Dict[str, str]])
- read(filename, encoding='utf-8', marker_map=None, entry_impl=<class 'clldutils.sfm.Entry'>, entry_sep='\\n\\n', entry_prefix=None, keep_empty=False)
Extend the entry list by parsing new entries from a file.
- Parameters:
filename
encoding
marker_map (typing.Optional[typing.Dict[str, str]]) – A dict used to map marker names.
entry_impl – Subclass of Entry or None
entry_sep (str)
entry_prefix (typing.Optional[str])
keep_empty (bool)
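To make the role of `marker_map` concrete, here is a small sketch of renaming markers in a list of (marker, value) pairs. The helper `rename_markers` is hypothetical and only illustrates the idea of mapping marker names. It makes no claim about how `read` applies the map internally.

```python
def rename_markers(pairs, marker_map=None):
    """Return the pairs with markers renamed per marker_map.

    Markers that are not keys of marker_map are left unchanged.
    """
    marker_map = marker_map or {}
    return [(marker_map.get(marker, marker), value) for marker, value in pairs]


pairs = [('lexeme', 'house'), ('ge', 'building')]
normalized = rename_markers(pairs, {'lexeme': 'lx'})
```

Here `normalized` holds the same values under the markers `lx` and `ge`, while `pairs` itself is not modified.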