pylablib.core.fileio package¶
Submodules¶
pylablib.core.fileio.binio module¶
Binary files input/output
-
pylablib.core.fileio.binio.write_num(x, f, dtype)[source]¶ Write a number x into file f.
dtype is the textual representation of data type (numpy-style).
-
pylablib.core.fileio.binio.write_str(s, f, dtype, strict=False)[source]¶ Write a string s into a file f.
dtype is the textual representation of the data type. It can be:
- "s": simply translate the string into bytes and write it;
- "sp"+sdtype (e.g., "sp<u2"): the string is prepended by its length, written using the sdtype format;
- "s"+len (e.g., "s16"): len is the textual representation of the string length (the written data is equivalent to the "s" format).
If strict==True, raise an error if the string length is incompatible with the format (too long for a given "sp"-type prefix, or doesn’t agree with the "s"-type length).
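As a rough illustration of the length-prefixed layout described above, the sketch below writes and reads such a string using only the standard library. It does not call pylablib: the helper names `write_prefixed_str` and `read_prefixed_str` are hypothetical, and it assumes that a `"<u2"` prefix is a little-endian unsigned 16-bit integer holding the length in bytes, and that strings are UTF-8 encoded.

```python
import io
import struct

def write_prefixed_str(s, f):
    """Hypothetical helper: write s preceded by its byte length as a little-endian uint16."""
    data = s.encode("utf-8")  # assumed encoding
    if len(data) > 0xFFFF:
        # analogous to the strict check: the length does not fit the 2-byte prefix
        raise ValueError("string too long for a 2-byte length prefix")
    f.write(struct.pack("<H", len(data)))
    f.write(data)

def read_prefixed_str(f):
    """Hypothetical helper: read the uint16 length, then that many bytes."""
    (n,) = struct.unpack("<H", f.read(2))
    return f.read(n).decode("utf-8")

buf = io.BytesIO()
write_prefixed_str("hello", buf)
raw = buf.getvalue()          # 2 prefix bytes followed by the 5 payload bytes
buf.seek(0)
restored = read_prefixed_str(buf)
```

The reader needs no out-of-band knowledge of the string length, which is what distinguishes the prefixed form from the fixed-length form.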
-
pylablib.core.fileio.binio.write_pickle(v, f, dtype)[source]¶ Write a value v into file f as a Python pickle object.
dtype is the textual representation of the data type (numpy-style), and should be
"pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol, and sdtype is the data type of the prepended string length (see the "sp"+sdtype type in write_str()).
-
pylablib.core.fileio.binio.write_val(v, f, dtype)[source]¶ Write an arbitrary value v into file f using the supplied dtype.
Storage type depends on dtype: it can be a string (see
write_str()), a number (see write_num()), or a pickled value (see write_pickle()). In addition, dtype can be a tuple of dtypes with length equal to the length of v, in which case the values in v are written sequentially.
-
pylablib.core.fileio.binio.read_num(f, dtype)[source]¶ Read a number from file f.
dtype is the textual representation of data type (numpy-style).
-
pylablib.core.fileio.binio.read_str(f, dtype)[source]¶ Read a string from file f.
dtype is the textual representation of the data type. It can be
"sp"+sdtype (e.g., "sp<u2"), where the string is prepended by its length written using the sdtype format, or "s"+len (e.g., "s16"), where len is the textual representation of the string length (i.e., read len bytes and translate the result into a string).
-
pylablib.core.fileio.binio.read_pickle(f, dtype)[source]¶ Read a value from file f as a Python pickle object.
dtype is the textual representation of the data type (numpy-style), and should be
"pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol (ignored, added for compatibility with write_pickle()), and sdtype is the data type of the prepended string length (see the "sp"+sdtype type in write_str()).
-
pylablib.core.fileio.binio.read_val(f, dtype)[source]¶ Read an arbitrary value from file f using the supplied dtype.
Storage type depends on dtype: it can be a string (see
read_str()), a number (see read_num()), or a pickled value (see read_pickle()). In addition, dtype can be a tuple of dtypes, in which case the corresponding values are read sequentially.
-
pylablib.core.fileio.binio.size_prepend(f, dtype, added=0)[source]¶ Context manager that prepends the data written inside the block with its size (after the writing is done).
dtype specifies size format; added is added to the size when saving. Note that this method requires back-seeking, so it doesn’t work if the file is opened in append (
"a") mode; use "w" or "r+" mode instead.
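The back-seeking requirement is easier to see in a small stand-alone sketch. The context manager below is not pylablib's implementation; `size_prepend_sketch` is a hypothetical name, the size format is fixed to a little-endian 32-bit integer by default, and it assumes the stored size counts only the bytes written inside the block (plus `added`).

```python
import contextlib
import io
import struct

@contextlib.contextmanager
def size_prepend_sketch(f, fmt="<I", added=0):
    """Hypothetical stand-in: reserve room for the size, then back-fill it after the block."""
    start = f.tell()
    f.write(struct.pack(fmt, 0))      # placeholder, overwritten once the size is known
    data_start = f.tell()
    yield f
    end = f.tell()
    f.seek(start)                     # back-seek; in append mode writes always go to the end
    f.write(struct.pack(fmt, end - data_start + added))
    f.seek(end)                       # restore the position for subsequent writes

buf = io.BytesIO()
with size_prepend_sketch(buf) as f:
    f.write(b"abcdef")
result = buf.getvalue()
```

Because the placeholder has to be overwritten in place, a file opened in append mode cannot support this pattern.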
pylablib.core.fileio.datafile module¶
-
class
pylablib.core.fileio.datafile.DataFile(data, filepath=None, filetype=None, creation_time=None, comments=None, props=None)[source]¶ Bases:
object
Describes a single datafile.
Parameters: - data – The main data of the file (usually DataTable or Dictionary).
- filepath (str) – absolute path
- filetype (str) – a source type
- creation_time (datetime.datetime) – File creation time
- props (dict) – all the metainfo about the file (extracted from comments, filename etc.)
- comments (list) – all the comments excluding the ones containing props
pylablib.core.fileio.dict_entry module¶
Classes for dealing with the Dictionary entries with special conversion rules when saved or loaded.
Used to redefine how certain objects (e.g., tables) are written into files and read from files.
-
pylablib.core.fileio.dict_entry.special_load_rules(branch)[source]¶ Detect if the branch requires special conversion rules.
-
class
pylablib.core.fileio.dict_entry.InlineTable(table)[source]¶ Bases:
object
Simple marker class denoting that the wrapped numpy 2D array should be written inline.
-
class
pylablib.core.fileio.dict_entry.IDictionaryEntry(data=None)[source]¶ Bases:
object
A generic Dictionary entry.
-
to_dict(dict_ptr, loc)[source]¶ Convert data to a dictionary branch on saving.
Virtual method, to be defined in subclasses.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
-
classmethod
build_entry(data, **kwargs)[source]¶ Create a DictionaryEntry object based on the supplied data and arguments.
Parameters: data – Data to be saved.
-
classmethod
from_dict(dict_ptr, loc, **kwargs)[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
-
pylablib.core.fileio.dict_entry.add_entry_class(cls, branch_predicate, object_predicate=None)[source]¶ Add an entry class which is automatically used in the
build_entry() and from_dict() functions to delegate work for converting objects and dictionary branches into dictionary entries.
Parameters: - cls – the IDictionaryEntry subclass whose build_entry() and from_dict() methods will be used
- branch_predicate – predicate used to determine whether the specified subclass is used in the from_dict() function; it is a function which takes a single argument (dictionary branch) and returns True if the conversion should be delegated to the subclass; it can also be a string or a tuple of strings, in which case it is interpreted as passing branches with a given "__data_type__" value.
- object_predicate – predicate used to determine whether the specified subclass is used in the build_entry() function; it is a function which takes a single argument (an object to be converted) and returns True if the conversion should be delegated to the subclass; it can also be a type or a tuple of types, in which case it is interpreted as passing objects of the given type; it can also be None, which means that the predicate always returns False (i.e., these dictionary entries aren’t created automatically on dictionary saving, but need to be created manually).
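The different forms that the two predicates accept can be pictured as being normalized into plain callables. The sketch below is only an illustration of that idea and is not taken from pylablib: `make_branch_predicate` and `make_object_predicate` are hypothetical helpers, a branch is modeled as a plain `dict`, and the key name `"__data_type__"` is an assumption.

```python
def make_object_predicate(pred):
    """Hypothetical: turn None, a type, a tuple of types, or a callable into a predicate."""
    if pred is None:
        return lambda obj: False      # never matches: entries must be created manually
    is_type_tuple = isinstance(pred, tuple) and all(isinstance(t, type) for t in pred)
    if isinstance(pred, type) or is_type_tuple:
        return lambda obj: isinstance(obj, pred)
    return pred                       # already a callable predicate

def make_branch_predicate(pred):
    """Hypothetical: turn a string, a tuple of strings, or a callable into a predicate."""
    if isinstance(pred, str):
        pred = (pred,)
    if isinstance(pred, tuple):
        # a branch is modeled as a dict; the key name is an assumption
        return lambda branch: branch.get("__data_type__") in pred
    return pred

is_sequence = make_object_predicate((list, tuple))
is_table_branch = make_branch_predicate("table")
```

Types are checked before falling back to "already a callable" because classes are themselves callable.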
-
pylablib.core.fileio.dict_entry.build_entry(data, **kwargs)[source]¶ Create a DictionaryEntry object based on the supplied data and arguments.
Parameters: - data – Data to be saved.
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
-
pylablib.core.fileio.dict_entry.from_dict(dict_ptr, loc, **kwargs)[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
class
pylablib.core.fileio.dict_entry.ITableDictionaryEntry(data=None, columns=None)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
A generic table Dictionary entry.
Parameters: - data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
-
classmethod
build_entry(data, table_format='inline', **kwargs)[source]¶ Create a DictionaryEntry object based on the supplied data and arguments.
Parameters: - data – Data to be saved.
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
- table_format (str) – Method of saving the table. Can be either 'inline' (table is saved directly in the dictionary file), 'csv' (table is saved in an external CSV file) or 'bin' (table is saved in an external binary file).
-
classmethod
from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class
pylablib.core.fileio.dict_entry.InlineTableDictionaryEntry(data=None, columns=None)[source]¶ Bases:
pylablib.core.fileio.dict_entry.ITableDictionaryEntry
An inlined table Dictionary entry.
Parameters: - data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
-
to_dict(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and write the table to the file.
-
classmethod
from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶ Build an
InlineTableDictionaryEntry object from the dictionary and read the inlined data.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class
pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry(data, file_format, name, columns, force_name=True, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.ITableDictionaryEntry
-
classmethod
from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶ Convert a dictionary branch to a specific DictionaryEntry object.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class
pylablib.core.fileio.dict_entry.ExternalTextTableDictionaryEntry(data=None, file_format='csv', name='', columns=None, force_name=True, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external text table Dictionary entry.
Parameters: - data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the table to an external file.
-
classmethod
from_dict(dict_ptr, loc, out_type='table')[source]¶ Build an
ExternalTextTableDictionaryEntry object from the dictionary and load the external data.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class
pylablib.core.fileio.dict_entry.ExternalBinTableDictionaryEntry(data=None, file_format='bin', name='', columns=None, force_name=True, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external binary table Dictionary entry.
Parameters: - data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the table to an external file.
-
classmethod
from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶ Build an
ExternalBinTableDictionaryEntry object from the dictionary and load the external data.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class
pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry(data, name='', force_name=True, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
Generic dictionary entry for data in an external file.
Parameters: -
file_format= None¶
-
static
add_file_format(subclass)[source]¶ Register an
IExternalFileDictionaryEntry as a possible stored file format.
Used to automatically invoke the correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.
-
to_dict(dict_ptr, loc)[source]¶ Convert the data to a dictionary branch and save the data to an external file.
-
classmethod
from_dict(dict_ptr, loc, **kwargs)[source]¶ Build an
IExternalFileDictionaryEntry object from the dictionary and load the external data.
Parameters: - dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
get_preamble()[source]¶ Generate the preamble (a dictionary with supplementary data that allows the data to be loaded from the file).
-
-
class
pylablib.core.fileio.dict_entry.ExternalNumpyDictionaryEntry(data, name='', force_name=True, dtype=None, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry
A dictionary entry which stores the numpy array data into an external file in binary format.
Parameters: - data – Numpy array data.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
- dtype – numpy dtype to load/save the data (by default, dtype of the supplied data).
-
file_format= 'numpy'¶
-
class
pylablib.core.fileio.dict_entry.ExpandedContainerDictionaryEntry(data, **kwargs)[source]¶ Bases:
pylablib.core.fileio.dict_entry.IDictionaryEntry
A dictionary entry which expands containers (lists, tuples, dictionaries) into subdictionaries.
Useful when the data in the containers is complex, so writing it into one line (as is default for lists and tuples) wouldn’t work.
Parameters: data – Container data.
-
classmethod
from_dict(dict_ptr, loc, **kwargs)[source]¶ Build an
ExpandedContainerDictionaryEntry object from the dictionary.
pylablib.core.fileio.loadfile module¶
Utilities for reading data files.
-
class
pylablib.core.fileio.loadfile.IInputFileFormat[source]¶ Bases:
object
Generic class for an input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
-
class
pylablib.core.fileio.loadfile.ITextInputFileFormat[source]¶ Bases:
pylablib.core.fileio.loadfile.IInputFileFormat
Generic class for a text input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
-
class
pylablib.core.fileio.loadfile.CSVTableInputFileFormat[source]¶ Bases:
pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for CSV input file format.
-
static
read_file(location_file, out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0, **kwargs)[source]¶ Read CSV file.
See parse_csv.load_table() for more description.
Parameters: - location_file – Location of the data.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default)
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- columns – either the number of columns, or a list of column names.
- delimiters (str) – Regex string which recognizes entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines (bool) – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- skip_lines (int) – Number of lines to skip from the beginning of the file.
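To make the default delimiter regex concrete, the snippet below applies it with the standard `re` module to two sample lines. This shows only how the pattern itself splits text; how pylablib post-processes the entries (type conversion, handling of empty entries) is not reproduced here, and the sample lines are made up.

```python
import re

# Default delimiter pattern quoted in the parameter description above
delimiters = re.compile(r"\s*,\s*|\s+")

mixed_line = "1.0, 2.5\t3  ,4"
entries = delimiters.split(mixed_line.strip())

# Two adjacent commas yield an empty string between them; this is the kind of
# empty entry that empty_entry_substitute is described as replacing.
sparse_line = "1,,3"
sparse_entries = delimiters.split(sparse_line)
```

The comma alternative is listed first in the pattern, so whitespace surrounding a comma is consumed together with it rather than producing extra empty entries.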
-
class
pylablib.core.fileio.loadfile.DictionaryInputFileFormat[source]¶ Bases:
pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for Dictionary input file format.
-
static
read_file(location_file, case_normalization=None, inline_dtype='generic', entry_format='value', skip_lines=0, **kwargs)[source]¶ Read Dictionary file.
Parameters: - location_file – Location of the data.
- case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
- inline_dtype (str) – dtype for inlined tables.
- entry_format (str) – Determines the way of dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave the dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
- skip_lines (int) – Number of lines to skip from the beginning of the file.
-
class
pylablib.core.fileio.loadfile.BinaryTableInputFileFormatter[source]¶ Bases:
pylablib.core.fileio.loadfile.IInputFileFormat
Class for binary input file format.
-
static
read_file(location_file, out_type='default', dtype='>f8', columns=None, packing='flatten', preamble=None, skip_bytes=0, **kwargs)[source]¶ Read binary file.
Parameters: - location_file – Location of the data.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default)
- dtype – numpy.dtype describing the data.
- columns – either the number of columns, or a list of column names.
- packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
- preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
- skip_bytes (int) – Number of bytes to skip from the beginning of the file.
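The difference between the two packing modes can be sketched with the standard `struct` module. The code below is an illustration under stated assumptions rather than pylablib's reader: `unpack_table` is a hypothetical helper, and the default dtype '>f8' is taken to mean a big-endian 8-byte float, which corresponds to the `struct` format `">d"`.

```python
import struct

rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # 2 rows, 3 columns
nrows, ncols = len(rows), len(rows[0])

# 'flatten': values stored row by row; 'transposed': values stored column by column
flat = [v for row in rows for v in row]
transposed = [rows[r][c] for c in range(ncols) for r in range(nrows)]

flat_bytes = struct.pack(">%dd" % len(flat), *flat)
transposed_bytes = struct.pack(">%dd" % len(transposed), *transposed)

def unpack_table(raw, ncols, packing="flatten"):
    """Hypothetical reader: rebuild a list of rows from big-endian float64 bytes."""
    values = struct.unpack(">%dd" % (len(raw) // 8), raw)
    nrows = len(values) // ncols
    if packing == "flatten":
        return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]
    # transposed: column c occupies values[c*nrows : (c+1)*nrows]
    return [[values[c * nrows + r] for c in range(ncols)] for r in range(nrows)]

from_flat = unpack_table(flat_bytes, ncols)
from_transposed = unpack_table(transposed_bytes, ncols, packing="transposed")
```

Both byte strings decode to the same table; only the order of the values on disk differs, which is why the number of columns has to be known when reading.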
-
pylablib.core.fileio.loadfile.load(path=None, input_format=None, loc='file', return_file=False, **kwargs)[source]¶ Load data from the file.
Parameters: **kwargs are passed to the file formatter used to read the data (see CSVTableInputFileFormat.read_file(), DictionaryInputFileFormat.read_file() and BinaryTableInputFileFormatter.read_file() for the possible arguments).
The default format names are:
- 'generic': Generic file format. Attempt to autodetect, raise IOError if unsuccessful;
- 'txt': Generic text file. Attempt to autodetect, raise IOError if unsuccessful;
- 'csv': CSV file, corresponds to CSVTableInputFileFormat;
- 'dict': Dictionary file, corresponds to DictionaryInputFileFormat;
- 'bin': Binary file, corresponds to BinaryTableInputFileFormatter.
pylablib.core.fileio.location module¶
Classes for describing a generic file location.
-
class
pylablib.core.fileio.location.LocationName(path=None, ext=None)[source]¶ Bases:
object
File name inside a location.
Parameters: - path – Path inside the location. Gets normalized according to the Dictionary rules (not case-sensitive; '/' and '\' are the delimiters).
- ext (str) – Name extension (None is default).
-
get_path(default_path='', sep='/')[source]¶ Get the string path.
If the object’s path is None, use default_path instead. If sep is not None, use it to join the path entries; otherwise, return the path in a list form.
-
get_ext(default_ext='')[source]¶ Get the extension.
If the object’s ext is None, use default_ext instead.
-
to_string(default_path='', default_ext='', path_sep='/', ext_sep='|', add_empty_ext=True)[source]¶ Convert the path to a string representation.
Parameters: - default_path (str) – Use it as path if the object’s path is None.
- default_ext (str) – Use it as extension if the object’s ext is None.
- path_sep (str) – Use it to join the path entries.
- ext_sep (str) – Use it to join path and extension.
- add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep in the end.
-
to_path(default_path='', default_ext='', ext_sep='|', add_empty_ext=True)[source]¶ Convert the path to a list representation.
Extension is added with ext_sep to the last entry in the path.
Parameters:
-
static
from_string(expr, ext_sep='|')[source]¶ Create a
LocationName object from a string representation.
ext_sep defines the extension separator; the path separators are '/' and '\'. An empty path or extension translates into None.
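The string representation described above (path entries joined by a path separator, followed by the extension separator and the extension) can be sketched with two small functions. These are hypothetical helpers and not the LocationName implementation; in particular, the case normalization mentioned in the class description is left out.

```python
import re

def split_location_name(expr, ext_sep="|"):
    """Hypothetical parser: return (path_entries, ext); empty parts become None."""
    path, _, ext = expr.partition(ext_sep)
    entries = [p for p in re.split(r"[/\\]", path) if p]   # both '/' and '\' delimit
    return (entries or None), (ext or None)

def join_location_name(entries, ext, path_sep="/", ext_sep="|", add_empty_ext=True):
    """Hypothetical formatter mirroring the to_string() arguments described above."""
    s = path_sep.join(entries or [])
    if ext or add_empty_ext:
        s += ext_sep + (ext or "")
    return s

parsed = split_location_name("a/b\\c|dat")
empty = split_location_name("")
joined = join_location_name(["a", "b"], "dat")
no_ext = join_location_name(["a"], None, add_empty_ext=False)
```

With `add_empty_ext=True` and no extension, the separator is still appended, which keeps the representation unambiguous when parsed back.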
-
static
from_object(obj)[source]¶ Create a
LocationName object from an object.
obj can be a LocationName (return unchanged), tuple or list (use as constructor arguments), string (treat as a string representation) or None (return empty name).
-
class
pylablib.core.fileio.location.LocationFile(loc, name=None)[source]¶ Bases:
object
A file at a location.
Can be opened for reading or writing.
Parameters: - loc – File location.
- name – File’s name inside the location.
-
class
pylablib.core.fileio.location.IDataLocation[source]¶ Bases:
object
Generic location.
-
generate_new_name(prefix_name, idx=0)[source]¶ Generate a new name inside the location using the given prefix and starting index.
If idx is
None, check just the prefix_name first before starting to append indices.
-
-
class
pylablib.core.fileio.location.IFileSystemDataLocation[source]¶ Bases:
pylablib.core.fileio.location.IDataLocation
A generic filesystem data location.
A single file name describes a single file in the filesystem.
-
get_filesystem_path(name=None, path_type='absolute')[source]¶ Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
open_file(mode, name=None, data_type='text')[source]¶ Open a location file.
See IDataLocation.open_file() for arguments.
-
-
class
pylablib.core.fileio.location.SingleFileSystemDataLocation(file_path)[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single file.
Any use of a non-default name raises ValueError.
Parameters: file_path (str) – The path to the file.
-
class
pylablib.core.fileio.location.PrefixedFileSystemDataLocation(file_path, prefix_template='{0}_{1}')[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a set of prefixed files.
Parameters: - file_path (str) – A master path. Its name is used as a prefix, and its extension is used as a default.
- prefix_template (str) – A str.format() string for generating prefixed files. Has two arguments: the first is the master name, the second is the sub_location.
Multi-level paths translate into nested folders (the top level folder is combined from the file_path prefix and the first path entry).
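The role of prefix_template can be shown with a plain `str.format()` call. The helper below is a hypothetical sketch, not the library's code: it handles only a single-level sub-location name and assumes the master file's extension is reused for the prefixed file.

```python
import os

def prefixed_path(file_path, sub_location, prefix_template="{0}_{1}"):
    """Hypothetical helper: build the path of a prefixed file next to the master path."""
    folder, filename = os.path.split(file_path)
    master_name, ext = os.path.splitext(filename)
    # {0} is the master name, {1} is the sub-location, as described above
    return os.path.join(folder, prefix_template.format(master_name, sub_location) + ext)

table_path = prefixed_path("data.dat", "table")
custom_path = prefixed_path("data.dat", "table", prefix_template="{1}-of-{0}")
```

Multi-level sub-locations, which the text above says map to nested folders, are deliberately not modeled here.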
-
class
pylablib.core.fileio.location.FolderFileSystemDataLocation(folder_path, default_name='', default_ext='')[source]¶ Bases:
pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single folder.
Parameters:
Multi-level paths translate into nested subfolders.
-
pylablib.core.fileio.location.get_location(loc, path, *args, **kwargs)[source]¶ Build a location.
If loc or path are instances of IDataLocation, return them unchanged. If loc is a string, it describes the location kind:
- 'single_file': SingleFileSystemDataLocation with the given path.
- 'file' or 'prefixed_file': PrefixedFileSystemDataLocation with the given path as a master path.
- 'folder': FolderFileSystemDataLocation with the given folder path.
Any additional arguments are relayed to the constructors.
pylablib.core.fileio.logfile module¶
-
class
pylablib.core.fileio.logfile.LogFile(path, default_fmt=None)[source]¶ Bases:
object
Expanding file.
Parameters: - path (str) – Path to the destination file.
- default_fmt (list) – If not None, it’s the default value of fmt for the write_dataline() method.
-
write_lines(lines, header='', add_timestamp=True)[source]¶ Write a single line into the file.
Create the file if it doesn’t exist.
Parameters:
-
write_dataline(data, columns=None, fmt=None, add_timestamp=True)[source]¶ Write a single data line into the file.
Create the file if it doesn’t exist.
Parameters: - data (list or numpy.ndarray) – Data row to be added.
- columns (list) – If not None, it’s a list of column names to be added as a header on creation.
- fmt (str) – If not None, it’s a list of format strings for the line entries.
- add_timestamp (bool) – If True, add the UNIX timestamp in the beginning of the line.
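The behaviour described for write_dataline() (create the file on first use, optionally write a header, then append one formatted row) can be approximated as follows. This is a hypothetical re-implementation for illustration only: the tab delimiter, the "# " header prefix, the "Timestamp" column label and the use of `str.format()`-style format strings are all assumptions, not documented pylablib behaviour.

```python
import os
import time

def write_dataline_sketch(path, data, columns=None, fmt=None, add_timestamp=True):
    """Hypothetical sketch: append one data row, writing a header if the file is new."""
    new_file = not os.path.exists(path)
    fmt = fmt or ["{}"] * len(data)               # assumed str.format-style entries
    entries = [f.format(v) for f, v in zip(fmt, data)]
    if add_timestamp:
        entries.insert(0, "{:.3f}".format(time.time()))   # UNIX timestamp first
    with open(path, "a") as f:                    # append mode creates the file if needed
        if new_file and columns is not None:
            header = (["Timestamp"] if add_timestamp else []) + list(columns)
            f.write("# " + "\t".join(header) + "\n")
        f.write("\t".join(entries) + "\n")
```

Calling the function repeatedly with the same path grows the file by one line per call, with the header written only once.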
pylablib.core.fileio.parse_csv module¶
Utilities for parsing CSV files.
-
pylablib.core.fileio.parse_csv.read_table_and_comments(f, delimiters=re.compile('\\s*, \\s*|\\s+'), empty_entry_substitute=None, stop_comment=None, chunk_size=None, as_text=True, simple_entries=True)[source]¶ Load data table (in text format) and comments from the opened file f (must be open as binary).
Comment lines are the ones starting with #.
Parameters: - delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- stop_comment (str) – Regex string for the stopping comment. If not None, the function will stop if a comment satisfying the stop_comment regex is encountered.
- chunk_size (int) – Maximal size (number of lines) of the data to read.
- as_text (bool) – If True, return entries as strings; otherwise, convert them into values.
- simple_entries (bool) – If True, assume that there are no escaped strings or parenthesis structures in the files, so the line splitting routine is simplified.
Returns: (data, comments, finished), where data is a 2D list of table entries (already recognized unless as_text==True) and comments is a list of strings. Data lines may have different lengths. finished indicates if the file has been read through to the end (it’s True unless chunk_size is not None).
Return type: tuple
-
class
pylablib.core.fileio.parse_csv.ChunksAccumulator(dtype='numeric', ignore_corrupted_lines=True)[source]¶ Bases:
object
Class for accumulating data chunks into a single array.
Parameters: - dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
-
pylablib.core.fileio.parse_csv.load_columns(f, dtype, delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)[source]¶ Load columns from the file stream f.
Parameters: - dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespace).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- stop_comment (str) – Regex string for the stopping comment. If not None, the function will stop if a comment satisfying the stop_comment regex is encountered.
Returns: (columns, comments, corrupted_lines).
columns is a list of columns with data.
comments is a list of comment strings.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using the provided dtype).
Return type: tuple
-
pylablib.core.fileio.parse_csv.columns_to_table(data, columns=None, out_type='table')[source]¶ Convert data (columns list) into a table.
Parameters:
-
pylablib.core.fileio.parse_csv.load_table(f, dtype='numeric', columns=None, out_type='table', delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)[source]¶ Load table from the file stream f.
Arguments are the same as in load_columns() and columns_to_table().
Returns: (table, comments, corrupted_lines).
table is a table of the format out_type.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using the provided dtype).
comments is a list of comment strings.
Return type: tuple
pylablib.core.fileio.savefile module¶
Utilities for writing data files.
-
class
pylablib.core.fileio.savefile.IOutputFileFormat(format_name, default_kwargs=None)[source]¶ Bases:
object
Generic class for an output file format.
Parameters:
-
class
pylablib.core.fileio.savefile.ITextOutputFileFormat(format_name, save_props=True, save_comments=True, save_time=True, new_time=True, default_kwargs=None)[source]¶ Bases:
pylablib.core.fileio.savefile.IOutputFileFormat
Generic class for a text output file format.
Parameters: - format_name (str) – The name of the format (to be defined in subclasses).
- save_props (bool) – If True and saving a DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- new_time (bool) – If saving a DataFile object, determines if the time should be updated to the current time.
- default_kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
class
pylablib.core.fileio.savefile.CSVTableOutputFileFormat(save_props=True, save_comments=True, save_time=True, save_columns=True, columns_delimiter='t', custom_reps=None, use_rep_classes=False, **kwargs)[source]¶ Bases:
pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for CSV output file format.
Parameters: - save_props (bool) – If True and saving a DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- columns_delimiter (str) – Used to separate entries in a row.
- custom_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
write_data(location_file, data, columns=None, **kwargs)[source]¶ Write data to a CSV file.
Parameters:
- location_file – Location of the destination.
- data – Data to be saved. Can be a DataTable or an arbitrary 2D array (numpy array, 2D list, etc.).
- columns ([str]) – If not None, the list of column names. If None and data is of type DataTable, use its column names. If None and data is of another type, don’t put the column line in the output.
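The column-line rule described for write_data can be illustrated with a minimal standard-library sketch. This is not pylablib's implementation: write_table is a hypothetical helper, and it only mirrors two points from the description, namely that entries in a row are joined by a delimiter and that no column line is written when columns is None and the data is a plain 2D list.

```python
import io

def write_table(f, data, columns=None, delimiter="\t"):
    """Hypothetical helper: write 2D data row by row as delimited text.

    A header line is emitted only when column names are supplied.
    """
    if columns is not None:
        f.write(delimiter.join(columns) + "\n")
    for row in data:
        # each entry is converted with str(); pylablib uses its own converter
        f.write(delimiter.join(str(v) for v in row) + "\n")

# with column names: one header line followed by the rows
buf = io.StringIO()
write_table(buf, [[1, 2.5], [3, 4.5]], columns=["x", "y"])
csv_text = buf.getvalue()

# without column names: rows only
buf_plain = io.StringIO()
write_table(buf_plain, [[1, 2.5], [3, 4.5]])
plain_text = buf_plain.getvalue()
```

The same data written without columns simply omits the first line, which is the behavior the columns parameter describes for non-DataTable input.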
-
class
pylablib.core.fileio.savefile.DictionaryOutputFileFormat(save_props=True, save_comments=True, save_time=True, table_format='inline', inline_columns_delimiter='\t', inline_reps=None, param_reps=None, use_rep_classes=False, **kwargs)[source]¶ Bases:
pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for Dictionary output file format.
Parameters:
- save_props (bool) – If True and saving a datafile.DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a datafile.DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time at the end.
- table_format (str) – Default format for table (numpy arrays or DataTable objects) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
- inline_columns_delimiter (str) – Used to separate entries in a row for inline tables.
- inline_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function when writing inline tables.
- param_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function when writing Dictionary entries.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
class
pylablib.core.fileio.savefile.IBinaryOutputFileFormat(format_name, default_kwargs=None)[source]¶
-
class
pylablib.core.fileio.savefile.TableBinaryOutputFileFormat(dtype=None, transposed=False, **kwargs)[source]¶ Bases:
pylablib.core.fileio.savefile.IBinaryOutputFileFormat
Class for binary output file format.
Parameters:
- dtype – numpy.dtype describing the data. By default, '>f8' for real data and '>c16' for complex data.
- transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
get_preamble(loc_file, data)[source]¶ Generate a preamble (dictionary describing the file format).
The parameters are 'dtype', 'packing' ('transposed' or 'flatten', depending on the transposed attribute), 'ncol' (number of columns) and 'nrows' (number of rows).
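To make the row-wise versus column-wise distinction concrete, here is a small sketch that uses only the standard struct module. It is not the library's code: pack_table is a hypothetical helper, the mapping of transposed=False to 'flatten' is an assumption read off the description above, and the preamble keys are copied from that description as written.

```python
import struct

def pack_table(rows, transposed=False):
    """Hypothetical helper: pack a 2D list of floats as big-endian float64 ('>f8')."""
    nrows, ncols = len(rows), len(rows[0])
    if transposed:
        # column-wise: all of column 0, then column 1, and so on
        flat = [rows[r][c] for c in range(ncols) for r in range(nrows)]
    else:
        # row-wise: all of row 0, then row 1, and so on
        flat = [v for row in rows for v in row]
    payload = struct.pack(">%dd" % len(flat), *flat)
    # dictionary describing the layout, modeled on the documented preamble
    preamble = {
        "dtype": ">f8",
        "packing": "transposed" if transposed else "flatten",  # assumed mapping
        "ncol": ncols,
        "nrows": nrows,
    }
    return preamble, payload

table = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
pre_rows, raw_rows = pack_table(table)
pre_cols, raw_cols = pack_table(table, transposed=True)
```

Both payloads have the same size (8 bytes per value); only the order of the values differs, which is why a reader needs the 'packing' entry together with the row and column counts to restore the table shape.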
-
pylablib.core.fileio.savefile.save(data, path='', output_format=None, loc='file', **kwargs)[source]¶ Save data to a file.
Parameters:
- data – Data to be saved.
- path (str) – Path to the file.
- output_format (str) – Output file format. Can be either None (defaults to 'csv' for table data and 'dict' for Dictionary data), a string with one of the default format names, or an already prepared IOutputFileFormat object.
- loc (str) – Location type.
**kwargs are passed to the file formatter constructor (see CSVTableOutputFileFormat, DictionaryOutputFileFormat and TableBinaryOutputFileFormat for the possible arguments).
The default format names are:
- 'csv': CSV file, corresponds to CSVTableOutputFileFormat;
- 'dict': Dictionary file, corresponds to DictionaryOutputFileFormat;
- 'bin': Binary file, corresponds to TableBinaryOutputFileFormat.
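The format selection described for save can be sketched as a small lookup. This is only an illustration under stated assumptions: resolve_format_name and FORMAT_CLASSES are hypothetical names, a plain Python dict stands in for pylablib's Dictionary data, and the case of an already prepared IOutputFileFormat object is left out.

```python
# format name -> name of the formatter class listed in the documentation
FORMAT_CLASSES = {
    "csv": "CSVTableOutputFileFormat",
    "dict": "DictionaryOutputFileFormat",
    "bin": "TableBinaryOutputFileFormat",
}

def resolve_format_name(data, output_format=None):
    """Hypothetical helper: pick a format name and its documented class name."""
    if output_format is None:
        # assumption: dict stands in for Dictionary data, anything else is a table
        output_format = "dict" if isinstance(data, dict) else "csv"
    if output_format not in FORMAT_CLASSES:
        raise ValueError("unknown output format: {!r}".format(output_format))
    return output_format, FORMAT_CLASSES[output_format]

table_choice = resolve_format_name([[1, 2], [3, 4]])
dict_choice = resolve_format_name({"a": 1})
bin_choice = resolve_format_name([[1, 2], [3, 4]], output_format="bin")
```

An explicit output_format always wins over the default, so table data can still be routed to the binary format by name.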