pylablib.core.fileio package¶

Submodules¶

pylablib.core.fileio.binio module¶

Binary files input/output

pylablib.core.fileio.binio.write_num(x, f, dtype)[source]¶

Write a number x into file f.

dtype is the textual representation of data type (numpy-style).

pylablib.core.fileio.binio.write_str(s, f, dtype, strict=False)[source]¶

Write a string s into a file f.

dtype is the textual representation of data type. Can be "s" (simply translate into bytes and write), "sp"+sdtype (e.g., "sp<u2"), where the string is prepended by its length written using sdtype format, or "s"+len (e.g., "s16"), where len is the textual representation of string length (written data is equivalent to "s" format).

If strict==True, raise an error if the string length is incompatible with the format (too long for the given "sp"-type prefix, or different from the "s"-type length).
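The "sp"+sdtype layout described above can be illustrated with a minimal standard-library sketch. It does not call pylablib: the helper name `write_prefixed_str` is hypothetical, the "<u2" prefix is mapped by hand to the `struct` code "<H", and both the UTF-8 encoding and the truncation in non-strict mode are assumptions made only for this example.

```python
import io
import struct

def write_prefixed_str(s, f, strict=False):
    """Write s preceded by its byte length as a little-endian uint16 (like "sp<u2")."""
    data = s.encode("utf-8")  # assumption: UTF-8 encoding
    max_len = 0xFFFF  # largest length a 2-byte unsigned prefix can hold
    if len(data) > max_len:
        if strict:
            raise ValueError("string is too long for a 2-byte length prefix")
        data = data[:max_len]  # assumption: non-strict mode truncates
    f.write(struct.pack("<H", len(data)))
    f.write(data)

buf = io.BytesIO()
write_prefixed_str("abc", buf)
raw = buf.getvalue()  # 2 bytes of length followed by the 3 string bytes
```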

pylablib.core.fileio.binio.write_pickle(v, f, dtype)[source]¶

Write a value v into file f as a Python pickle object.

dtype is the textual representation of data type (numpy-style), and should be "pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol, and sdtype is the data type of the prepended string length (see "sp"+sdtype type in write_str()).

pylablib.core.fileio.binio.write_val(v, f, dtype)[source]¶

Write an arbitrary value v into file f using the supplied dtype.

Storage type depends on dtype: can be string (see write_str()), number (see write_num()), or pickled value (see write_pickle()). In addition, dtype can be a tuple of dtypes with length equal to the length of v, in which case the values in v are written sequentially.

pylablib.core.fileio.binio.read_num(f, dtype)[source]¶

Read a number from file f.

dtype is the textual representation of data type (numpy-style).

pylablib.core.fileio.binio.read_str(f, dtype)[source]¶

Read a string from file f.

dtype is the textual representation of data type. Can be "sp"+sdtype (e.g., "sp<u2"), where the string is prepended by its length written using sdtype format, or "s"+len (e.g., "s16"), where len is the textual representation of string length (i.e., read len bytes and translate the result into string).
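As a rough sketch of the two read modes described above (length-prefixed and fixed-length), the following standard-library code reads both kinds of strings from an in-memory stream. The helper names are hypothetical, "<u2" is mapped by hand to the `struct` code "<H", and the UTF-8 decoding is an assumption.

```python
import io
import struct

def read_prefixed_str(f):
    """Read a little-endian uint16 length, then that many bytes (like "sp<u2")."""
    (length,) = struct.unpack("<H", f.read(2))
    return f.read(length).decode("utf-8")  # assumption: UTF-8 encoding

def read_fixed_str(f, length):
    """Read exactly `length` bytes and decode them (like "s16" for length 16)."""
    return f.read(length).decode("utf-8")

stream = io.BytesIO(b"\x05\x00hello" + b"0123456789abcdef")
first = read_prefixed_str(stream)   # consumes the prefix and 5 bytes
second = read_fixed_str(stream, 16)  # consumes the next 16 bytes
```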

pylablib.core.fileio.binio.read_pickle(f, dtype)[source]¶

Read a value from file f as a Python pickle object.

dtype is the textual representation of data type (numpy-style), and should be "pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol (ignored, added for compatibility with write_pickle()), and sdtype is the data type of the prepended string length (see "sp"+sdtype type in write_str()).
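One way to picture the "pk"+proto+sdtype layout is a pickle payload stored behind a length prefix. The sketch below uses only the standard library; the helper names are hypothetical and the 2-byte little-endian prefix stands in for "<u2". It also shows why the protocol can be ignored on reading: `pickle.loads` detects the protocol from the data itself.

```python
import io
import pickle
import struct

def write_pickled(v, f, proto=3):
    """Pickle v and write it preceded by its length as a little-endian uint16."""
    payload = pickle.dumps(v, protocol=proto)
    f.write(struct.pack("<H", len(payload)))
    f.write(payload)

def read_pickled(f):
    """Read the length prefix, then unpickle that many bytes."""
    (length,) = struct.unpack("<H", f.read(2))
    return pickle.loads(f.read(length))  # protocol is detected automatically

buf = io.BytesIO()
write_pickled({"gain": 2.5, "channels": [0, 1]}, buf)
buf.seek(0)
restored = read_pickled(buf)
```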

pylablib.core.fileio.binio.read_val(f, dtype)[source]¶

Read an arbitrary value from file f using the supplied dtype.

Storage type depends on dtype: can be string (see read_str()), number (see read_num()), or pickled value (see read_pickle()). In addition, dtype can be a tuple of dtypes, in which case the values are read sequentially, one for each dtype in the tuple.

pylablib.core.fileio.binio.size_prepend(f, dtype, added=0)[source]¶

Context manager that prepends the data written inside the block with its size (after the writing is done).

dtype specifies size format; added is added to the size when saving. Note that this method requires back-seeking, so it doesn’t work if the file is opened in append ("a") mode; use "w" or "r+" mode instead.
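The back-seeking requirement is easier to see in a small sketch of the same idea: reserve room for the size, write the data, then seek back and fill the size in. This is not the pylablib implementation; the name `size_prepend_sketch` is hypothetical, the size format is given as a `struct` code rather than a numpy-style dtype, and counting only the data bytes (not the prefix itself) is an assumption.

```python
import contextlib
import io
import struct

@contextlib.contextmanager
def size_prepend_sketch(f, fmt="<I", added=0):
    start = f.tell()
    f.write(struct.pack(fmt, 0))  # placeholder for the size
    data_start = f.tell()
    yield
    end = f.tell()
    f.seek(start)  # back-seek: this is the step that append mode typically prevents
    f.write(struct.pack(fmt, end - data_start + added))
    f.seek(end)  # restore the position after the written data

buf = io.BytesIO()
with size_prepend_sketch(buf):
    buf.write(b"payload")
raw = buf.getvalue()
```

In append mode, writes typically go to the end of the file regardless of the seek position, so the placeholder could never be overwritten.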

pylablib.core.fileio.datafile module¶

class pylablib.core.fileio.datafile.DataFile(data, filepath=None, filetype=None, creation_time=None, comments=None, props=None)[source]¶

Bases: object

Describes a single datafile.

Parameters:

data – The main data of the file (usually `DataTable` or `Dictionary`).
filepath (str) – Absolute path to the file.
filetype (str) – Source type.
creation_time (datetime.datetime) – File creation time.
comments (list) – All the comments, excluding the ones containing props.
props (dict) – All the metainfo about the file (extracted from comments, filename, etc.).

addprop(name, value)[source]¶: Add a property to the dictionary

getprop(name, default=None)[source]¶: Get a property from the dictionary. Use default value if it’s not found

pylablib.core.fileio.dict_entry module¶

Classes for dealing with the Dictionary entries with special conversion rules when saved or loaded. Used to redefine how certain objects (e.g., tables) are written into files and read from files.

pylablib.core.fileio.dict_entry.special_load_rules(branch)[source]¶: Detect if the branch requires special conversion rules.

class pylablib.core.fileio.dict_entry.InlineTable(table)[source]¶

Bases: object

Simple marker class that denotes that the wrapped numpy 2D array should be written inline

class pylablib.core.fileio.dict_entry.IDictionaryEntry(data=None)[source]¶

Bases: object

A generic Dictionary entry.

set_data(data=None)[source]¶: Set internal data.

get_data()[source]¶: Get internal data.

to_dict(dict_ptr, loc)[source]¶

Convert data to a dictionary branch on saving.

Virtual method, to be defined in subclasses.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be saved.

classmethod build_entry(data, **kwargs)[source]¶

Create a DictionaryEntry object based on the supplied data and arguments.

Parameters:	data – Data to be saved.

classmethod from_dict(dict_ptr, loc, **kwargs)[source]¶

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded.

pylablib.core.fileio.dict_entry.add_entry_class(cls, branch_predicate, object_predicate=None)[source]¶

Add an entry class which is automatically used in the build_entry() and from_dict() functions to delegate work for converting objects and dictionary branches into dictionary entries.

Parameters:

cls – the IDictionaryEntry subclass whose build_entry() and from_dict() methods will be used
branch_predicate – predicate used to determine whether the specified subclass is used in the from_dict() function; it is a function which takes a single argument (dictionary branch) and returns True if the conversion should be delegated to the subclass; it can also be a string or a tuple of strings, in which case it is interpreted as passing branches with the given "__data_type__" value
object_predicate – predicate used to determine whether the specified subclass is used in the build_entry() function; it is a function which takes a single argument (an object to be converted) and returns True if the conversion should be delegated to the subclass; it can also be a type or a tuple of types, in which case it is interpreted as passing objects of the given types; it can also be None, which means that the predicate always returns False (i.e., these dictionary entries aren’t created automatically on dictionary saving, and need to be created manually)
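The predicate rules above can be sketched as a small registry. This is an illustration only, not the pylablib implementation: every name below is hypothetical, branches are modelled as plain dicts, and the "__data_type__" tag key is an assumption.

```python
_registry = []  # hypothetical registry of (cls, branch_test, object_test)

def _branch_test(predicate):
    """Turn a string or a tuple of strings into a test on the branch's type tag."""
    if isinstance(predicate, str):
        predicate = (predicate,)
    if isinstance(predicate, tuple):
        names = predicate
        # assumption: a branch is a plain dict tagged with a "__data_type__" key
        return lambda branch: branch.get("__data_type__") in names
    return predicate  # already a callable

def _object_test(predicate):
    """Turn None, a type, or a tuple of types into a test on the object."""
    if predicate is None:
        return lambda obj: False  # never selected automatically
    if isinstance(predicate, (type, tuple)):
        types = predicate
        return lambda obj: isinstance(obj, types)
    return predicate  # already a callable

def register_entry_class(cls, branch_predicate, object_predicate=None):
    _registry.append((cls, _branch_test(branch_predicate), _object_test(object_predicate)))

def class_for_branch(branch):
    return next((cls for cls, test, _ in _registry if test(branch)), None)

def class_for_object(obj):
    return next((cls for cls, _, test in _registry if test(obj)), None)

class ComplexEntry:  # hypothetical entry class, selected for complex numbers
    pass

class ManualEntry:  # hypothetical entry class, only ever created by hand
    pass

register_entry_class(ComplexEntry, "complex", complex)
register_entry_class(ManualEntry, ("manual", "legacy_manual"))
```

With `object_predicate=None`, `ManualEntry` is still recognized on loading but is never picked when saving an arbitrary object.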

pylablib.core.fileio.dict_entry.build_entry(data, **kwargs)[source]¶

Create a DictionaryEntry object based on the supplied data and arguments.

Parameters:	data – Data to be saved. dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be saved.

pylablib.core.fileio.dict_entry.from_dict(dict_ptr, loc, **kwargs)[source]¶

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded.

class pylablib.core.fileio.dict_entry.ITableDictionaryEntry(data=None, columns=None)[source]¶

Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry

A generic table Dictionary entry.

Parameters:	data – Table data. columns (list) – If not `None`, list of column names (if `None` and data is a DataTable object, get column names from that).

classmethod build_entry(data, table_format='inline', **kwargs)[source]¶

Create a DictionaryEntry object based on the supplied data and arguments.

Parameters:

data – Data to be saved.
dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
loc – Location for the data to be saved.
table_format (str) – Method of saving the table. Can be either 'inline' (table is saved directly in the dictionary file), 'csv' (table is saved in an external CSV file) or 'bin' (table is saved in an external binary file).

classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded. out_type (str) – Output format of the data (`'array'` for numpy arrays or `'table'` for `DataTable` objects).

class pylablib.core.fileio.dict_entry.InlineTableDictionaryEntry(data=None, columns=None)[source]¶

Bases: pylablib.core.fileio.dict_entry.ITableDictionaryEntry

An inlined table Dictionary entry.

Parameters:	data – Table data. columns (list) – If not `None`, a list of column names (if `None` and data is a DataTable object, get column names from that).

to_dict(dict_ptr, loc)[source]¶: Convert the data to a dictionary branch and write the table to the file.

classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶

Build an InlineTableDictionaryEntry object from the dictionary and read the inlined data.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded. out_type (str) – Output format of the data (`'array'` for numpy arrays or `'table'` for `DataTable` objects).

class pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry(data, file_format, name, columns, force_name=True, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.ITableDictionaryEntry

classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶

Convert a dictionary branch to a specific DictionaryEntry object.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded. out_type (str) – Output format of the data (`'array'` for numpy arrays or `'table'` for `DataTable` objects).

class pylablib.core.fileio.dict_entry.ExternalTextTableDictionaryEntry(data=None, file_format='csv', name='', columns=None, force_name=True, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry

An external text table Dictionary entry.

Parameters:

data – Table data.
file_format (str) – Output file format.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.

to_dict(dict_ptr, loc)[source]¶: Convert the data to a dictionary branch and save the table to an external file.

classmethod from_dict(dict_ptr, loc, out_type='table')[source]¶

Build an ExternalTextTableDictionaryEntry object from the dictionary and load the external data.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded. out_type (str) – Output format of the data (`'array'` for numpy arrays or `'table'` for `DataTable` objects).

class pylablib.core.fileio.dict_entry.ExternalBinTableDictionaryEntry(data=None, file_format='bin', name='', columns=None, force_name=True, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry

An external binary table Dictionary entry.

Parameters:

data – Table data.
file_format (str) – Output file format.
name (str) – Name template for the external file (default is the full path connected with "_" symbol).
columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.

to_dict(dict_ptr, loc)[source]¶: Convert the data to a dictionary branch and save the table to an external file.

classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)[source]¶

Build an ExternalBinTableDictionaryEntry object from the dictionary and load the external data.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded. out_type (str) – Output format of the data (`'array'` for numpy arrays or `'table'` for `DataTable` objects).

class pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry(data, name='', force_name=True, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry

Generic dictionary entry for data in an external file.

Parameters:	data – Stored data. name (str) – Name template for the external file (default is the full path connected with `"_"` symbol). force_name (bool) – If `False` and the target file already exists, generate a new unique name; otherwise, overwrite the file.

file_format = None¶

static add_file_format(subclass)[source]¶

Register an IExternalFileDictionaryEntry as a possible stored file format.

Used to automatically invoke a correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.

to_dict(dict_ptr, loc)[source]¶: Convert the data to a dictionary branch and save the data to an external file.

classmethod from_dict(dict_ptr, loc, **kwargs)[source]¶

Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.

Parameters:	dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry. loc – Location for the data to be loaded.

get_preamble()[source]¶: Generate the preamble (a dictionary with the supplementary data needed to load the data from the file).

save_file(loc_file)[source]¶

Save stored data into the given location.

Virtual method, should be overridden in subclasses.

classmethod load_file(loc_file, preamble, **kwargs)[source]¶

Load stored data from the given location, using the supplied preamble.

Virtual method, should be overridden in subclasses.

class pylablib.core.fileio.dict_entry.ExternalNumpyDictionaryEntry(data, name='', force_name=True, dtype=None, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry

A dictionary entry which stores the numpy array data into an external file in binary format.

Parameters:	data – Numpy array data. name (str) – Name template for the external file (default is the full path connected with `"_"` symbol). force_name (bool) – If `False` and the target file already exists, generate a new unique name; otherwise, overwrite the file. dtype – numpy dtype to load/save the data (by default, dtype of the supplied data).

file_format = 'numpy'¶

get_preamble()[source]¶: Generate the preamble (a dictionary with the supplementary data needed to load the data from the file).

save_file(loc_file)[source]¶: Save stored data into the given location.

classmethod load_file(loc_file, preamble, **kwargs)[source]¶: Load stored data from the given location, using the supplied preamble.

class pylablib.core.fileio.dict_entry.ExpandedContainerDictionaryEntry(data, **kwargs)[source]¶

Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry

A dictionary entry which expands containers (lists, tuples, dictionaries) into subdictionaries.

Useful when the data in the containers is complex, so writing it into one line (as is default for lists and tuples) wouldn’t work.

Parameters:	data – Container data.

to_dict(dict_ptr, loc)[source]¶: Convert the stored container to a dictionary branch.

classmethod from_dict(dict_ptr, loc, **kwargs)[source]¶: Build an ExpandedContainerDictionaryEntry object from the dictionary.

pylablib.core.fileio.loadfile module¶

Utilities for reading data files.

class pylablib.core.fileio.loadfile.IInputFileFormat[source]¶

Bases: object

Generic class for an input file format.

Based on file_format or autodetection, calls one of its subclasses to read the file.

static read_file(location_file, file_format, **kwargs)[source]¶

class pylablib.core.fileio.loadfile.ITextInputFileFormat[source]¶

Bases: pylablib.core.fileio.loadfile.IInputFileFormat

Generic class for a text input file format.

Based on file_format or autodetection, calls one of its subclasses to read the file.

static read_file(location_file, file_format, **kwargs)[source]¶

class pylablib.core.fileio.loadfile.CSVTableInputFileFormat[source]¶

Bases: pylablib.core.fileio.loadfile.ITextInputFileFormat

Class for CSV input file format.

static read_file(location_file, out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0, **kwargs)[source]¶

Read CSV file.

See parse_csv.load_table() for more description.

Parameters:

location_file – Location of the data.
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default)
dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
columns – either a number of columns, or a list of column names.
delimiters (str) – Regex string which recognizes entries delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
ignore_corrupted_lines (bool) – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
skip_lines (int) – Number of lines to skip from the beginning of the file.
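To make the default delimiter regex and the empty_entry_substitute option concrete, here is a minimal sketch that splits individual lines with the same pattern. It uses only the `re` module; `split_line` is a hypothetical helper, not part of pylablib.

```python
import re

# Default delimiter pattern quoted in the parameter description above
DELIMITERS = re.compile(r"\s*,\s*|\s+")

def split_line(line, empty_entry_substitute=None):
    """Split one text line into entries; drop or substitute the empty ones."""
    entries = DELIMITERS.split(line.strip())
    if empty_entry_substitute is None:
        return [e for e in entries if e != ""]
    return [e if e != "" else empty_entry_substitute for e in entries]

rows = [split_line(line) for line in ["1, 2,3", "4 5\t6", "7,,9"]]
filled = split_line("7,,9", empty_entry_substitute="nan")
```

Note that skipping the empty entry in "7,,9" leaves a row with only two entries, which is the kind of line that may then be treated as corrupted (too few entries).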

class pylablib.core.fileio.loadfile.DictionaryInputFileFormat[source]¶

Bases: pylablib.core.fileio.loadfile.ITextInputFileFormat

Class for Dictionary input file format.

static read_file(location_file, case_normalization=None, inline_dtype='generic', entry_format='value', skip_lines=0, **kwargs)[source]¶

Read Dictionary file.

Parameters:

location_file – Location of the data.
case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
inline_dtype (str) – dtype for inlined tables.
entry_format (str) – Determines the way of dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave the dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
skip_lines (int) – Number of lines to skip from the beginning of the file.

class pylablib.core.fileio.loadfile.BinaryTableInputFileFormatter[source]¶

Bases: pylablib.core.fileio.loadfile.IInputFileFormat

Class for binary input file format.

static read_file(location_file, out_type='default', dtype='>f8', columns=None, packing='flatten', preamble=None, skip_bytes=0, **kwargs)[source]¶

Read binary file.

Parameters:

location_file – Location of the data.
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default)
dtype – numpy.dtype describing the data.
columns – either a number of columns, or a list of column names.
packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
skip_bytes (int) – Number of bytes to skip from the beginning of the file.
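The difference between the two packing modes can be shown with a short sketch that unpacks the same bytes both ways. It relies on `struct` rather than numpy to stay dependency-free; `unpack_table` is a hypothetical helper, and the `struct` code ">d" stands in for the default '>f8' dtype.

```python
import struct

def unpack_table(raw, ncols, packing="flatten", fmt=">d"):
    """Unpack raw bytes into a list of rows, for row-wise or column-wise storage."""
    item_size = struct.calcsize(fmt)
    count = len(raw) // item_size
    values = [v for (v,) in struct.iter_unpack(fmt, raw[:count * item_size])]
    nrows = count // ncols
    if packing == "flatten":  # values stored row after row
        return [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    if packing == "transposed":  # values stored column after column
        return [[values[c * nrows + r] for c in range(ncols)] for r in range(nrows)]
    raise ValueError("unknown packing: {}".format(packing))

raw = struct.pack(">6d", 1, 2, 3, 4, 5, 6)
row_wise = unpack_table(raw, ncols=2, packing="flatten")
col_wise = unpack_table(raw, ncols=2, packing="transposed")
```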

pylablib.core.fileio.loadfile.load(path=None, input_format=None, loc='file', return_file=False, **kwargs)[source]¶

Load data from the file.

Parameters:

path (str) – Path to the file.
input_format (str) – Input file format. If None, attempt to auto-detect the file format (same as 'generic').
loc (str) – Location type.
return_file (bool) – If True, return a DataFile object (contains some metainfo); otherwise, return just the file data.

**kwargs are passed to the file formatter used to read the data (see CSVTableInputFileFormat.read_file(), DictionaryInputFileFormat.read_file() and BinaryTableInputFileFormatter.read_file() for the possible arguments). The default format names are:

'generic': Generic file format. Attempt to autodetect, raise IOError if unsuccessful;

'txt': Generic text file. Attempt to autodetect, raise IOError if unsuccessful;

'csv': CSV file, corresponds to CSVTableInputFileFormat;

'dict': Dictionary file, corresponds to DictionaryInputFileFormat;

'bin': Binary file, corresponds to BinaryTableInputFileFormatter.

pylablib.core.fileio.location module¶

Classes for describing a generic file location.

class pylablib.core.fileio.location.LocationName(path=None, ext=None)[source]¶

Bases: object

File name inside a location.

Parameters:	path – Path inside the location. Gets normalized according to the Dictionary rules (not case-sensitive; `'/'` and `'\'` are the delimiters). ext (str) – Name extension (`None` is default).

get_path(default_path='', sep='/')[source]¶

Get the string path.

If the object’s path is None, use default_path instead. If sep is not None, use it to join the path entries; otherwise, return the path in a list form.

get_ext(default_ext='')[source]¶

Get the extension.

If the object’s ext is None, use default_ext instead.

to_string(default_path='', default_ext='', path_sep='/', ext_sep='|', add_empty_ext=True)[source]¶

Convert the path to a string representation.

Parameters:

default_path (str) – Use it as the path if the object’s path is None.
default_ext (str) – Use it as the extension if the object’s ext is None.
path_sep (str) – Use it to join the path entries.
ext_sep (str) – Use it to join the path and the extension.
add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep at the end.

to_path(default_path='', default_ext='', ext_sep='|', add_empty_ext=True)[source]¶

Convert the path to a list representation.

Extension is added with ext_sep to the last entry in the path.

Parameters:

default_path (str) – Use it as the path if the object’s path is None.
default_ext (str) – Use it as the extension if the object’s ext is None.
ext_sep (str) – Use it to join the path and the extension.
add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep at the end.

static from_string(expr, ext_sep='|')[source]¶

Create a LocationName object from a string representation.

ext_sep defines extension separator; the path separators are '/' and '\'. Empty path or extension translate into None.
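The string representation described above can be pictured with a small parsing sketch. This is not the LocationName implementation: `parse_location_name` is a hypothetical helper, splitting at the first ext_sep is an assumption, and the case normalization applied by the real class is left out.

```python
import re

def parse_location_name(expr, ext_sep="|"):
    """Return (path_entries, ext); an empty path or extension becomes None."""
    path_part, _, ext = expr.partition(ext_sep)  # assumption: split at the first ext_sep
    entries = [e for e in re.split(r"[/\\]", path_part) if e]  # both '/' and '\' separate entries
    return (entries or None, ext or None)

with_ext = parse_location_name("data/run1|csv")
backslash = parse_location_name("data\\run1")
empty = parse_location_name("")
```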

static from_object(obj)[source]¶

Create a LocationName object from an object.

obj can be a LocationName (return unchanged), tuple or list (use as constructor arguments), string (treat as a string representation) or None (return empty name).

copy()[source]¶

class pylablib.core.fileio.location.LocationFile(loc, name=None)[source]¶

Bases: object

A file at a location.

Can be opened for reading or writing.

Parameters:	loc – File location. name – File’s name inside the location.

get_path()[source]¶: Get file path in a string representation.

open(mode='read', data_type='text')[source]¶

Open the file.

Parameters:	mode (str) – Opening mode. Can be `'read'`, `'write'` or `'append'`. data_type (str) – Either `'text'` or `'binary'`.

close()[source]¶: Close the file.

opening(mode='read', data_type='text')[source]¶

Context for working with an opened file (the file is closed automatically on exiting a with block).

Parameters:	mode (str) – Opening mode. Can be `'read'`, `'write'` or `'append'`. data_type (str) – Either `'text'` or `'binary'`.

class pylablib.core.fileio.location.IDataLocation[source]¶

Bases: object

Generic location.

is_name_free(name=None)[source]¶: Check if the name is unoccupied.

generate_new_name(prefix_name, idx=0)[source]¶

Generate a new name inside the location using the given prefix and starting index.

If idx is None, check just the prefix_name first before starting to append indices.

open_file(mode, name=None, data_type='text')[source]¶

Open a location file.

Parameters:	mode (str) – Opening mode. Can be `'read'`, `'write'` or `'append'`. name – File name inside the location. data_type (str) – Either `'text'` or `'binary'`.

close_file(name)[source]¶: Close a location file.

get_opened_files()[source]¶: Get a list of files opened in that location.

class pylablib.core.fileio.location.IFileSystemDataLocation[source]¶

Bases: pylablib.core.fileio.location.IDataLocation

A generic filesystem data location.

A single file name describes a single file in the filesystem.

get_filesystem_path(name=None, path_type='absolute')[source]¶

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

is_name_free(name=None)[source]¶: Check if the name is unoccupied.

open_file(mode, name=None, data_type='text')[source]¶

Open a location file.

See IDataLocation.open_file() for arguments.

close_file(name)[source]¶: Close a location file.

get_opened_files()[source]¶: Get a list of files opened in that location.

class pylablib.core.fileio.location.SingleFileSystemDataLocation(file_path)[source]¶

Bases: pylablib.core.fileio.location.IFileSystemDataLocation

A location describing a single file.

Any use of a non-default name raises ValueError.

Parameters:	file_path (str) – The path to the file.

get_filesystem_path(name=None, path_type='absolute')[source]¶

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

class pylablib.core.fileio.location.PrefixedFileSystemDataLocation(file_path, prefix_template='{0}_{1}')[source]¶

Bases: pylablib.core.fileio.location.IFileSystemDataLocation

A location describing a set of prefixed files.

Parameters:	file_path (str) – A master path. Its name is used as a prefix, and its extension is used as a default. prefix_template (str) – A `str.format()` string for generating prefixed files. Has two arguments: the first is the master name, the second is the sub_location.

Multi-level paths translate into nested folders (the top level folder is combined from the file_path prefix and the first path entry).
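One plausible reading of the prefixing rule is sketched below. The exact layout produced by pylablib is not confirmed here: `prefixed_path` is a hypothetical helper, names are given as lists of path entries, and appending the master extension to the final entry is an assumption. `posixpath` is used so that the result does not depend on the platform.

```python
import posixpath

def prefixed_path(file_path, name_entries, prefix_template="{0}_{1}"):
    """Combine the master path with a name inside the location."""
    folder, master = posixpath.split(file_path)
    master_name, default_ext = posixpath.splitext(master)
    if not name_entries:
        return file_path  # default name: the master file itself
    # The first entry is merged with the master name through the template
    first = prefix_template.format(master_name, name_entries[0])
    # Remaining entries become nested folders and the final file name
    return posixpath.join(folder, first, *name_entries[1:]) + default_ext

single = prefixed_path("out/data.dat", ["trace"])
nested = prefixed_path("out/data.dat", ["scan", "x"])
master = prefixed_path("out/data.dat", [])
```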

get_filesystem_path(name=None, path_type='absolute')[source]¶

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

class pylablib.core.fileio.location.FolderFileSystemDataLocation(folder_path, default_name='', default_ext='')[source]¶

Bases: pylablib.core.fileio.location.IFileSystemDataLocation

A location describing a single folder.

Parameters:	folder_path (str) – A path to the folder. default_name (str) – The default file name. default_ext (str) – The default file extension.

Multi-level paths translate into nested subfolders.

get_filesystem_path(name=None, path_type='absolute')[source]¶

Get the filesystem path corresponding to a given name.

path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).

pylablib.core.fileio.location.get_location(loc, path, *args, **kwargs)[source]¶

Build a location.

If loc or path are instances of IDataLocation, return them unchanged. If loc is a string, it describes location kind:

'single_file': SingleFileSystemDataLocation with the given path.

'file' or 'prefixed_file': PrefixedFileSystemDataLocation with the given path as a master path.

'folder': FolderFileSystemDataLocation with the given folder path.

Any additional arguments are relayed to the constructors.

pylablib.core.fileio.logfile module¶

class pylablib.core.fileio.logfile.LogFile(path, default_fmt=None)[source]¶

Bases: object

Expanding file.

Parameters:

path (str) – Path to the destination file.
default_fmt (list) – If not None, it’s the default value of fmt for the write_dataline() method.

write_lines(lines, header='', add_timestamp=True)[source]¶

Write lines into the file.

Create the file if it doesn’t exist.

Parameters:

lines – Data line (or lines) to be added.
header (str) – If non-empty, add it to the beginning of the file on creation.
add_timestamp (bool) – If True, add the UNIX timestamp at the beginning of the line.

write_dataline(data, columns=None, fmt=None, add_timestamp=True)[source]¶

Write a single data line into the file.

Create the file if it doesn’t exist.

Parameters:

data (list or numpy.ndarray) – Data row to be added.
columns (list) – If not None, it’s a list of column names to be added as a header on creation.
fmt (list) – If not None, it’s a list of format strings for the line entries.
add_timestamp (bool) – If True, add the UNIX timestamp at the beginning of the line.

write_multi_datalines(data, columns=None, fmt=None, add_timestamp=True)[source]¶

Write multiple data lines into the file.

Create the file if it doesn’t exist.

Parameters:

data ([list]) – Data rows to be added.
columns (list) – If not None, it’s a list of column names to be added as a header on creation.
fmt (list) – If not None, it’s a list of format strings for the line entries.
add_timestamp (bool) – If True, add the UNIX timestamp at the beginning of the line.
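The append-with-header behaviour of the writing methods above can be approximated with the standard library as follows. This is a sketch under several assumptions, not the LogFile implementation: the helper name, the tab separator, the "# " header prefix, the "Timestamp" column label, and the use of `str.format`-style format strings are all hypothetical.

```python
import os
import tempfile
import time

def append_dataline(path, data, columns=None, fmt=None, add_timestamp=True):
    """Append one data row; write a header first if the file is being created."""
    new_file = not os.path.exists(path)
    fmt = fmt or ["{}"] * len(data)  # assumption: str.format-style format strings
    entries = [f.format(v) for f, v in zip(fmt, data)]
    if add_timestamp:
        entries.insert(0, "{:.3f}".format(time.time()))  # UNIX timestamp first
    with open(path, "a") as f:
        if new_file and columns is not None:
            header = (["Timestamp"] if add_timestamp else []) + list(columns)
            f.write("# " + "\t".join(header) + "\n")
        f.write("\t".join(entries) + "\n")

log_path = os.path.join(tempfile.mkdtemp(), "log.dat")
for row in ([1, 2.5], [2, 3.0]):
    append_dataline(log_path, row, columns=["n", "value"],
                    fmt=["{:d}", "{:.2f}"], add_timestamp=False)
with open(log_path) as f:
    log_lines = f.read().splitlines()
```

The header is written only once, on creation; later calls just append rows.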

pylablib.core.fileio.parse_csv module¶

Utilities for parsing CSV files.

pylablib.core.fileio.parse_csv.read_table_and_comments(f, delimiters=re.compile('\\s*,\\s*|\\s+'), empty_entry_substitute=None, stop_comment=None, chunk_size=None, as_text=True, simple_entries=True)[source]¶

Load data table (in text format) and comments from the opened file f (must be open as binary).

Comment lines are the ones starting with #.

Parameters:

delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
stop_comment (str) – Regex string for the stopping comment. If not None, the function stops when a comment matching the stop_comment regex is encountered.
chunk_size (int) – Maximal size (number of lines) of the data to read.
as_text (bool) – If True, return entries as strings; otherwise, convert them into values.
simple_entries (bool) – If True, assume that there are no escaped strings or parenthesis structures in the files, so line splitting routine is simplified.

Returns:

(data, comments, finished), where data is a 2D list of table entries (already recognized unless as_text==True), and comments is a list of strings. Data lines may have different lengths. finished indicates whether the file has been read through to the end (it is always True if chunk_size is None).

Return type:

tuple
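A simplified version of this reading loop is sketched below to show how data lines, comment lines, and the finished flag relate. It is not the pylablib routine: the function name is hypothetical, entries are always returned as raw strings, UTF-8 decoding is assumed, and the stop_comment and empty_entry_substitute options are left out.

```python
import io
import re

DELIMITERS = re.compile(r"\s*,\s*|\s+")

def read_table_and_comments_sketch(f, chunk_size=None):
    """Read text rows and '#' comments from a file opened in binary mode."""
    data, comments = [], []
    while chunk_size is None or len(data) < chunk_size:
        line = f.readline()
        if not line:  # end of file reached
            return data, comments, True
        line = line.decode("utf-8").strip()  # assumption: UTF-8 text
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            comments.append(line[1:].strip())
            continue
        data.append([e for e in DELIMITERS.split(line) if e])
    return data, comments, False  # stopped because the chunk is full

content = b"# header\n1, 2\n3 4\n# note\n5,6\n"
data, comments, finished = read_table_and_comments_sketch(io.BytesIO(content))
chunk, chunk_comments, chunk_finished = read_table_and_comments_sketch(
    io.BytesIO(content), chunk_size=2)
```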

class pylablib.core.fileio.parse_csv.ChunksAccumulator(dtype='numeric', ignore_corrupted_lines=True)[source]¶

Bases: object

Class for accumulating data chunks into a single array.

Parameters:

dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.

corrupted_number()[source]¶

convert_columns(raw_columns)[source]¶

Convert raw columns into appropriate data structure (numpy array for numeric dtypes, list for generic and raw).

add_columns(columns)[source]¶

Append columns (lists or numpy arrays) to the existing data.

add_chunk(chunk)[source]¶

Add a chunk (2D list) to the pre-existing data.
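The 'numeric' dtype is described as coercing to the minimal possible numeric type. One way to picture that rule is to try int, then float, then complex, and fail only if none of them accepts the entry. The sketch below assumes exactly that order and works per entry; to_minimal_numeric is a hypothetical name, and the library may apply the rule per column or differently in detail.

```python
def to_minimal_numeric(entry):
    """Hypothetical helper: return the narrowest numeric value for a string."""
    for convert in (int, float, complex):
        try:
            return convert(entry)
        except ValueError:
            # Not representable in this type; try the next, wider one.
            continue
    raise ValueError("can't convert {!r} to a number".format(entry))


values = [to_minimal_numeric(e) for e in ["1", "2.5", "1e3", "1+2j"]]
types = [type(v) for v in values]
```

Here types ends up as int, float, float and complex, in that order; an entry such as "abc" raises ValueError, which is the kind of line ignore_corrupted_lines decides whether to skip.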

pylablib.core.fileio.parse_csv.load_columns(f, dtype, delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)[source]¶

Load columns from the file stream f.

Parameters:

dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to minimal possible numeric type, raises error if data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
stop_comment (str) – Regex string for the stopping comment. If not None, the function stops when it encounters a comment matching the stop_comment regex.

Returns:

(columns, comments, corrupted_lines).

columns is a list of columns with data.

comments is a list of comment strings.

corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).

Return type:

tuple
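The structure of corrupted_lines can be sketched as follows. This is an illustration under stated assumptions, not the library's code: sort_rows is a hypothetical helper, the number of columns is passed explicitly, every column is converted with float, and entries beyond ncols are simply ignored.

```python
def sort_rows(rows, ncols, convert=float):
    """Hypothetical helper: build columns and collect corrupted rows by kind."""
    columns = [[] for _ in range(ncols)]
    corrupted = {"size": [], "type": []}
    for row in rows:
        if len(row) < ncols:
            # Too few entries for the expected number of columns.
            corrupted["size"].append(row)
            continue
        try:
            converted = [convert(e) for e in row[:ncols]]
        except ValueError:
            # At least one entry doesn't match the requested dtype.
            corrupted["type"].append(row)
            continue
        for column, value in zip(columns, converted):
            column.append(value)
    return columns, corrupted


rows = [["1", "2"], ["3"], ["4", "x"], ["5", "6"]]
columns, corrupted = sort_rows(rows, ncols=2)
```

Both kinds of rejected rows are kept in their split form, so they can be inspected or re-parsed later.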

pylablib.core.fileio.parse_csv.columns_to_table(data, columns=None, out_type='table')[source]¶

Convert data (columns list) into a table.

Parameters:

columns – either the number of columns, or a list of column names.
out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object.

pylablib.core.fileio.parse_csv.load_table(f, dtype='numeric', columns=None, out_type='table', delimiters='\\s*, \\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)[source]¶

Load table from the file stream f.

Arguments are the same as in load_columns() and columns_to_table().

Returns:

(table, comments, corrupted_lines).

table is a table of the format out_type.

comments is a list of comment strings.

corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using provided dtype).

Return type:

tuple

pylablib.core.fileio.savefile module¶

Utilities for writing data files.

class pylablib.core.fileio.savefile.IOutputFileFormat(format_name, default_kwargs=None)[source]¶

Bases: object

Generic class for an output file format.

Parameters:

format_name (str) – The name of the format (to be defined in subclasses).
default_kwargs (dict) – Default **kwargs values for the write() method.

write_file(location_file, to_save, *args, **kwargs)[source]¶

write(location_file, data, *args, **kwargs)[source]¶

class pylablib.core.fileio.savefile.ITextOutputFileFormat(format_name, save_props=True, save_comments=True, save_time=True, new_time=True, default_kwargs=None)[source]¶

Bases: pylablib.core.fileio.savefile.IOutputFileFormat

Generic class for a text output file format.

Parameters:

format_name (str) – The name of the format (to be defined in subclasses).
save_props (bool) – If True and saving a DataFile object, save its props metainfo.
save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time at the end.
new_time (bool) – If saving a DataFile object, determines whether the time should be updated to the current time.
default_kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.

get_comment_line(comment)[source]¶

get_prop_line(name, value)[source]¶

get_time_line(time)[source]¶

static write_line(stream, line)[source]¶

write_comments(stream, comments)[source]¶

write_props(stream, props)[source]¶

write_time(stream, time)[source]¶

write_file(location_file, to_save, *args, **vargs)[source]¶

write_data(location_file, data, **kwargs)[source]¶

class pylablib.core.fileio.savefile.CSVTableOutputFileFormat(save_props=True, save_comments=True, save_time=True, save_columns=True, columns_delimiter='\t', custom_reps=None, use_rep_classes=False, **kwargs)[source]¶

Bases: pylablib.core.fileio.savefile.ITextOutputFileFormat

Class for CSV output file format.

Parameters:

save_props (bool) – If True and saving a DataFile object, save its props metainfo.
save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time at the end.
columns_delimiter (str) – Used to separate entries in a row.
custom_reps (str) – If not None, defines custom representations to be passed to utils.string.to_string() function.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
**kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.

get_table_line(line)[source]¶

get_columns_line(columns)[source]¶

write_data(location_file, data, columns=None, **kwargs)[source]¶

Write data to a CSV file.

Parameters:

location_file – Location of the destination.
data – Data to be saved. Can be DataTable or an arbitrary 2D array (numpy array, 2D list, etc.).
columns ([str]) – If not None, the list of column names. If None and data is of type DataTable, use its column names. If None and data is of another type, don’t put the column line in the output.
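The rule for the column line can be sketched with plain Python. The helper below is hypothetical and only mirrors the case of a generic 2D list: a header is written when columns is given and omitted otherwise. The exact layout pylablib produces (comment markers, props, time line) is not reproduced here, and the tab delimiter is an assumption.

```python
import io


def write_rows(stream, rows, columns=None, delimiter="\t"):
    """Hypothetical helper: write an optional column line, then the rows."""
    if columns is not None:
        stream.write(delimiter.join(columns) + "\n")
    for row in rows:
        stream.write(delimiter.join(str(v) for v in row) + "\n")


with_header = io.StringIO()
write_rows(with_header, [[1, 2.5], [3, 4.0]], columns=["x", "y"])

without_header = io.StringIO()
write_rows(without_header, [[1, 2.5], [3, 4.0]])
```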

class pylablib.core.fileio.savefile.DictionaryOutputFileFormat(save_props=True, save_comments=True, save_time=True, table_format='inline', inline_columns_delimiter='\t', inline_reps=None, param_reps=None, use_rep_classes=False, **kwargs)[source]¶

Bases: pylablib.core.fileio.savefile.ITextOutputFileFormat

Class for Dictionary output file format.

Parameters:

save_props (bool) – If True and saving a DataFile object, save its props metainfo.
save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
save_time (bool) – If True, append the file creation time at the end.
table_format (str) – Default format for table (numpy arrays or DataTable objects) entries. Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
inline_columns_delimiter (str) – Used to separate entries in a row for inline tables.
inline_reps (str) – If not None, defines custom representations to be passed to utils.string.to_string() function when writing inline tables.
param_reps (str) – If not None, defines custom representations to be passed to utils.string.to_string() function when writing Dictionary entries.
use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
**kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.

get_dictionary_line(path, value)[source]¶

write_data(loc_file, data, **kwargs)[source]¶

Write data to a Dictionary file.

Parameters:

loc_file – Location of the destination.
data – Data to be saved. Can be DataTable or an arbitrary 2D array (numpy array, 2D list, etc.).

class pylablib.core.fileio.savefile.IBinaryOutputFileFormat(format_name, default_kwargs=None)[source]¶

Bases: pylablib.core.fileio.savefile.IOutputFileFormat

get_preamble(loc_file, data)[source]¶

class pylablib.core.fileio.savefile.TableBinaryOutputFileFormat(dtype=None, transposed=False, **kwargs)[source]¶

Bases: pylablib.core.fileio.savefile.IBinaryOutputFileFormat

Class for binary output file format.

Parameters:

dtype – numpy.dtype describing the data. By default, '>f8' for real data and '>c16' for complex data.
transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
**kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.

get_dtype(table)[source]¶

get_preamble(loc_file, data)[source]¶

Generate a preamble (dictionary describing the file format).

The parameters are 'dtype', 'packing' ('transposed' or 'flatten', depending on the transposed attribute), 'ncol' (number of columns) and 'nrows' (number of rows).

write_table(location_file, data)[source]¶

Write data to a binary file.

Parameters:

location_file – Location of the destination.
data – Data to be saved. Can be DataTable or an arbitrary 2D array (numpy array, 2D list, etc.). Converted to numpy array before saving.
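The difference between row-wise and column-wise packing, and the kind of information a preamble carries, can be sketched with the standard struct module. This is an illustration only: pack_table is a hypothetical helper, it handles real data as big-endian doubles (the '>f8' default mentioned above), and the preamble keys follow the names listed in the get_preamble() description.

```python
import struct


def pack_table(rows, transposed=False):
    """Hypothetical helper: pack a 2D list as big-endian float64 values."""
    nrows = len(rows)
    ncols = len(rows[0]) if rows else 0
    if transposed:
        # Column-wise: all of column 0, then all of column 1, ...
        flat = [rows[r][c] for c in range(ncols) for r in range(nrows)]
    else:
        # Row-wise: all of row 0, then all of row 1, ...
        flat = [value for row in rows for value in row]
    payload = struct.pack(">{}d".format(len(flat)), *flat)
    preamble = {
        "dtype": ">f8",
        "packing": "transposed" if transposed else "flatten",
        "ncol": ncols,
        "nrows": nrows,
    }
    return preamble, payload


table = [[1.0, 2.0], [3.0, 4.0]]
preamble_flat, payload_flat = pack_table(table)
preamble_tr, payload_tr = pack_table(table, transposed=True)
```

A reader needs the preamble to undo the packing: the byte stream alone does not say how many columns there are or in which order the values were written.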

write_file(location_file, to_save, **kwargs)[source]¶

pylablib.core.fileio.savefile.get_output_format(data, output_format, **kwargs)[source]¶

pylablib.core.fileio.savefile.save(data, path='', output_format=None, loc='file', **kwargs)[source]¶

Save data to a file.

Parameters:

data – Data to be saved.
path (str) – Path to the file.
output_format (str) – Output file format. Can be either None (defaults to 'csv' for table data and 'dict' for Dictionary data), a string with one of the default format names, or an already prepared IOutputFileFormat object.
loc (str) – Location type.

**kwargs are passed to the file formatter constructor (see CSVTableOutputFileFormat, DictionaryOutputFileFormat and TableBinaryOutputFileFormat for the possible arguments). The default format names are:

'csv': CSV file, corresponds to CSVTableOutputFileFormat;

'dict': Dictionary file, corresponds to DictionaryOutputFileFormat;

'bin': Binary file, corresponds to TableBinaryOutputFileFormat
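The way output_format is resolved can be pictured as a small lookup. The sketch below is not the library's get_output_format(): resolve_format_name is a hypothetical helper, a plain dict stands in for Dictionary data, class names are kept as strings, and the case of an already prepared IOutputFileFormat object is left out.

```python
# Default format names and the classes they correspond to (as strings only).
DEFAULT_FORMATS = {
    "csv": "CSVTableOutputFileFormat",
    "dict": "DictionaryOutputFileFormat",
    "bin": "TableBinaryOutputFileFormat",
}


def resolve_format_name(data, output_format=None):
    """Hypothetical helper: choose a default format name for the data."""
    if output_format is None:
        # Assumption: a plain dict plays the role of Dictionary data here.
        output_format = "dict" if isinstance(data, dict) else "csv"
    if output_format not in DEFAULT_FORMATS:
        raise ValueError("unknown output format: {!r}".format(output_format))
    return output_format, DEFAULT_FORMATS[output_format]


table_choice = resolve_format_name([[1, 2], [3, 4]])
dict_choice = resolve_format_name({"a": 1})
bin_choice = resolve_format_name([[1, 2], [3, 4]], "bin")
```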
