pylablib.core.fileio package
Submodules
pylablib.core.fileio.binio module
Binary files input/output
-
pylablib.core.fileio.binio.write_num(x, f, dtype)
Write a number x into file f.
dtype is the textual representation of the data type (numpy-style).
-
pylablib.core.fileio.binio.write_str(s, f, dtype, strict=False)
Write a string s into file f.
dtype is the textual representation of the data type. It can be:
- "s": simply translate the string into bytes and write it;
- "sp"+sdtype (e.g., "sp<u2"): the string is prepended by its length written using the sdtype format;
- "s"+len (e.g., "s16"): len is the textual representation of the string length (the written data is equivalent to the "s" format).
If strict==True, raise an error if the string length is incompatible with the format (too long for the given "sp"-type prefix, or not equal to the "s"-type length).
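The length-prefixed "sp"+sdtype layout can be illustrated with the standard library alone. The sketch below is not the pylablib implementation: the helper names are hypothetical, the prefix is assumed to be a little-endian unsigned 16-bit integer (numpy-style "<u2", struct format "<H"), and the string is assumed to be UTF-8 encoded.

```python
import io
import struct


def write_prefixed_str(s, f, prefix_fmt="<H"):
    """Hypothetical helper: write s preceded by its byte length."""
    data = s.encode("utf-8")  # assumption: UTF-8 encoding
    max_len = 2 ** (8 * struct.calcsize(prefix_fmt)) - 1
    if len(data) > max_len:
        # Mirrors the documented strict==True behaviour for "sp"-type prefixes.
        raise ValueError("string is too long for the length prefix")
    f.write(struct.pack(prefix_fmt, len(data)))
    f.write(data)


def read_prefixed_str(f, prefix_fmt="<H"):
    """Hypothetical helper: read the length prefix, then that many bytes."""
    (length,) = struct.unpack(prefix_fmt, f.read(struct.calcsize(prefix_fmt)))
    return f.read(length).decode("utf-8")


buf = io.BytesIO()
write_prefixed_str("hello", buf)
raw = buf.getvalue()  # two prefix bytes followed by the string bytes
buf.seek(0)
restored = read_prefixed_str(buf)
```

With a 2-byte prefix, the longest string that fits is 65535 bytes, which is the kind of limit the strict check guards against.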
-
pylablib.core.fileio.binio.write_pickle(v, f, dtype)
Write a value v into file f as a Python pickle object.
dtype is the textual representation of the data type (numpy-style), and should be "pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol, and sdtype is the data type of the prepended string length (see the "sp"+sdtype type in write_str()).
-
pylablib.core.fileio.binio.write_val(v, f, dtype)
Write an arbitrary value v into file f using the supplied dtype.
The storage type depends on dtype: it can be a string (see write_str()), a number (see write_num()), or a pickled value (see write_pickle()). In addition, dtype can be a tuple of dtypes with length equal to the length of v, in which case the values in v are written sequentially.
-
pylablib.core.fileio.binio.read_num(f, dtype)
Read a number from file f.
dtype is the textual representation of the data type (numpy-style).
-
pylablib.core.fileio.binio.read_str(f, dtype)
Read a string from file f.
dtype is the textual representation of the data type. It can be:
- "sp"+sdtype (e.g., "sp<u2"): the string is prepended by its length written using the sdtype format;
- "s"+len (e.g., "s16"): len is the textual representation of the string length (i.e., read len bytes and translate the result into a string).
-
pylablib.core.fileio.binio.read_pickle(f, dtype)
Read a value from file f as a Python pickle object.
dtype is the textual representation of the data type (numpy-style), and should be "pk"+proto+sdtype (e.g., "pk3<u2"), where proto is the textual representation of the pickle protocol (ignored, added for compatibility with write_pickle()), and sdtype is the data type of the prepended string length (see the "sp"+sdtype type in write_str()).
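A pickled value stored with a length prefix can be round-tripped as sketched below. This only illustrates the "pk"+proto+sdtype idea: the helper names are hypothetical, the prefix is assumed to be "<u2" (struct format "<H"), and the actual byte layout used by pylablib may differ.

```python
import io
import pickle
import struct


def write_prefixed_pickle(v, f, proto=3, prefix_fmt="<H"):
    """Hypothetical helper: pickle v and write it preceded by the payload length."""
    payload = pickle.dumps(v, protocol=proto)
    f.write(struct.pack(prefix_fmt, len(payload)))
    f.write(payload)


def read_prefixed_pickle(f, prefix_fmt="<H"):
    """Hypothetical helper: the protocol is not needed here, since the
    pickle stream itself records it."""
    (length,) = struct.unpack(prefix_fmt, f.read(struct.calcsize(prefix_fmt)))
    return pickle.loads(f.read(length))


buf = io.BytesIO()
value = {"name": "scan", "points": [1, 2, 3]}
write_prefixed_pickle(value, buf)
buf.seek(0)
restored = read_prefixed_pickle(buf)
```

This is consistent with the note that proto is ignored on reading: only the length prefix format matters for locating the payload.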
-
pylablib.core.fileio.binio.read_val(f, dtype)
Read an arbitrary value from file f using the supplied dtype.
The storage type depends on dtype: it can be a string (see read_str()), a number (see read_num()), or a pickled value (see read_pickle()). In addition, dtype can be a tuple of dtypes, in which case the values are read sequentially, one per dtype.
-
pylablib.core.fileio.binio.size_prepend(f, dtype, added=0)
Context manager that prepends the data written inside the block with its size (after the writing is done).
dtype specifies the size format; added is added to the size when saving. Note that this method requires back-seeking, so it doesn’t work if the file is opened in append ("a") mode; use "w" or "r+" mode instead.
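The back-seeking behaviour can be sketched as follows. This is an illustration rather than the library code: the function name is hypothetical, the size format is assumed to be a little-endian unsigned 32-bit integer, and the stored size is assumed to count only the bytes written inside the block (plus added).

```python
import contextlib
import io
import struct


@contextlib.contextmanager
def size_prepend_sketch(f, prefix_fmt="<I", added=0):
    """Hypothetical stand-in for a size-prepending context manager."""
    start = f.tell()
    f.write(struct.pack(prefix_fmt, 0))  # placeholder for the size
    data_start = f.tell()
    yield
    end = f.tell()
    f.seek(start)  # back-seek: this is what append mode does not allow
    f.write(struct.pack(prefix_fmt, end - data_start + added))
    f.seek(end)  # restore the position so later writes continue normally


buf = io.BytesIO()
with size_prepend_sketch(buf):
    buf.write(b"payload")
raw = buf.getvalue()
```

In append mode every write goes to the end of the file regardless of the seek position, so the placeholder could never be overwritten; that is why "w" or "r+" is required.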
pylablib.core.fileio.datafile module
-
class pylablib.core.fileio.datafile.DataFile(data, filepath=None, filetype=None, creation_time=None, comments=None, props=None)
Bases: object
Describes a single datafile.
Parameters:
- data – The main data of the file (usually DataTable or Dictionary).
- filepath (str) – absolute path
- filetype (str) – a source type
- creation_time (datetime.datetime) – File creation time
- props (dict) – all the metainfo about the file (extracted from comments, filename etc.)
- comments (list) – all the comments excluding the ones containing props
pylablib.core.fileio.dict_entry module
Classes for dealing with the Dictionary entries with special conversion rules when saved or loaded.
Used to redefine how certain objects (e.g., tables) are written into files and read from files.
-
pylablib.core.fileio.dict_entry.special_load_rules(branch)
Detect if the branch requires special conversion rules.
-
class pylablib.core.fileio.dict_entry.InlineTable(table)
Bases: object
Simple marker class that denotes that the wrapped numpy 2D array should be written inline.
-
class pylablib.core.fileio.dict_entry.IDictionaryEntry(data=None)
Bases: object
A generic Dictionary entry.
-
to_dict(dict_ptr, loc)
Convert data to a dictionary branch on saving.
Virtual method, to be defined in subclasses.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
-
classmethod build_entry(data, **kwargs)
Create a DictionaryEntry object based on the supplied data and arguments.
Parameters:
- data – Data to be saved.
-
classmethod from_dict(dict_ptr, loc, **kwargs)
Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
-
pylablib.core.fileio.dict_entry.add_entry_class(cls, branch_predicate, object_predicate=None)
Add an entry class which is automatically used in the build_entry() and from_dict() functions to delegate work for converting objects and dictionary branches into dictionary entries.
Parameters:
- cls – the IDictionaryEntry subclass whose build_entry() and from_dict() methods will be used.
- branch_predicate – predicate used to determine whether the specified subclass is used in the from_dict() function; it is a function which takes a single argument (dictionary branch) and returns True if the conversion should be delegated to the subclass. It can also be a string or a tuple of strings, in which case it is interpreted as passing branches with the given "__data_type__" value.
- object_predicate – predicate used to determine whether the specified subclass is used in the build_entry() function; it is a function which takes a single argument (an object to be converted) and returns True if the conversion should be delegated to the subclass. It can also be a type or a tuple of types, in which case it is interpreted as passing objects of the given type; it can also be None, which means that the predicate always returns False (i.e., these dictionary entries aren’t created automatically on dictionary saving, but need to be created manually).
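The predicate-based delegation described for add_entry_class() can be pictured as a small registry. The sketch below is an assumption-laden illustration, not the library's code: the registry, the helper names, and the use of plain dicts as dictionary branches are all hypothetical.

```python
_entry_classes = []  # hypothetical registry of (cls, branch_pred, object_pred)


def _as_branch_predicate(pred):
    # A string or tuple of strings selects branches by their "__data_type__" value.
    if isinstance(pred, str):
        pred = (pred,)
    if isinstance(pred, tuple):
        names = pred
        return lambda branch: branch.get("__data_type__") in names
    return pred


def _as_object_predicate(pred):
    # None never matches; a type or tuple of types matches by isinstance.
    if pred is None:
        return lambda obj: False
    if isinstance(pred, (type, tuple)):
        types = pred
        return lambda obj: isinstance(obj, types)
    return pred


def add_entry_class_sketch(cls, branch_predicate, object_predicate=None):
    _entry_classes.append(
        (cls, _as_branch_predicate(branch_predicate), _as_object_predicate(object_predicate))
    )


def find_class_for_branch(branch):
    return next((c for c, bp, _ in _entry_classes if bp(branch)), None)


def find_class_for_object(obj):
    return next((c for c, _, op in _entry_classes if op(obj)), None)


class TableEntry:  # stand-in for an IDictionaryEntry subclass
    pass


add_entry_class_sketch(TableEntry, "table", list)
```

Under these assumptions, a branch such as {"__data_type__": "table"} and any list object are both routed to TableEntry, while other inputs match no class.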
-
pylablib.core.fileio.dict_entry.build_entry(data, **kwargs)
Create a DictionaryEntry object based on the supplied data and arguments.
Parameters:
- data – Data to be saved.
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
-
pylablib.core.fileio.dict_entry.from_dict(dict_ptr, loc, **kwargs)
Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
class pylablib.core.fileio.dict_entry.ITableDictionaryEntry(data=None, columns=None)
Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry
A generic table Dictionary entry.
Parameters:
- data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
-
classmethod build_entry(data, table_format='inline', **kwargs)
Create a DictionaryEntry object based on the supplied data and arguments.
Parameters:
- data – Data to be saved.
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be saved.
- table_format (str) – Method of saving the table. Can be either 'inline' (table is saved directly in the dictionary file), 'csv' (table is saved in an external CSV file) or 'bin' (table is saved in an external binary file).
-
classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)
Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class pylablib.core.fileio.dict_entry.InlineTableDictionaryEntry(data=None, columns=None)
Bases: pylablib.core.fileio.dict_entry.ITableDictionaryEntry
An inlined table Dictionary entry.
Parameters:
- data – Table data.
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
-
to_dict(dict_ptr, loc)
Convert the data to a dictionary branch and write the table to the file.
-
classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)
Build an InlineTableDictionaryEntry object from the dictionary and read the inlined data.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry(data, file_format, name, columns, force_name=True, **kwargs)
Bases: pylablib.core.fileio.dict_entry.ITableDictionaryEntry
-
classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)
Convert a dictionary branch to a specific DictionaryEntry object.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class pylablib.core.fileio.dict_entry.ExternalTextTableDictionaryEntry(data=None, file_format='csv', name='', columns=None, force_name=True, **kwargs)
Bases: pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external text table Dictionary entry.
Parameters:
- data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict(dict_ptr, loc)
Convert the data to a dictionary branch and save the table to an external file.
-
classmethod from_dict(dict_ptr, loc, out_type='table')
Build an ExternalTextTableDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class pylablib.core.fileio.dict_entry.ExternalBinTableDictionaryEntry(data=None, file_format='bin', name='', columns=None, force_name=True, **kwargs)
Bases: pylablib.core.fileio.dict_entry.IExternalTableDictionaryEntry
An external binary table Dictionary entry.
Parameters:
- data – Table data.
- file_format (str) – Output file format.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- columns (list) – If not None, a list of column names (if None and data is a DataTable object, get column names from that).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
-
to_dict(dict_ptr, loc)
Convert the data to a dictionary branch and save the table to an external file.
-
classmethod from_dict(dict_ptr, loc, out_type='table', **kwargs)
Build an ExternalBinTableDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
- out_type (str) – Output format of the data ('array' for numpy arrays or 'table' for DataTable objects).
-
class pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry(data, name='', force_name=True, **kwargs)
Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry
Generic dictionary entry for data in an external file.
-
file_format = None
-
static add_file_format(subclass)
Register an IExternalFileDictionaryEntry subclass as a possible stored file format.
Used to automatically invoke the correct loader when loading the dictionary file. Only needs to be done once after the subclass declaration.
-
to_dict(dict_ptr, loc)
Convert the data to a dictionary branch and save the data to an external file.
-
classmethod from_dict(dict_ptr, loc, **kwargs)
Build an IExternalFileDictionaryEntry object from the dictionary and load the external data.
Parameters:
- dict_ptr (DictionaryPointer) – Pointer to the dictionary location for the entry.
- loc – Location for the data to be loaded.
-
get_preamble()
Generate the preamble (a dictionary with supplementary data that allows loading the data from the file).
-
-
class pylablib.core.fileio.dict_entry.ExternalNumpyDictionaryEntry(data, name='', force_name=True, dtype=None, **kwargs)
Bases: pylablib.core.fileio.dict_entry.IExternalFileDictionaryEntry
A dictionary entry which stores the numpy array data into an external file in binary format.
Parameters:
- data – Numpy array data.
- name (str) – Name template for the external file (default is the full path connected with the "_" symbol).
- force_name (bool) – If False and the target file already exists, generate a new unique name; otherwise, overwrite the file.
- dtype – numpy dtype to load/save the data (by default, dtype of the supplied data).
-
file_format = 'numpy'
-
class pylablib.core.fileio.dict_entry.ExpandedContainerDictionaryEntry(data, **kwargs)
Bases: pylablib.core.fileio.dict_entry.IDictionaryEntry
A dictionary entry which expands containers (lists, tuples, dictionaries) into subdictionaries.
Useful when the data in the containers is complex, so writing it into one line (as is default for lists and tuples) wouldn’t work.
Parameters:
- data – Container data.
-
classmethod from_dict(dict_ptr, loc, **kwargs)
Build an ExpandedContainerDictionaryEntry object from the dictionary.
pylablib.core.fileio.loadfile module
Utilities for reading data files.
-
class pylablib.core.fileio.loadfile.IInputFileFormat
Bases: object
Generic class for an input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
-
class pylablib.core.fileio.loadfile.ITextInputFileFormat
Bases: pylablib.core.fileio.loadfile.IInputFileFormat
Generic class for a text input file format.
Based on file_format or autodetection, calls one of its subclasses to read the file.
-
class pylablib.core.fileio.loadfile.CSVTableInputFileFormat
Bases: pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for CSV input file format.
-
static read_file(location_file, out_type='default', dtype='numeric', columns=None, delimiters=None, empty_entry_substitute=None, ignore_corrupted_lines=True, skip_lines=0, **kwargs)
Read CSV file.
See parse_csv.load_table() for more description.
Parameters:
- location_file – Location of the data.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default).
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- columns – either a number of columns, or a list of column names.
- delimiters (str) – Regex string which recognizes entry delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines (bool) – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- skip_lines (int) – Number of lines to skip from the beginning of the file.
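The effect of the default delimiters regex and of empty_entry_substitute can be checked with the re module directly. The sketch below only demonstrates the documented regex; the helper name is hypothetical, and the way empty entries are detected is an assumption about how a parser might apply it.

```python
import re

# The documented default: commas (with optional surrounding spaces) or whitespace runs.
DELIMITERS = re.compile(r"\s*,\s*|\s+")


def split_line(line, empty_entry_substitute=None):
    """Hypothetical helper: split one text line into entries."""
    entries = DELIMITERS.split(line.strip())
    if empty_entry_substitute is None:
        # Skip empty entries, as documented for empty_entry_substitute=None.
        return [e for e in entries if e != ""]
    return [e if e != "" else empty_entry_substitute for e in entries]


rows = [split_line(line) for line in ["1, 2.5,3", "4\t5   6", "7,,9"]]
filled = split_line("7,,9", empty_entry_substitute="nan")
```

Note that skipping empty entries shortens the row ("7,,9" yields two entries), which is one way a line can end up with too few entries and be treated as corrupted.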
-
class pylablib.core.fileio.loadfile.DictionaryInputFileFormat
Bases: pylablib.core.fileio.loadfile.ITextInputFileFormat
Class for Dictionary input file format.
-
static read_file(location_file, case_normalization=None, inline_dtype='generic', entry_format='value', skip_lines=0, **kwargs)
Read Dictionary file.
Parameters:
- location_file – Location of the data.
- case_normalization (str) – If None, the dictionary paths are case-sensitive; otherwise, defines the way the entries are normalized ('lower' or 'upper').
- inline_dtype (str) – dtype for inlined tables.
- entry_format (str) – Determines the way of dealing with dict_entry.IDictionaryEntry objects (objects transformed into dictionary branches with special recognition rules). Can be 'branch' (don’t attempt to recognize those objects, leave the dictionary as in the file), 'dict_entry' (recognize and leave as dict_entry.IDictionaryEntry objects) or 'value' (recognize and keep the value).
- skip_lines (int) – Number of lines to skip from the beginning of the file.
-
class pylablib.core.fileio.loadfile.BinaryTableInputFileFormatter
Bases: pylablib.core.fileio.loadfile.IInputFileFormat
Class for binary input file format.
-
static read_file(location_file, out_type='default', dtype='>f8', columns=None, packing='flatten', preamble=None, skip_bytes=0, **kwargs)
Read binary file.
Parameters:
- location_file – Location of the data.
- out_type (str) – type of the result: 'array' for numpy array, 'pandas' for pandas DataFrame, 'table' for DataTable object, or 'default' (determined by the library default; 'table' by default).
- dtype – numpy.dtype describing the data.
- columns – either a number of columns, or a list of column names.
- packing (str) – The way the 2D array is packed. Can be either 'flatten' (data is stored row-wise) or 'transposed' (data is stored column-wise).
- preamble (dict) – If not None, defines binary file parameters that supersede the parameters supplied to the function. The defined parameters are 'dtype', 'packing', 'ncols' (number of columns) and 'nrows' (number of rows).
- skip_bytes (int) – Number of bytes to skip from the beginning of the file.
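The difference between the 'flatten' and 'transposed' packing can be shown on a small buffer of big-endian 64-bit floats (the default '>f8' dtype). This is a standard-library sketch with a hypothetical helper name; it returns nested lists rather than numpy arrays or DataTable objects.

```python
import struct


def unpack_table(raw, ncols, packing="flatten"):
    """Hypothetical helper: interpret raw '>f8' bytes as a table with ncols columns."""
    count = len(raw) // 8
    values = struct.unpack(">{}d".format(count), raw[: count * 8])
    nrows = count // ncols
    if packing == "flatten":
        # Row-wise storage: consecutive values belong to the same row.
        return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]
    if packing == "transposed":
        # Column-wise storage: consecutive values belong to the same column.
        return [[values[c * nrows + r] for c in range(ncols)] for r in range(nrows)]
    raise ValueError("unknown packing: {}".format(packing))


raw = struct.pack(">6d", 1, 2, 3, 4, 5, 6)
row_wise = unpack_table(raw, ncols=2)
col_wise = unpack_table(raw, ncols=2, packing="transposed")
```

The same six values therefore produce two different 3x2 tables, which is why the packing has to be known (or stored in the preamble) to read the file back correctly.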
-
pylablib.core.fileio.loadfile.load(path=None, input_format=None, loc='file', return_file=False, **kwargs)
Load data from the file.
**kwargs are passed to the file formatter used to read the data (see CSVTableInputFileFormat.read_file(), DictionaryInputFileFormat.read_file() and BinaryTableInputFileFormatter.read_file() for the possible arguments).
The default format names are:
- 'generic': Generic file format. Attempt to autodetect, raise IOError if unsuccessful;
- 'txt': Generic text file. Attempt to autodetect, raise IOError if unsuccessful;
- 'csv': CSV file, corresponds to CSVTableInputFileFormat;
- 'dict': Dictionary file, corresponds to DictionaryInputFileFormat;
- 'bin': Binary file, corresponds to BinaryTableInputFileFormatter.
pylablib.core.fileio.location module
Classes for describing a generic file location.
-
class pylablib.core.fileio.location.LocationName(path=None, ext=None)
Bases: object
File name inside a location.
Parameters:
- path – Path inside the location. Gets normalized according to the Dictionary rules (not case-sensitive; '/' and '\' are the delimiters).
- ext (str) – Name extension (None is default).
-
get_path(default_path='', sep='/')
Get the string path.
If the object’s path is None, use default_path instead. If sep is not None, use it to join the path entries; otherwise, return the path in a list form.
-
get_ext(default_ext='')
Get the extension.
If the object’s ext is None, use default_ext instead.
-
to_string(default_path='', default_ext='', path_sep='/', ext_sep='|', add_empty_ext=True)
Convert the path to a string representation.
Parameters:
- default_path (str) – Use it as the path if the object’s path is None.
- default_ext (str) – Use it as the extension if the object’s ext is None.
- path_sep (str) – Use it to join the path entries.
- ext_sep (str) – Use it to join the path and the extension.
- add_empty_ext (bool) – If False and the extension is empty, don’t add ext_sep at the end.
-
to_path(default_path='', default_ext='', ext_sep='|', add_empty_ext=True)
Convert the path to a list representation.
The extension is added with ext_sep to the last entry in the path.
-
static from_string(expr, ext_sep='|')
Create a LocationName object from a string representation.
ext_sep defines the extension separator; the path separators are '/' and '\'. An empty path or extension translates into None.
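The string representation can be pictured with a small parser. This is a sketch under assumptions, not the library's parsing code: the function name is hypothetical, case-insensitivity is modeled by lower-casing, and only the first ext_sep is treated as the separator.

```python
import re


def parse_location_name(expr, ext_sep="|"):
    """Hypothetical helper: split 'path|ext' into (path entries, extension)."""
    path_part, _, ext_part = expr.partition(ext_sep)
    # Both '/' and '\' act as path separators; lower-casing is an assumption
    # standing in for the "not case-sensitive" normalization.
    entries = [e.lower() for e in re.split(r"[/\\]", path_part) if e]
    # Empty path or extension translates into None.
    return (entries or None, ext_part or None)


with_ext = parse_location_name("Data/Run1|csv")
backslash = parse_location_name("a\\b")
empty = parse_location_name("")
```

The '|' separator keeps the extension distinct from the path, so dots inside path entries do not have to be interpreted.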
-
static from_object(obj)
Create a LocationName object from an object.
obj can be a LocationName (return unchanged), a tuple or list (use as constructor arguments), a string (treat as a string representation) or None (return an empty name).
-
class pylablib.core.fileio.location.LocationFile(loc, name=None)
Bases: object
A file at a location.
Can be opened for reading or writing.
Parameters:
- loc – File location.
- name – File’s name inside the location.
-
class pylablib.core.fileio.location.IDataLocation
Bases: object
Generic location.
-
generate_new_name(prefix_name, idx=0)
Generate a new name inside the location using the given prefix and starting index.
If idx is None, check just the prefix_name first before starting to append indices.
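The name-generation rule can be sketched over a plain set of existing names. The function below is illustrative only: representing the location as a set and forming names by appending the index directly to the prefix are both assumptions.

```python
def generate_new_name_sketch(existing, prefix_name, idx=0):
    """Hypothetical helper: return the first name not present in existing."""
    if idx is None:
        # Try the bare prefix first, then fall back to appending indices.
        if prefix_name not in existing:
            return prefix_name
        idx = 0
    while "{}{}".format(prefix_name, idx) in existing:
        idx += 1
    return "{}{}".format(prefix_name, idx)


indexed = generate_new_name_sketch({"data0", "data1"}, "data")
bare = generate_new_name_sketch({"data0"}, "data", idx=None)
fallback = generate_new_name_sketch({"data", "data0"}, "data", idx=None)
```

This is the kind of lookup that force_name=False relies on when the target file already exists.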
-
-
class pylablib.core.fileio.location.IFileSystemDataLocation
Bases: pylablib.core.fileio.location.IDataLocation
A generic filesystem data location.
A single file name describes a single file in the filesystem.
-
get_filesystem_path(name=None, path_type='absolute')
Get the filesystem path corresponding to a given name.
path_type can be 'absolute' (return absolute path), 'relative' (return relative path; level depends on the location) or 'name' (only return path inside the location).
-
open_file(mode, name=None, data_type='text')
Open a location file.
See IDataLocation.open_file() for arguments.
-
-
class pylablib.core.fileio.location.SingleFileSystemDataLocation(file_path)
Bases: pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single file.
Any use of a non-default name raises ValueError.
Parameters:
- file_path (str) – The path to the file.
-
class pylablib.core.fileio.location.PrefixedFileSystemDataLocation(file_path, prefix_template='{0}_{1}')
Bases: pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a set of prefixed files.
Parameters:
- file_path (str) – A master path. Its name is used as a prefix, and its extension is used as a default.
- prefix_template (str) – A str.format() string for generating prefixed files. Has two arguments: the first is the master name, the second is the sub_location.
Multi-level paths translate into nested folders (the top level folder is combined from the file_path prefix and the first path entry).
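How prefix_template combines the master name with a sub-location can be sketched for the single-level case. The helper is hypothetical and ignores multi-level paths; only the str.format() call with the default '{0}_{1}' template comes from the description above.

```python
import os


def prefixed_path(file_path, sub_location, prefix_template="{0}_{1}", ext=None):
    """Hypothetical helper: build the path of a prefixed file next to the master path."""
    folder, filename = os.path.split(file_path)
    master_name, default_ext = os.path.splitext(filename)
    name = prefix_template.format(master_name, sub_location)
    # The master path's extension is the default; an explicit ext overrides it.
    suffix = default_ext if ext is None else "." + ext
    return os.path.join(folder, name + suffix)


table_path = prefixed_path(os.path.join("data", "run.dat"), "table")
csv_path = prefixed_path(os.path.join("data", "run.dat"), "table", ext="csv")
```

A different template, e.g. '{0}.{1}', would change only the joining character between the master name and the sub-location.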
-
class pylablib.core.fileio.location.FolderFileSystemDataLocation(folder_path, default_name='', default_ext='')
Bases: pylablib.core.fileio.location.IFileSystemDataLocation
A location describing a single folder.
Multi-level paths translate into nested subfolders.
-
pylablib.core.fileio.location.get_location(loc, path, *args, **kwargs)
Build a location.
If loc or path are instances of IDataLocation, return them unchanged. If loc is a string, it describes the location kind:
- 'single_file': SingleFileSystemDataLocation with the given path.
- 'file' or 'prefixed_file': PrefixedFileSystemDataLocation with the given path as a master path.
- 'folder': FolderFileSystemDataLocation with the given folder path.
Any additional arguments are relayed to the constructors.
pylablib.core.fileio.logfile module
-
class pylablib.core.fileio.logfile.LogFile(path, default_fmt=None)
Bases: object
Expanding file.
Parameters:
- path (str) – Path to the destination file.
- default_fmt (list) – If not None, it’s the default value of fmt for the write_dataline() method.
-
write_lines(lines, header='', add_timestamp=True)
Write a single line into the file.
Create the file if it doesn’t exist.
-
write_dataline(data, columns=None, fmt=None, add_timestamp=True)
Write a single data line into the file.
Create the file if it doesn’t exist.
Parameters:
- data (list or numpy.ndarray) – Data row to be added.
- columns (list) – If not None, it’s a list of column names to be added as a header on creation.
- fmt (str) – If not None, it’s a list of format strings for the line entries.
- add_timestamp (bool) – If True, add the UNIX timestamp in the beginning of the line.
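The append-with-header-on-creation behaviour can be sketched with plain file operations. This is not the LogFile implementation: the function name, the tab separator, the "# " header prefix and the "Timestamp" column label are all assumptions made for illustration.

```python
import os
import tempfile
import time


def write_dataline_sketch(path, data, columns=None, fmt=None, add_timestamp=True):
    """Hypothetical helper: append one data row, writing a header if the file is new."""
    new_file = not os.path.exists(path)
    formats = fmt or [None] * len(data)
    entries = [(f.format(v) if f else str(v)) for v, f in zip(data, formats)]
    if add_timestamp:
        entries.insert(0, "{:.3f}".format(time.time()))  # UNIX timestamp first
    with open(path, "a") as out:
        if new_file and columns is not None:
            header = (["Timestamp"] if add_timestamp else []) + list(columns)
            out.write("# " + "\t".join(header) + "\n")
        out.write("\t".join(entries) + "\n")


log_path = os.path.join(tempfile.mkdtemp(), "log.dat")
for row in ([1, 2.5], [3, 4.0]):
    write_dataline_sketch(log_path, row, columns=["x", "y"],
                          fmt=["{}", "{:.2f}"], add_timestamp=False)
with open(log_path) as src:
    lines = src.read().splitlines()
```

Opening in append mode on every call keeps the file consistent even if the logging process is interrupted between rows.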
pylablib.core.fileio.parse_csv module
Utilities for parsing CSV files.
-
pylablib.core.fileio.parse_csv.read_table_and_comments(f, delimiters=re.compile('\\s*,\\s*|\\s+'), empty_entry_substitute=None, stop_comment=None, chunk_size=None, as_text=True, simple_entries=True)
Load data table (in text format) and comments from the opened file f (must be open as binary).
Comment lines are the ones starting with #.
Parameters:
- delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- stop_comment (str) – Regex string for the stopping comment. If not None, the function will stop if a comment satisfying the stop_comment regex is encountered.
- chunk_size (int) – Maximal size (number of lines) of the data to read.
- as_text (bool) – If True, return entries as strings; otherwise, convert them into values.
- simple_entries (bool) – If True, assume that there are no escaped strings or parenthesis structures in the files, so the line splitting routine is simplified.
Returns: (data, comments, finished), where data is a 2D list of table entries (already converted into values unless as_text==True) and comments is a list of strings. Data lines may have different lengths. finished indicates whether the file has been read through to the end (it is always True if chunk_size is None).
-
class pylablib.core.fileio.parse_csv.ChunksAccumulator(dtype='numeric', ignore_corrupted_lines=True)
Bases: object
Class for accumulating data chunks into a single array.
Parameters:
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
-
pylablib.core.fileio.parse_csv.load_columns(f, dtype, delimiters='\\s*,\\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)
Load columns from the file stream f.
Parameters:
- dtype – dtype of entries; can be either a single type, or a list of types (one per column). Possible dtypes are: 'int', 'float', 'complex', 'numeric' (tries to coerce to the minimal possible numeric type, raises an error if the data can’t be converted to complex), 'generic' (accept arbitrary types, including lists, dictionaries, escaped strings, etc.), 'raw' (keep raw string).
- delimiters (str) – Regex string which recognizes delimiters (by default r"\s*,\s*|\s+", i.e., commas and whitespaces).
- empty_entry_substitute – Substitute for empty table entries. If None, all empty table entries are skipped.
- ignore_corrupted_lines – If True, skip corrupted (e.g., non-numeric for numeric dtype, or with too few entries) lines; otherwise, raise ValueError.
- stop_comment (str) – Regex string for the stopping comment. If not None, the function will stop if a comment satisfying the stop_comment regex is encountered.
Returns: (columns, comments, corrupted_lines).
columns is a list of columns with data.
comments is a list of comment strings.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using the provided dtype).
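The two corruption categories reported in corrupted_lines can be illustrated for the simple case of a 'float' dtype. This sketch uses a hypothetical helper operating on rows that are already split into entries; it is not the library's loader.

```python
def load_float_columns(rows, ncols):
    """Hypothetical helper: collect float columns and sort out corrupted rows."""
    columns = [[] for _ in range(ncols)]
    corrupted = {"size": [], "type": []}
    for row in rows:
        if len(row) < ncols:
            corrupted["size"].append(row)  # too few entries
            continue
        try:
            values = [float(e) for e in row[:ncols]]
        except ValueError:
            corrupted["type"].append(row)  # not convertible with the dtype
            continue
        for column, value in zip(columns, values):
            column.append(value)
    return columns, corrupted


rows = [["1", "2"], ["3"], ["x", "4"], ["5", "6"]]
columns, corrupted = load_float_columns(rows, ncols=2)
```

With ignore_corrupted_lines=False the same two rows would instead be the points at which a ValueError is raised.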
-
pylablib.core.fileio.parse_csv.columns_to_table(data, columns=None, out_type='table')
Convert data (columns list) into a table.
-
pylablib.core.fileio.parse_csv.load_table(f, dtype='numeric', columns=None, out_type='table', delimiters='\\s*,\\s*|\\s+', empty_entry_substitute=None, ignore_corrupted_lines=True, stop_comment=None)
Load table from the file stream f.
Arguments are the same as in load_columns() and columns_to_table().
Returns: (table, comments, corrupted_lines).
table is a table of the format out_type.
comments is a list of comment strings.
corrupted_lines is a dict {'size':list, 'type':list} of corrupted lines (already split into entries), based on the corruption type ('size' means too small size, 'type' means it couldn’t be converted using the provided dtype).
Return type: tuple
pylablib.core.fileio.savefile module
Utilities for writing data files.
-
class pylablib.core.fileio.savefile.IOutputFileFormat(format_name, default_kwargs=None)
Bases: object
Generic class for an output file format.
-
class pylablib.core.fileio.savefile.ITextOutputFileFormat(format_name, save_props=True, save_comments=True, save_time=True, new_time=True, default_kwargs=None)
Bases: pylablib.core.fileio.savefile.IOutputFileFormat
Generic class for a text output file format.
Parameters:
- format_name (str) – The name of the format (to be defined in subclasses).
- save_props (bool) – If True and saving a DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- new_time (bool) – If saving a DataFile object, determines if the time should be updated to the current time.
- default_kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
class pylablib.core.fileio.savefile.CSVTableOutputFileFormat(save_props=True, save_comments=True, save_time=True, save_columns=True, columns_delimiter='\t', custom_reps=None, use_rep_classes=False, **kwargs)
Bases: pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for CSV output file format.
Parameters:
- save_props (bool) – If True and saving a DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time in the end.
- columns_delimiter (str) – Used to separate entries in a row.
- custom_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
write_data
(location_file, data, columns=None, **kwargs)[source]¶ Write data to a CSV file.
Parameters:
- location_file – Location of the destination.
- data – Data to be saved. Can be a DataTable or an arbitrary 2D array (numpy array, 2D list, etc.).
- columns ([str]) – If not None, the list of column names. If None and data is of type DataTable, use its column names. If None and data is of another type, don't put the column line in the output.
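The column-line rule above can be sketched with the standard library alone. The helper `write_table` and the `# ` prefix on the column line are assumptions for illustration, not pylablib's actual output format. The sketch only shows an optional header line followed by delimiter-separated rows.

```python
import io

def write_table(stream, data, columns=None, delimiter="\t"):
    """Hypothetical helper: write an optional column line, then one line per row."""
    if columns is not None:
        # Assumed convention: the column line is marked with a leading "# ".
        stream.write("# " + delimiter.join(columns) + "\n")
    for row in data:
        stream.write(delimiter.join(str(v) for v in row) + "\n")

# With column names: the header line is written first.
buf = io.StringIO()
write_table(buf, [[1, 2.5], [3, 4.5]], columns=["x", "y"])
with_columns = buf.getvalue()

# Without column names: only the data rows are written.
buf = io.StringIO()
write_table(buf, [[1, 2.5], [3, 4.5]])
without_columns = buf.getvalue()
```

Passing `columns=None` for plain 2D data leaves the header out entirely, which mirrors the last case in the parameter description.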
-
class
pylablib.core.fileio.savefile.
DictionaryOutputFileFormat
(save_props=True, save_comments=True, save_time=True, table_format='inline', inline_columns_delimiter='\t', inline_reps=None, param_reps=None, use_rep_classes=False, **kwargs)[source]¶ Bases:
pylablib.core.fileio.savefile.ITextOutputFileFormat
Class for Dictionary output file format.
Parameters:
- save_props (bool) – If True and saving a DataFile object, save its props metainfo.
- save_comments (bool) – If True and saving a DataFile object, save its comments metainfo.
- save_time (bool) – If True, append the file creation time at the end.
- table_format (str) – Default format for table entries (numpy arrays or DataTable objects). Can be 'inline' (table is written inside the file), 'csv' (external CSV file) or 'bin' (external binary file).
- inline_columns_delimiter (str) – Used to separate entries in a row for inline tables.
- inline_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function when writing inline tables.
- param_reps (str) – If not None, defines custom representations to be passed to the utils.string.to_string() function when writing Dictionary entries.
- use_rep_classes (bool) – If True, use representation classes for Dictionary entries (e.g., numpy arrays will be represented as "array([1, 2, 3])" instead of just "[1, 2, 3]"). This improves storage fidelity, but makes the result harder to parse (e.g., by external string parsers).
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
class
pylablib.core.fileio.savefile.
IBinaryOutputFileFormat
(format_name, default_kwargs=None)[source]¶
-
class
pylablib.core.fileio.savefile.
TableBinaryOutputFileFormat
(dtype=None, transposed=False, **kwargs)[source]¶ Bases:
pylablib.core.fileio.savefile.IBinaryOutputFileFormat
Class for binary output file format.
Parameters:
- dtype – numpy.dtype describing the data. By default, '>f8' for real data and '>c16' for complex data.
- transposed (bool) – If False, write the data row-wise; otherwise, write it column-wise.
- **kwargs (dict) – Default **kwargs values for the IOutputFileFormat.write() method.
-
get_preamble
(loc_file, data)[source]¶ Generate a preamble (dictionary describing the file format).
The parameters are 'dtype', 'packing' ('transposed' or 'flatten', depending on the transposed attribute), 'ncol' (number of columns) and 'nrows' (number of rows).
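To make the relation between the preamble and the stored bytes concrete, here is a minimal sketch using only the `struct` module. The helper `pack_table` is hypothetical and handles only real data as big-endian 8-byte floats (the '>f8' default mentioned above). How pylablib itself builds the payload is not shown in this reference, so treat the layout below as an assumption.

```python
import struct

def pack_table(rows, transposed=False):
    """Hypothetical helper: pack a 2D list of floats, return (preamble, payload)."""
    nrows = len(rows)
    ncols = len(rows[0]) if rows else 0
    if transposed:
        # Column-wise: all values of column 0, then column 1, and so on.
        values = [rows[r][c] for c in range(ncols) for r in range(nrows)]
    else:
        # Row-wise: rows are written one after another.
        values = [v for row in rows for v in row]
    # ">" selects big-endian byte order, "d" is an 8-byte float ('>f8' in numpy terms).
    payload = struct.pack(">%dd" % len(values), *values)
    preamble = {
        "dtype": ">f8",
        "packing": "transposed" if transposed else "flatten",
        "ncol": ncols,
        "nrows": nrows,
    }
    return preamble, payload

rows = [[1.0, 2.0], [3.0, 4.0]]
preamble_flat, payload_flat = pack_table(rows)
preamble_tr, payload_tr = pack_table(rows, transposed=True)

# Reading back shows the difference in value order.
flat_values = struct.unpack(">4d", payload_flat)
transposed_values = struct.unpack(">4d", payload_tr)
```

A reader needs the preamble to interpret the payload: the byte string alone does not say how many columns there are or in which order the values were written.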
-
pylablib.core.fileio.savefile.
save
(data, path='', output_format=None, loc='file', **kwargs)[source]¶ Save data to a file.
Parameters:
- data – Data to be saved.
- path (str) – Path to the file.
- output_format (str) – Output file format. Can be either None (defaults to 'csv' for table data and 'dict' for Dictionary data), a string with one of the default format names, or an already prepared IOutputFileFormat object.
- loc (str) – Location type.
**kwargs are passed to the file formatter constructor (see CSVTableOutputFileFormat, DictionaryOutputFileFormat and TableBinaryOutputFileFormat for the possible arguments). The default format names are:
- 'csv': CSV file, corresponds to CSVTableOutputFileFormat;
- 'dict': Dictionary file, corresponds to DictionaryOutputFileFormat;
- 'bin': Binary file, corresponds to TableBinaryOutputFileFormat