loader Module

Utilities for loading data sets.

@author: drusk

pml.data.loader.load(path, has_ids=True, has_header=True, has_labels=True, delimiter=',')

Loads a data set from a delimited text file.

Args:

path: the path to the file containing the data set.
has_ids: boolean: set to False if the first column in the loaded dataset should be interpreted as a feature instead of as sample identifiers. Defaults to True, i.e. the first column is interpreted as sample identifiers.
has_header: boolean: set to False if the data being loaded does not have column headers on the first line. Defaults to True.
has_labels: boolean: set to False if the data being loaded does not have classification labels for each sample. Defaults to True. The labels should be the last column in the dataset being loaded.
delimiter: string: the symbol used to separate columns in the file. Default value is ','. Hint: the delimiter for tab-delimited files is '\t'.

Returns:

A DataSet object.
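
The way the flags above carve up each row can be sketched with the standard library alone. This is only an illustration, not pml's implementation: parse_delimited is a hypothetical helper, it returns a plain dict rather than a DataSet object, and it assumes every feature value is numeric.

```python
import csv
import io


def parse_delimited(text, has_ids=True, has_header=True, has_labels=True,
                    delimiter=","):
    """Hypothetical helper (not part of pml): split delimited text into parts."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    # The first line holds the column names when has_header is True.
    header = rows.pop(0) if has_header else None
    ids, labels, features = [], [], []
    for row in rows:
        if has_ids:
            # First column is a sample identifier, not a feature.
            ids.append(row[0])
            row = row[1:]
        if has_labels:
            # Last column is the classification label.
            labels.append(row[-1])
            row = row[:-1]
        # Assumption: whatever remains is numeric feature data.
        features.append([float(value) for value in row])
    return {"header": header, "ids": ids, "features": features,
            "labels": labels}


sample = "id,height,weight,class\na,1.5,2.0,cat\nb,3.0,4.5,dog\n"
parsed = parse_delimited(sample)
print(parsed["ids"])       # ['a', 'b']
print(parsed["features"])  # [[1.5, 2.0], [3.0, 4.5]]
print(parsed["labels"])    # ['cat', 'dog']

# Tab-delimited data with no ids, header or labels: every column is a feature.
tab_sample = "1.5\t2.0\n3.0\t4.5\n"
tab_parsed = parse_delimited(tab_sample, has_ids=False, has_header=False,
                             has_labels=False, delimiter="\t")
print(tab_parsed["features"])  # [[1.5, 2.0], [3.0, 4.5]]
```

With has_ids=False the first column stays in the feature list, which is why the tab-delimited call keeps both columns as features.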

pml.data.loader.shell_load(path, has_ids=True, has_header=True, has_labels=True, delimiter=',')

Loads a data set from a delimited text file. Also searches through the sample data sets.

Args:

path: the path to the file containing the data set.
has_ids: boolean: set to False if the first column in the loaded dataset should be interpreted as a feature instead of as sample identifiers. Defaults to True, i.e. the first column is interpreted as sample identifiers.
has_header: boolean: set to False if the data being loaded does not have column headers on the first line. Defaults to True.
has_labels: boolean: set to False if the data being loaded does not have classification labels for each sample. Defaults to True. The labels should be the last column in the dataset being loaded.
delimiter: string: the symbol used to separate columns in the file. Default value is ','. Hint: the delimiter for tab-delimited files is '\t'.

Returns:

A DataSet object.
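
The documentation does not say how shell_load searches the sample data sets. One plausible reading is a fallback lookup: use the path as given if it exists, otherwise look for a file of that name among the samples. The sketch below illustrates that reading only; resolve_data_path, the sample_dir argument and the file name are all hypothetical and not part of pml.

```python
import os
import tempfile


def resolve_data_path(path, sample_dir):
    """Hypothetical helper (not part of pml): prefer path, else try sample_dir."""
    if os.path.exists(path):
        return path
    # Assumption: sample data sets live in a single directory.
    candidate = os.path.join(sample_dir, path)
    if os.path.exists(candidate):
        return candidate
    raise IOError("No data set found for %r" % path)


# A throwaway directory stands in for the sample data sets.
with tempfile.TemporaryDirectory() as sample_dir:
    sample_file = os.path.join(sample_dir, "demo_samples_only.csv")
    with open(sample_file, "w") as handle:
        handle.write("id,x,class\na,1.0,yes\n")
    # The bare file name is not in the working directory, so the lookup
    # falls back to the stand-in sample directory.
    resolved = resolve_data_path("demo_samples_only.csv", sample_dir)
    found_in_samples = os.path.dirname(resolved) == sample_dir
    print(found_in_samples)
```

Checking the given path first keeps explicit paths authoritative, so a sample data set is only picked up when nothing exists at the path the caller supplied.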