Dataset class.
Macros:
Enumerations:
Type of field.
Type of column.
Typedefs:
Type for the number of fields in a row.
Type for the number of rows in a dataset.
Type for the number of different values in a field.
Struct CapyDatasetRow :
Struct CapyDatasetRow's properties:
Original string of the row with commas replaced with \0.
Pointers to each field in the row.
Index of the row in the dataset, starting at 0 and not counting the header lines.
Struct CapyDatasetRow's methods:
Destructor
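As an illustration of the row layout described above (commas replaced with \0, plus one pointer per field), here is a minimal sketch. SplitRowInPlace is a hypothetical helper, not a function of this module, and it assumes fields never contain quoted commas.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper (not part of the documented API): split a
// comma-separated line in place. Each ',' is overwritten with '\0' and
// a pointer to the start of each field is stored in 'fields'.
// Return the number of fields found (at most 'maxFields').
size_t SplitRowInPlace(char* line, char** fields, size_t maxFields) {
  assert(line != NULL && fields != NULL);
  if(maxFields == 0) return 0;
  size_t nbField = 0;
  fields[nbField++] = line;
  for(char* ptr = line; *ptr != '\0'; ++ptr) {
    if(*ptr == ',') {
      // Terminate the current field where the comma was.
      *ptr = '\0';
      if(nbField == maxFields) break;
      // The next field starts right after the former comma.
      fields[nbField++] = ptr + 1;
    }
  }
  return nbField;
}
```

After the call, each entry of 'fields' can be used as an ordinary C string, while the row keeps a single buffer to free.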
Struct CapyDatasetFieldDesc :
Struct CapyDatasetFieldDesc's properties:
Pointer to the field label.
Field type.
Field interface.
Index in the row.
For categorical fields, number of values in the category; for numerical fields, number of rows in the dataset.
For categorical fields, array of pointers to the category's values.
Range of values (converted to numerical if the field is categorical).
Struct CapyDatasetFieldDesc's methods:
Destructor
Struct CapyDataset :
Struct CapyDataset's properties:
Number of rows.
Number of fields in each row.
Fields interface row with commas replaced with \0.
Fields type row with commas replaced with \0.
Fields label row with commas replaced with \0.
Fields description.
Array of rows.
Number of threads for multithreaded operation (default: 10)
Struct CapyDataset's methods:
Destructor
Load the dataset from a file at a given path
Input argument(s):
path: path to the dataset file
Exception(s):
May raise CapyExc_MallocFailed, CapyExc_StreamReadError, CapyExc_InvalidStream.
Print the description of the dataset on a given stream.
Input argument(s):
stream: the stream to print onto
Get the number of input fields
Output and side effect(s):
Return the number of input fields
Get the number of output fields
Output and side effect(s):
Return the number of output fields
Get the field index of the i-th input
Input argument(s):
iInput: index of the input
Output and side effect(s):
Return the index
Exception(s):
May raise CapyExc_InvalidElemIdx.
Get the field index of the i-th output
Input argument(s):
iOutput: index of the output
Output and side effect(s):
Return the index
Exception(s):
May raise CapyExc_InvalidElemIdx.
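The mapping from the i-th input (or output) to a field index can be pictured as a scan over the column types. The sketch below assumes a hypothetical ColumnType enumeration and signals an invalid index by returning nbField, where the documented methods may raise CapyExc_InvalidElemIdx instead.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for the column type enumeration.
typedef enum { kColInput, kColOutput } ColumnType;

// Return the field index of the i-th column of type 'type', or
// 'nbField' if there are fewer than i+1 such columns.
size_t GetIdxOfIthColumn(
  ColumnType const* types, size_t nbField, ColumnType type, size_t i) {
  assert(types != NULL);
  size_t nbSeen = 0;
  for(size_t iField = 0; iField < nbField; ++iField) {
    if(types[iField] == type) {
      if(nbSeen == i) return iField;
      ++nbSeen;
    }
  }
  return nbField;
}
```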
Get a value as a numeral.
Input argument(s):
iRow: index of the row
iField: index of the field
Output and side effect(s):
For numerical fields, return the value as is; for categorical fields, return the index of the value in the list of possible values (fieldDesc.categoryVals).
Exception(s):
May raise CapyExc_UndefinedExecution.
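A minimal sketch of this conversion, under stated assumptions: FieldDesc below is a simplified, hypothetical stand-in for CapyDatasetFieldDesc, numerical values are parsed with strtod, and an unknown categorical value yields -1 rather than an exception.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical, simplified field description.
typedef struct {
  int isCategorical;                // 1 if categorical, 0 if numerical
  size_t nbCategoryVal;             // number of values in the category
  char const* const* categoryVals;  // the category's values
} FieldDesc;

// Convert the raw string of a field to a number: the parsed value for
// a numerical field, the index of the value in 'categoryVals' for a
// categorical field, or -1.0 if the categorical value is unknown.
double ValAsNum(FieldDesc const* desc, char const* raw) {
  assert(desc != NULL && raw != NULL);
  if(!desc->isCategorical) return strtod(raw, NULL);
  for(size_t iVal = 0; iVal < desc->nbCategoryVal; ++iVal) {
    if(strcmp(desc->categoryVals[iVal], raw) == 0) return (double)iVal;
  }
  return -1.0;
}
```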
Get a value as a normalised numeral.
Input argument(s):
iRow: index of the row
iField: index of the field
Output and side effect(s):
Return the value, converted to numerical if the field is categorical, after normalisation according to the 'range' property of the field description.
Exception(s):
May raise CapyExc_UndefinedExecution.
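The documentation does not state the target interval of the normalisation. The sketch below assumes a linear mapping of the field's range onto [0, 1]; the Range type and the Normalise function are hypothetical names used for illustration only.

```c
#include <assert.h>

// Hypothetical range type holding the bounds of a field's values.
typedef struct { double min; double max; } Range;

// Map 'val' linearly from [range.min, range.max] to [0, 1].
// A degenerate range (min equal to max) maps every value to 0.
double Normalise(Range range, double val) {
  assert(range.min <= range.max);
  if(range.max == range.min) return 0.0;
  return (val - range.min) / (range.max - range.min);
}
```

For a categorical field, 'val' would be the category index, so the range would span 0 to the number of category values minus one under this assumption.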
Convert a dataset to a matrix to be used by a single category classifier
Input argument(s):
iOutput: index of the output
Output and side effect(s):
Return a matrix with as many rows as there are rows in the dataset, and as many columns as there are inputs in the dataset plus one. The output must be of type capyDatasetFieldType_cat. Input values are converted using getValAsNum. The output value is assigned to the last column of the matrix and is equal to the category index.
Exception(s):
May raise CapyExc_UnsupportedFormat.
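To make the matrix layout concrete, here is a sketch that fills one row of such a matrix. It assumes a row-major array of doubles, which is an assumption about storage and not a statement about the matrix type used by this module; FillSingleCatRow is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

// Fill row 'iRow' of a row-major matrix with 'nbInput' + 1 columns:
// the input values first, then the category index of the output in
// the last column.
void FillSingleCatRow(
  double* matrix, size_t iRow,
  double const* inputs, size_t nbInput, size_t iCategory) {
  assert(matrix != NULL && inputs != NULL);
  size_t const nbCol = nbInput + 1;
  double* row = matrix + iRow * nbCol;
  for(size_t iInput = 0; iInput < nbInput; ++iInput) {
    row[iInput] = inputs[iInput];
  }
  row[nbInput] = (double)iCategory;
}
```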
Get the number of different values for a given output
Input argument(s):
iField: index of the output
Output and side effect(s):
Return the number of different values for a categorical output field or the number of rows for a numerical output field
Convert a dataset to a matrix to be used by a one hot classifier
Input argument(s):
iOutput: index of the output
Output and side effect(s):
Return a matrix with as many rows as there are rows in the dataset, and as many columns as there are inputs in the dataset plus the number of different values the given output takes. The output must be of type capyDatasetFieldType_cat. Input values are converted using getValAsNum. The one hot encoding of the output value is assigned to the last columns of the matrix, each taking the value 0 for 'is not this category' or 1 for 'is this category'.
Exception(s):
May raise CapyExc_UnsupportedFormat.
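The one hot layout of a single row can be sketched as follows. FillOneHotRow is a hypothetical helper operating on a plain array of doubles, which is an assumption about storage rather than the module's actual matrix type.

```c
#include <assert.h>
#include <stddef.h>

// Fill a row of 'nbInput' + 'nbCategoryVal' columns: the input values
// first, then the one hot encoding of the output, i.e. 1 in the column
// of category 'iCategory' and 0 in the other category columns.
void FillOneHotRow(
  double* row, double const* inputs, size_t nbInput,
  size_t nbCategoryVal, size_t iCategory) {
  assert(row != NULL && inputs != NULL);
  assert(iCategory < nbCategoryVal);
  for(size_t iInput = 0; iInput < nbInput; ++iInput) {
    row[iInput] = inputs[iInput];
  }
  for(size_t iVal = 0; iVal < nbCategoryVal; ++iVal) {
    row[nbInput + iVal] = (iVal == iCategory ? 1.0 : 0.0);
  }
}
```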
Functions:
Create a CapyDatasetRow.
Output and side effect(s):
Return a CapyDatasetRow.
Allocate memory for a new CapyDatasetRow and create it.
Output and side effect(s):
Return a CapyDatasetRow.
Exception(s):
May raise CapyExc_MallocFailed.
Free the memory used by a CapyDatasetRow* and reset '*that' to NULL.
Input argument(s):
that: a pointer to the CapyDatasetRow to free
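The "free and reset '*that' to NULL" convention takes a pointer to the caller's pointer, so the caller is not left holding a dangling pointer. A minimal sketch of the pattern, using a hypothetical Row type and returning NULL on allocation failure where the documented functions may raise CapyExc_MallocFailed:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical stand-in for a heap-allocated object owning a string.
typedef struct { char* str; } Row;

// Allocate and initialise a Row. Return NULL if the allocation fails.
Row* RowAlloc(void) {
  Row* that = malloc(sizeof(Row));
  if(that != NULL) that->str = NULL;
  return that;
}

// Free the Row pointed to by '*that' and reset '*that' to NULL.
// Calling it again on the same pointer is then harmless.
void RowFree(Row** that) {
  assert(that != NULL);
  if(*that == NULL) return;
  free((*that)->str);
  free(*that);
  *that = NULL;
}
```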
Create a CapyDatasetFieldDesc.
Output and side effect(s):
Return a CapyDatasetFieldDesc.
Allocate memory for a new CapyDatasetFieldDesc and create it.
Output and side effect(s):
Return a CapyDatasetFieldDesc.
Exception(s):
May raise CapyExc_MallocFailed.
Free the memory used by a CapyDatasetFieldDesc* and reset '*that' to NULL.
Input argument(s):
that: a pointer to the CapyDatasetFieldDesc to free
Create a CapyDataset.
Output and side effect(s):
Return a CapyDataset.
Allocate memory for a new CapyDataset and create it.
Output and side effect(s):
Return a CapyDataset.
Exception(s):
May raise CapyExc_MallocFailed.
Free the memory used by a CapyDataset* and reset '*that' to NULL.
Input argument(s):
that: a pointer to the CapyDataset to free.