SG++-Doxygen-Documentation
Configuration structure used for all kinds of SampleProviders including default values.
#include <DataSourceConfig.hpp>
Public Attributes | |
size_t | batchSize_ = 0 |
datadriven::DataTransformationConfig | dataTransformationConfig_ |
size_t | epochs_ = 1 |
The number of epochs to train on. | |
std::string | filePath_ = "" |
Valid path to a file on disk. | |
DataSourceFileType | fileType_ = DataSourceFileType::NONE |
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets. | |
bool | hasTargets_ = true |
Whether the file has targets (i.e. supervised learning). | |
bool | isCompressed_ = false |
The dataset is gzip compressed. | |
size_t | numBatches_ = 1 |
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset. | |
int64_t | randomSeed_ = -1 |
Seed for the shuffling PRNG. | |
std::vector< double > | readinClasses_ = std::vector<double>() |
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default). | |
std::vector< size_t > | readinColumns_ = std::vector<size_t>() |
Specifies the set of columns (dimensions) to be read in from the data file. Indexing starts at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default). | |
size_t | readinCutoff_ = -1 |
After how many (valid) lines of the source file to stop reading. | |
DataSourceShufflingType | shuffling_ = DataSourceShufflingType::sequential |
The type of shuffling to be applied to the data. | |
size_t | testBatchSize_ = 0 |
std::string | testFilePath_ = "" |
Valid path to a file on disk. | |
DataSourceFileType | testFileType_ = DataSourceFileType::NONE |
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets. | |
bool | testHasTargets_ = true |
Whether the file has targets (i.e. supervised learning). | |
bool | testIsCompressed_ = false |
The dataset is gzip compressed. | |
size_t | testNumBatches_ = 1 |
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset. | |
std::vector< double > | testReadinClasses_ = std::vector<double>() |
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default). | |
std::vector< size_t > | testReadinColumns_ = std::vector<size_t>() |
Specifies the set of columns (dimensions) to be read in from the data file. Indexing starts at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default). | |
size_t | testReadinCutoff_ = -1 |
After how many (valid) lines of the source file to stop reading. | |
double | validationPortion_ = 0.3 |
Configuration structure used for all kinds of SampleProviders including default values.
size_t sgpp::datadriven::DataSourceConfig::batchSize_ = 0 |
datadriven::DataTransformationConfig sgpp::datadriven::DataSourceConfig::dataTransformationConfig_ |
size_t sgpp::datadriven::DataSourceConfig::epochs_ = 1 |
The number of epochs to train on.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
std::string sgpp::datadriven::DataSourceConfig::filePath_ = "" |
Valid path to a file on disk. Empty for generated artificial datasets.
Referenced by sgpp::datadriven::DataSource::DataSource(), sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig(), and sgpp::datadriven::DataSourceBuilder::withPath().
DataSourceFileType sgpp::datadriven::DataSourceConfig::fileType_ = DataSourceFileType::NONE |
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets.
Referenced by sgpp::datadriven::DataSourceBuilder::crossValidationAssemble(), sgpp::datadriven::DataSourceBuilder::crossValidationFromConfig(), sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig(), sgpp::datadriven::DataSourceBuilder::splittingAssemble(), sgpp::datadriven::DataSourceBuilder::splittingFromConfig(), sgpp::datadriven::DataSourceBuilder::withFileType(), and sgpp::datadriven::DataSourceBuilder::withPath().
bool sgpp::datadriven::DataSourceConfig::hasTargets_ = true |
Whether the file has targets (i.e. supervised learning).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
bool sgpp::datadriven::DataSourceConfig::isCompressed_ = false |
The dataset is gzip compressed.
size_t sgpp::datadriven::DataSourceConfig::numBatches_ = 1 |
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset.
Referenced by sgpp::datadriven::DataSource::end(), sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig(), sgpp::datadriven::DataSource::getNextSamples(), and sgpp::datadriven::DataSourceBuilder::inBatches().
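The documentation does not say how samples are distributed when `numBatches_` is greater than 1. As a minimal sketch of one plausible scheme (an assumption, not necessarily what SG++ does), the hypothetical helper `batchSizes` below splits a dataset into near-equal contiguous batches and returns the whole dataset as a single batch when the count is 1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SG++: sizes of `numBatches` near-equal batches.
// Assumption: the remainder is spread over the first batches, one extra sample each.
std::vector<std::size_t> batchSizes(std::size_t numSamples, std::size_t numBatches) {
  if (numBatches == 0) return {};  // nothing sensible to split into
  std::vector<std::size_t> sizes(numBatches, numSamples / numBatches);
  for (std::size_t i = 0; i < numSamples % numBatches; ++i) {
    ++sizes[i];
  }
  return sizes;
}
```

With `numBatches == 1` the result is a single batch holding every sample, which matches the documented default behavior.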
int64_t sgpp::datadriven::DataSourceConfig::randomSeed_ = -1 |
Seed for the shuffling PRNG.
Referenced by sgpp::datadriven::DataShufflingFunctorFactory::buildDataShufflingFunctor(), and sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
std::vector<double> sgpp::datadriven::DataSourceConfig::readinClasses_ = std::vector<double>() |
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
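The filtering rule described for `readinClasses_` can be sketched as a small predicate. The function name `keepLine` and its parameters are hypothetical and only illustrate the documented rule. The exact comparison of `double` values is also an assumption; the real reader may compare differently.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper, not part of SG++: decides whether a data line is kept.
bool keepLine(double target, const std::vector<double>& readinClasses, bool hasTargets) {
  // Without targets the filter is ignored; an empty filter accepts every class.
  if (!hasTargets || readinClasses.empty()) return true;
  // Assumption: classes are matched by exact floating-point equality.
  return std::find(readinClasses.begin(), readinClasses.end(), target) !=
         readinClasses.end();
}
```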
std::vector<size_t> sgpp::datadriven::DataSourceConfig::readinColumns_ = std::vector<size_t>() |
Specifies the set of columns (dimensions) to be read in from the data file. Indexing starts at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
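To illustrate what "order matters" means for `readinColumns_`, here is a minimal sketch that applies a column selection to one parsed row. The function `selectColumns` is hypothetical and not part of SG++. It assumes each row has already been parsed into a vector of doubles and that every index is in range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SG++: picks columns in the order given.
std::vector<double> selectColumns(const std::vector<double>& row,
                                  const std::vector<std::size_t>& readinColumns) {
  // An empty selection means "read all columns" (the documented default).
  if (readinColumns.empty()) return row;
  std::vector<double> selected;
  selected.reserve(readinColumns.size());
  for (std::size_t col : readinColumns) {
    assert(col < row.size());     // indices are 0-based and must be in range
    selected.push_back(row[col]); // output order follows readinColumns
  }
  return selected;
}
```

Under these assumptions, selecting columns `{2, 0}` from a row `{10, 20, 30}` yields `{30, 10}`: the result follows the order of the selection, not the order in the file.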
size_t sgpp::datadriven::DataSourceConfig::readinCutoff_ = -1 |
After how many (valid) lines of the source file to stop reading.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
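The default of `-1` for an unsigned `size_t` deserves a note: in C++ this conversion wraps around to the largest representable value, so the default presumably means "no cutoff". The sketch below demonstrates the language rule only. The names `kDefaultCutoff` and `reachedCutoff` are hypothetical, and the comparison used by the real reader is an assumption.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical constant: the documented default, written the same way.
constexpr std::size_t kDefaultCutoff = static_cast<std::size_t>(-1);

// Converting -1 to an unsigned type wraps modulo 2^N, giving the maximum value.
static_assert(kDefaultCutoff == std::numeric_limits<std::size_t>::max(),
              "size_t(-1) is the largest size_t value");

// Hypothetical helper, not part of SG++: true once enough valid lines were read.
// Assumption: reading stops when the count reaches the cutoff.
bool reachedCutoff(std::size_t validLinesRead, std::size_t cutoff) {
  return validLinesRead >= cutoff;
}
```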
DataSourceShufflingType sgpp::datadriven::DataSourceConfig::shuffling_ = DataSourceShufflingType::sequential |
The type of shuffling to be applied to the data.
Referenced by sgpp::datadriven::DataShufflingFunctorFactory::buildDataShufflingFunctor(), and sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
size_t sgpp::datadriven::DataSourceConfig::testBatchSize_ = 0 |
std::string sgpp::datadriven::DataSourceConfig::testFilePath_ = "" |
Valid path to a file on disk. Empty for generated artificial datasets.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
DataSourceFileType sgpp::datadriven::DataSourceConfig::testFileType_ = DataSourceFileType::NONE |
Which type of input file are we dealing with? NONE for auto detection or generated artificial datasets.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
bool sgpp::datadriven::DataSourceConfig::testHasTargets_ = true |
Whether the file has targets (i.e. supervised learning).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
bool sgpp::datadriven::DataSourceConfig::testIsCompressed_ = false |
The dataset is gzip compressed.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
size_t sgpp::datadriven::DataSourceConfig::testNumBatches_ = 1 |
How many batches should the dataset be split into for batch learning - if 1, take the entire dataset.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
std::vector<double> sgpp::datadriven::DataSourceConfig::testReadinClasses_ = std::vector<double>() |
Specifies the set of classes (targets) to be read in from the data file. Any line with a class not contained in this vector is skipped. If hasTargets=false, this is ignored. If empty, all classes/targets are considered (default).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
std::vector<size_t> sgpp::datadriven::DataSourceConfig::testReadinColumns_ = std::vector<size_t>() |
Specifies the set of columns (dimensions) to be read in from the data file. Indexing starts at 0, and order matters. Any column not contained in this vector is ignored as a dimension. If empty, all columns are read in (default).
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
size_t sgpp::datadriven::DataSourceConfig::testReadinCutoff_ = -1 |
After how many (valid) lines of the source file to stop reading.
Referenced by sgpp::datadriven::DataMiningConfigParser::getDataSourceConfig().
double sgpp::datadriven::DataSourceConfig::validationPortion_ = 0.3 |
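Because every member carries a default, a configuration can be built by overriding only the fields that differ. The sketch below shows this pattern without depending on SG++: `ExampleDataSourceConfig` is a stand-in that mirrors a subset of the documented members and defaults, and `makeGzipConfig` is a hypothetical helper. Neither is part of the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for illustration only; NOT the real sgpp::datadriven::DataSourceConfig.
// Member names and defaults are copied from the documentation above.
struct ExampleDataSourceConfig {
  std::string filePath_ = "";
  bool hasTargets_ = true;
  bool isCompressed_ = false;
  std::size_t numBatches_ = 1;
  std::size_t epochs_ = 1;
  std::int64_t randomSeed_ = -1;
  std::vector<std::size_t> readinColumns_ = std::vector<std::size_t>();
  std::size_t readinCutoff_ = static_cast<std::size_t>(-1);
  double validationPortion_ = 0.3;
};

// Hypothetical helper: start from the defaults and override only what differs.
ExampleDataSourceConfig makeGzipConfig(const std::string& path) {
  ExampleDataSourceConfig config;
  config.filePath_ = path;         // caller-supplied path to the data file
  config.isCompressed_ = true;     // the file is gzip compressed
  config.readinColumns_ = {0, 2};  // read columns 0 and 2, in that order
  return config;
}
```

Fields that are not set explicitly, such as `numBatches_` or `validationPortion_`, keep the default values listed in this reference.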