This example shows how to perform offline/online classification using sparse grid density estimation and matrix decomposition methods.
It creates an instance of LearnerSGDEOnOffParallel and runs the function trainParallel(), where the main functionality is implemented.
Currently, only binary classification with class labels -1 and 1 is possible.
The example provides the option to execute several runs over differently ordered data and to perform a 5-fold cross-validation within each run. This requires data that is already randomly ordered and partitioned. Averaging the results of several runs can yield more reliable estimates in an online-learning scenario, because the order in which the learner sees the data points can affect the result.
int main(int argc, char *argv[]) {
#ifdef USE_MPI
#ifdef USE_GSL
omp_set_num_threads(1);
std::cout << "LearnerSGDEOnOffParallelTest" << std::endl;
if (argc != 5) {
std::cout << "Usage:" << std::endl
<< "learnerSGDEOnOffParallelTest <trainDataFile> "
<< "<testDataFile> <batchSize> <refPeriod>" << std::endl;
return -1;
}
Specify the number of runs to perform. If only one specific example should be executed, set totalSets=1.
Get the training, test and validation data.
// Load training and test data from the files given on the command line.
// loadDataset is the file-reading helper sketched at the end of this example;
// its exact name is not shown in this excerpt.
sgpp::datadriven::Dataset trainDataset = loadDataset(argv[1]);
sgpp::datadriven::Dataset testDataset = loadDataset(argv[2]);
Specify the number of classes and the corresponding class labels.
size_t classNum = 2;
sgpp::base::DataVector classLabels(classNum);
classLabels[0] = -1;
classLabels[1] = 1;
The grid configuration.
std::cout << "# create grid config" << std::endl;
sgpp::base::RegularGridConfiguration gridConfig;
gridConfig.dim_ = trainDataset.getDimension();  // number of dimensions
gridConfig.level_ = 3;  // number of levels; example value, not shown in this excerpt
gridConfig.type_ = sgpp::base::GridType::Linear;
Configure the regularization.
std::cout << "# create regularization config" << std::endl;
sgpp::datadriven::RegularizationConfiguration regularizationConfig;
regularizationConfig.type_ = sgpp::datadriven::RegularizationType::Identity;  // example choice
regularizationConfig.lambda_ = 0.01;
Select the desired decomposition type for the offline step. Note that refinement and coarsening are only possible with the Cholesky decomposition.
std::string decompType = "Incomplete Cholesky decomposition on Dense Matrix";
std::cout << "Decomposition type: " << decompType << std::endl;
sgpp::datadriven::DensityEstimationConfiguration densityEstimationConfig;
densityEstimationConfig.decomposition_ = sgpp::datadriven::MatrixDecompositionType::DenseIchol;
densityEstimationConfig.iCholSweepsDecompose_ = 2;  // example values, not shown in this excerpt
densityEstimationConfig.iCholSweepsRefine_ = 2;
Configure adaptive refinement (only possible if the Cholesky decomposition is chosen). As refinement monitor, either the periodic monitor or the convergence monitor can be selected. Possible refinement indicators are surplus-based, data-based, and zero-crossings-based refinement.
std::cout << "# create adaptive refinement configuration" << std::endl;
std::string refMonitor = "periodic";
size_t refPeriod = 0;
parseInputValue(argv[4], refPeriod);
double accDeclineThreshold = 0.001;
size_t accDeclineBufferSize = 140;
size_t minRefInterval = 10;
std::cout << "Refinement monitor: " << refMonitor << std::endl;
std::string refType = "zero";
std::cout << "Refinement type: " << refType << std::endl;
Specify the number of refinement steps and the maximum number of grid points to refine in each step.
sgpp::base::AdaptivityConfiguration adaptivityConfig;
adaptivityConfig.numRefinements_ = 2;  // number of refinements; example value
adaptivityConfig.numRefinementPoints_ = 7;  // max. number of points to be refined; example value
adaptivityConfig.refinementThreshold_ = 0.0;  // refinement threshold for surpluses
double beta = 0.0;
bool usePrior = false;
size_t batchSize = 0;
parseInputValue(argv[3], batchSize);
sgpp::datadriven::RoundRobinScheduler scheduler(batchSize);  // MPI batch scheduler; constructor arguments assumed
Create the learner.
std::cout << "# create learner" << std::endl;
sgpp::datadriven::LearnerSGDEOnOffParallel learner(gridConfig,
adaptivityConfig, regularizationConfig, densityEstimationConfig, trainDataset,
testDataset, nullptr, classLabels, classNum, usePrior, beta, scheduler);
size_t maxDataPasses = 1;
Learn the data.
MPI_Barrier(MPI_COMM_WORLD);
std::cout << "# start to train the learner" << std::endl;
sgpp::base::SGppStopwatch stopwatch;
stopwatch.start();
learner.trainParallel(batchSize, maxDataPasses, refType, refMonitor, refPeriod,
accDeclineThreshold, accDeclineBufferSize, minRefInterval);
double deltaTime = stopwatch.stop();
MPI_Barrier(MPI_COMM_WORLD);
Accuracy on the test data.
double acc = learner.getAccuracy();
if (sgpp::datadriven::MPIMethods::isMaster()) {
std::cout << "# accuracy (test data): " << acc << std::endl;
std::cout << "# delta time training: " << deltaTime << std::endl;
} else {
std::cout << "# accuracy (client, test data): " << acc << std::endl;
std::cout << "# delta time training (client): " << deltaTime << std::endl;
}
#else
std::cout << "GSL not enabled at compile time" << std::endl;
#endif
#endif
}
#ifdef USE_MPI
// Helper that reads a dataset from file. The helper's name and the concrete
// reader call are not shown in this excerpt and are assumed here.
sgpp::datadriven::Dataset loadDataset(const std::string &filename) {
std::cout << "# loading file: " << filename << std::endl;
sgpp::datadriven::Dataset dataset = sgpp::datadriven::ARFFTools::readARFF(filename);
if (dataset.getNumberInstances() == 0) {
std::cout << "# Failed to read dataset! " << filename << std::endl;
exit(-1);
}
return dataset;
}
void parseInputValue(char *inputString, size_t &outputValue) {
std::stringstream argumentParser(inputString);
argumentParser >> outputValue;
if (argumentParser.fail()) {
// the original error handling is not shown in this excerpt; abort on unparsable input
std::cout << "# Failed to parse argument: " << inputString << std::endl;
exit(-1);
}
}
#endif