This example demonstrates density estimation.
#include <cstdint>
#include <iostream>
#include <random>
#include <string>
First, define an auxiliary function randu that fills a DataVector with a uniformly distributed random point, and a counterpart randn for normally distributed points (redrawn until they lie inside the unit interval).
void randu(DataVector& rvar, std::mt19937& generator) {
  std::uniform_real_distribution<double> distribution(0.0, 1.0);
  for (size_t j = 0; j < rvar.getSize(); ++j) {
    rvar[j] = distribution(generator);
  }
}

void randn(DataVector& rvar, std::mt19937& generator) {
  std::normal_distribution<double> distribution(0.5, 0.1);
  for (size_t j = 0; j < rvar.getSize(); ++j) {
    double value = -1.0;
    while (value <= 0.0 || value >= 1.0) {
      value = distribution(generator);
    }
    rvar[j] = value;
  }
}
Second, define overloads that call the functions above repeatedly and fill a matrix with random points (uniform and normal, respectively).
void randu(DataMatrix& rvar, std::uint64_t seedValue = std::mt19937_64::default_seed) {
  size_t nsamples = rvar.getNrows(), ndim = rvar.getNcols();
  std::mt19937 generator(static_cast<std::mt19937::result_type>(seedValue));
  DataVector sample(ndim);
  for (size_t i = 0; i < nsamples; ++i) {
    randu(sample, generator);
    rvar.setRow(i, sample);
  }
}

void randn(DataMatrix& rvar, std::uint64_t seedValue = std::mt19937_64::default_seed) {
  size_t nsamples = rvar.getNrows(), ndim = rvar.getNcols();
  std::mt19937 generator(static_cast<std::mt19937::result_type>(seedValue));
  DataVector sample(ndim);
  for (size_t i = 0; i < nsamples; ++i) {
    randn(sample, generator);
    rvar.setRow(i, sample);
  }
}
Now the main function begins by creating the test data.

int main(int argc, char** argv) {
Define the number of dimensions of the toy problem.
Load normally distributed samples.
Configure a sparse grid of level 3 with linear basis functions and the same dimensionality as the given test data.
Alternatively, a sparse grid that has been saved to a file can be loaded; see the commented line.
std::cout << "# create grid config" << std::endl;
sgpp::base::RegularGridConfiguration gridConfig;
gridConfig.dim_ = numDims;
gridConfig.level_ = 3;
gridConfig.type_ = sgpp::base::GridType::Linear;
Configure the adaptive refinement by specifying the number of refinement steps and the maximum number of points to refine per step.
std::cout << "# create adaptive refinement config" << std::endl;
sgpp::base::AdaptivityConfiguration adaptivityConfig;
Configure the solver. The solver type is set to the conjugate gradient method, and the maximum number of iterations, the tolerance eps and the threshold are specified.
std::cout << "# create solver config" << std::endl;
sgpp::solver::SLESolverConfiguration solverConfig;
solverConfig.eps_ = 1e-14;
Configure the regularization term, here based on the Laplacian operator.
std::cout << "# create regularization config" << std::endl;
sgpp::datadriven::RegularizationConfiguration regularizationConfig;
Configure the learner by specifying:
- whether k-fold cross-validation is enabled,
- an initial value for the Lagrangian multiplier \(\lambda\) and the interval \([\lambda_{Start}, \lambda_{End}]\) in which \(\lambda\) will be searched,
- whether a logarithmic scale is used,
- the shuffle parameter and an initial seed for the random number generation,
- whether parts of the output shall be suppressed.
std::cout << "# create learner config" << std::endl;
crossvalidationConfig.enable_ = false;
crossvalidationConfig.kfold_ = 3;
crossvalidationConfig.lambda_ = 3.16228e-06;
crossvalidationConfig.seed_ = 1234567;
crossvalidationConfig.silent_ = false;
Create the learner using the configurations set above. Then initialize it with the samples created earlier and train it.
std::cout << "# creating the learner" << std::endl;
sgpp::datadriven::SparseGridDensityEstimator learner(gridConfig, adaptivityConfig, solverConfig,
                                                     regularizationConfig, crossvalidationConfig,
                                                     sgdeConfig);
learner.initialize(samples);
Compare the sparse grid estimate of the probability density function (pdf) with a Gaussian kernel density estimation (KDE) by printing the pdf, mean and variance of both.
std::cout << "--------------------------------------------------------\n";
std::cout << learner.getSurpluses().getSize() << " -> " << learner.getSurpluses().sum() << "\n";
std::cout << "pdf_SGDE(x) = " << learner.pdf(x) << " ~ " << kde.pdf(x) << " =pdf_KDE(x)\n";
std::cout << "mean_SGDE(x) = " << learner.mean() << " ~ " << kde.mean() << " = mean_KDE(x)\n";
std::cout << "var_SGDE(x) = " << learner.variance() << " ~ " << kde.variance() << " = var_KDE(x)\n";
for (size_t idim = 0; idim < gridConfig.dim_; idim++) {
  bounds->set(idim, 0, 0.0);
  bounds->set(idim, 1, 1.0);
}
Print the covariances.
std::cout << "---------------------- Cov_SGDE ------------------------------" << std::endl;
learner.cov(C, bounds);
std::cout << C.toString() << std::endl;
std::cout << "---------------------- Cov KDE--------------------------------" << std::endl;
kde.cov(C);
std::cout << C.toString() << std::endl;
Apply the inverse Rosenblatt transformation to a matrix of random points. To do this, first generate the random points via randu, then initialize an inverse Rosenblatt transformation operation and apply it to the points. Finally, print the calculated values.
std::cout << "------------------------------------------------------" << std::endl;
randu(points);
std::cout << "------------------------------------------------------" << std::endl;
std::cout << "uniform space" << std::endl;
std::cout << points.toString() << std::endl;
opInvRos->doTransformation(&learner.getSurpluses(), &points, &pointsCdf);
To check whether the results are correct, apply the forward Rosenblatt transformation to the points produced by the inverse transformation above and print the calculated values; they should reproduce the original uniform points.
points.setAll(0.0);
opRos->doTransformation(&learner.getSurpluses(), &pointsCdf, &points);
std::cout << "------------------------------------------------------" << std::endl;
std::cout << "original space" << std::endl;
std::cout << pointsCdf.toString() << std::endl;
std::cout << "------------------------------------------------------" << std::endl;
std::cout << "uniform space" << std::endl;
std::cout << points.toString() << std::endl;
}