Major rework to improve code quality and add automation checks (#805)

* delete secant method - it is identical to regula falsi * document + improvize root finding algorithms * attempt to document gaussian elimination * added file brief * commented doxygen-mainpage, added files-list link * corrected files list link path * files-list link correction - this time works :) * document successive approximations * cleaner equation * updating DIRECTORY.md * documented kmp string search * document brute force string search * document rabin-karp string search * fixed mainpage readme * doxygen v1.8.18 will suppress out the #minipage in the markdown * cpplint correction for header guard style * github action to auto format source code per cpplint standard * updated setting to add 1 space before `private` and `public` keywords * auto rename files and auto format code * added missing "run" for step * corrected asignmemt operation * fixed trim and assign syntax * added git move for renaming bad filenames * added missing pipe for trim * added missing space * use old and new fnames * store old fname using echo * move files only if there is a change in filename * put old filenames in quotes * use double quote for old filename * escape double quotes * remove old_fname * try escape characters and echo" * add file-type to find * cleanup echo * ensure all trim variables are also in quotes * try escape -quote again * remove second escpe quote * use single quote for first check * use carets instead of quotes * put variables in brackets * remove -e from echo * add debug echos * try print0 flag * find command with while instead of for-loop * find command using IFS instead * 🎉 IFS fix worked - escaped quotes for git mv * protetc each word in git mv .. 
* filename exists in lower cases - renamed * 🎉 git push enabled * updating DIRECTORY.md * git pull & then push * formatting filenames d7af6fdc8c * formatting source-code for d7af6fdc8c * remove allman break before braces * updating DIRECTORY.md * added missing comma lost in previous commit * orchestrate all workflows * fix yml indentation * force push format changes, add title to DIRECTORY.md * pull before proceeding * reorganize pull commands * use master branches for actions * rename .cc files to .cpp * added class destructor to clean up dynamic memory allocation * rename to awesome workflow * commented whole repo cpplint - added modified files lint check * removed need for cpplint * attempt to use actions/checkout@master * temporary: no dependency on cpplint * formatting filenames 153fb7b8a5 * formatting source-code for 153fb7b8a5 * updating DIRECTORY.md * fix diff filename * added comments to the code * added test case * formatting source-code for a850308fba * updating DIRECTORY.md * added machine learning folder * added adaline algorithm * updating DIRECTORY.md * fixed issue [LWG2192](https://cplusplus.github.io/LWG/issue2192) for std::abs on MacOS * add cmath for same bug: [LWG2192](https://cplusplus.github.io/LWG/issue2192) for std::abs on MacOS * formatting source-code for f8925e4822 * use STL's inner_product * formatting source-code for f94a330594 * added range comments * define activation function * use equal initial weights * change test2 function to predict * activation function not friend * previous commit correction * added option for predict function to return value before applying activation function as optional argument * added test case to classify points lying within a sphere * improve documentation for adaline * formatting source-code for 15ec4c3aba * added cmake to geometry folder * added algorithm include for std::max * add namespace - machine_learning * add namespace - statistics * add namespace - sorting * added sorting algos to namespace 
sorting * added namespace string_search * formatting source-code for fd69530515 * added documentation to string_search namespace * feat: Add BFS and DFS algorithms to check for cycle in a directed graph * Remove const references for input of simple types Reason: overhead on access * fix bad code sorry for force push * Use pointer instead of the non-const reference because apparently google says so. * Remove a useless and possibly bad Graph constuctor overload * Explicitely specify type of vector during graph instantiation * updating DIRECTORY.md * find openMP before adding subdirectories * added kohonen self organizing map * updating DIRECTORY.md * remove older files and folders from gh-pages before adding new files * remove chronos library due to inacceptability by cpplint * use c++ specific static_cast instead * initialize radom number generator * updated image links with those from CPP repository * rename computer.... folder to numerical methods * added durand kerner method for root computation for arbitrarily large polynomials * fixed additional comma * fix cpplint errors * updating DIRECTORY.md * convert to function module * update documentation * move openmp to main loop * added two test cases * use INT16_MAX * remove return statement from omp-for loop and use "break" * run tests when no input is provided and skip tests when input polynomial is provided * while loop cannot have break - replaced with continue and check is present in the main while condition * (1) break while loop (2) skip runs on break_loop instead of hard-break * add documentation images * use long double for errors and tolerance checks * make iterator variable i local to threads * add critical secions to omp threads * bugfix: move file writing outside of the parallel loop othersie, there is no gurantee of the order of roots written to file * rename folder to data_structures * updating DIRECTORY.md * fix ambiguous symbol `size` * add data_structures to cmake * docs: enable tree view, add 
timestamp in footer, try clang assistaed parsing * doxygen - open links in external window * remove invalid parameter from function docs * use HTML5 img tag to resize images * move file to proper folder * fix documentations and cpplint * formatting source-code for aacaf9828c * updating DIRECTORY.md * cpplint: add braces for multiple statement if * add explicit link to badges * remove duplicate line Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * remove namespace indentation * remove file associations in settings * add author name * enable cmake in subfolders of data_structures * create and link object file * cpp lint fixes and instantiate template classes * cpp lint fixes and instantiate template classes Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * cpplint - ignore `build/include` Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * disable redundant gcc compilation in cpplint workflow Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * template header files contain function codes as well and removed redundant subfolders Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * updating DIRECTORY.md * remove semicolons after functions in a class Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * cpplint header guard style Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * remove semilon Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * added LU decomposition algorithm Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * added QR decomposition algorithm Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * use QR decomposition to find eigen values Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * updating DIRECTORY.md * use std::rand for thread safety Signed-off-by: Krishna Vedala 
<7001608+kvedala@users.noreply.github.com> * move srand to main() Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * cpplint braces correction Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * updated eigen value documentation Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * fix matrix shift doc Signed-off-by: Krishna Vedala <7001608+kvedala@users.noreply.github.com> * rename CONTRIBUTION.md to CONTRIBUTING.md #836 * remove 'sort alphabetical order' check * added documentation check * remove extra paranthesis * added gitpod * added gitpod link from README * attempt to add vscode gitpod extensions * update gitpod extensions * add gitpod extensions cmake-tools and git-graph * remove gitpod init and add commands * use init to one time install doxygen, graphviz, cpplint * use gitpod dockerfile * add ninja build system to docker * remove configure task * add github prebuild specs to gitpod * disable gitpod addcommit * update documentation for kohonen_som * added ode solve using forward euler method * added mid-point euler ode solver * fixed itegration step equation * added semi-implicit euler ODE solver * updating DIRECTORY.md * fix cpplint issues - lines 117 and 124 * added documentation to ode group * corrected semi-implicit euler function * updated docs and test cases better structure * replace `free` with `delete` operator * formatting source-code for f55ab50cf2 * updating DIRECTORY.md * main function must return * added machine learning group * added kohonen som topology algorithm * fix graph image path * updating DIRECTORY.md * fix braces * use snprintf instead of sprintf * use static_cast * hardcode character buffer size * fix machine learning groups in documentation * fix missing namespace function * replace kvedala fork references to TheAlgorithms * fix bug in counting_sort Co-authored-by: github-actions <${GITHUB_ACTOR}@users.noreply.github.com> Co-authored-by: Anmol3299 
<mittalanmol22@gmail.com>
2026-04-08 21:19:52 +08:00 · 2020-06-19 12:04:56 -04:00
parent 70a2aeedc3
commit aaa08b0150
313 changed files with 49332 additions and 9833 deletions
--- a/machine_learning/CMakeLists.txt
+++ b/machine_learning/CMakeLists.txt
@@ -0,0 +1,18 @@
+# If necessary, use the RELATIVE flag, otherwise each source file may be listed
+# with full pathname. RELATIVE may make it easier to extract an executable name
+# automatically.
+file( GLOB APP_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} *.cpp )
+# file( GLOB APP_SOURCES ${CMAKE_SOURCE_DIR}/*.c )
+# AUX_SOURCE_DIRECTORY(${CMAKE_CURRENT_SOURCE_DIR} APP_SOURCES)
+foreach( testsourcefile ${APP_SOURCES} )
+    # I used a simple string replace, to cut off .cpp.
+    string( REPLACE ".cpp" "" testname ${testsourcefile} )
+    add_executable( ${testname} ${testsourcefile} )
+
+    set_target_properties(${testname} PROPERTIES LINKER_LANGUAGE CXX)
+    if(OpenMP_CXX_FOUND)
+        target_link_libraries(${testname} OpenMP::OpenMP_CXX)
+    endif()
+    install(TARGETS ${testname} DESTINATION "bin/machine_learning")
+
+endforeach( testsourcefile ${APP_SOURCES} )
--- a/machine_learning/adaline_learning.cpp
+++ b/machine_learning/adaline_learning.cpp
@@ -0,0 +1,351 @@
+/**
+ * \addtogroup machine_learning Machine Learning Algorithms
+ * @{
+ * \file
+ * \brief [Adaptive Linear Neuron
+ * (ADALINE)](https://en.wikipedia.org/wiki/ADALINE) implementation
+ *
+ * \author [Krishna Vedala](https://github.com/kvedala)
+ *
+ * <img
+ * src="https://upload.wikimedia.org/wikipedia/commons/b/be/Adaline_flow_chart.gif"
+ * width="200px">
+ * [source](https://commons.wikimedia.org/wiki/File:Adaline_flow_chart.gif)
+ * ADALINE is one of the first and simplest single layer artificial neural
+ * networks. The algorithm essentially implements a linear function
+ * \f[ f\left(x_0,x_1,x_2,\ldots\right) =
+ * \sum_j x_jw_j+\theta
+ * \f]
+ * where \f$x_j\f$ are the input features of a sample, \f$w_j\f$ are the
+ * coefficients of the linear function and \f$\theta\f$ is a constant. If we
+ * know the \f$w_j\f$, then the output \f$y=f(x_0,x_1,\ldots)\f$ can be computed
+ * for any given set of features. Computing the \f$w_j\f$ is a supervised
+ * learning problem wherein a set of features and their corresponding outputs
+ * are given and the weights are computed using stochastic gradient descent.
+ */
+
+#include <cassert>
+#include <climits>
+#include <cmath>
+#include <cstdlib>
+#include <ctime>
+#include <iostream>
+#include <numeric>
+#include <vector>
+
+#define MAX_ITER 500  ///< Maximum number of iterations to learn
+
+/** \namespace machine_learning
+ * \brief Machine learning algorithms
+ */
+namespace machine_learning {
+class adaline {
+ public:
+    /**
+     * Default constructor
+     * \param[in] num_features number of features present
+     * \param[in] eta learning rate (optional, default=0.01)
+     * \param[in] accuracy convergence accuracy (optional,
+     * default=\f$1\times10^{-5}\f$)
+     */
+    adaline(int num_features, const double eta = 0.01,
+            const double accuracy = 1e-5)
+        : eta(eta), accuracy(accuracy) {
+        if (eta <= 0) {
+            std::cerr << "learning rate should be positive and nonzero"
+                      << std::endl;
+            std::exit(EXIT_FAILURE);
+        }
+
+        weights = std::vector<double>(
+            num_features +
+            1);  // additional weight is for the constant bias term
+
+        // initialize all weights to 1 (alternative below: random in [-50, 49])
+        for (int i = 0; i < weights.size(); i++) weights[i] = 1.f;
+        // weights[i] = (static_cast<double>(std::rand() % 100) - 50);
+    }
+
+    /**
+     * Operator to print the weights of the model
+     */
+    friend std::ostream &operator<<(std::ostream &out, const adaline &ada) {
+        out << "<";
+        for (int i = 0; i < ada.weights.size(); i++) {
+            out << ada.weights[i];
+            if (i < ada.weights.size() - 1)
+                out << ", ";
+        }
+        out << ">";
+        return out;
+    }
+
+    /**
+     * predict the output of the model for given set of features
+     * \param[in] x input vector
+     * \param[out] out optional argument to return neuron output before
+     * applying activation function (optional, `nullptr` to ignore)
+     * \returns model prediction output
+     */
+    int predict(const std::vector<double> &x, double *out = nullptr) {
+        if (!check_size_match(x))
+            return 0;
+
+        double y = weights.back();  // assign bias value
+
+        // for (int i = 0; i < x.size(); i++) y += x[i] * weights[i];
+        y = std::inner_product(x.begin(), x.end(), weights.begin(), y);
+
+        if (out != nullptr)  // if out variable is provided
+            *out = y;
+
+        return activation(y);  // quantizer: apply ADALINE threshold function
+    }
+
+    /** Update model weights using supervised learning for one feature vector
+     * \param[in] x feature vector
+     * \param[in] y known output value
+     * \returns correction factor
+     */
+    double fit(const std::vector<double> &x, const int &y) {
+        if (!check_size_match(x))
+            return 0;
+
+        /* output of the model with current weights */
+        int p = predict(x);
+        int prediction_error = y - p;  // error in estimation
+        double correction_factor = eta * prediction_error;
+
+        /* update each weight, the last weight is the bias term */
+        for (int i = 0; i < x.size(); i++) {
+            weights[i] += correction_factor * x[i];
+        }
+        weights[x.size()] += correction_factor;  // update bias
+
+        return correction_factor;
+    }
+
+    /**
+     * Update model weights using supervised learning for an array of vectors
+     * \param[in] X array of feature vectors
+     * \param[in] y known output value for each feature vector
+     */
+    template <int N>
+    void fit(std::vector<double> const (&X)[N], const int *y) {
+        double avg_pred_error = 1.f;
+
+        int iter;
+        for (iter = 0; (iter < MAX_ITER) && (avg_pred_error > accuracy);
+             iter++) {
+            avg_pred_error = 0.f;
+
+            // perform fit for each sample
+            for (int i = 0; i < N; i++) {
+                double err = fit(X[i], y[i]);
+                avg_pred_error += std::abs(err);
+            }
+            avg_pred_error /= N;
+
+            // Print updates; enable the check below to print every 100th only
+            // if (iter % 100 == 0)
+            std::cout << "\tIter " << iter << ": Training weights: " << *this
+                      << "\tAvg error: " << avg_pred_error << std::endl;
+        }
+
+        if (iter < MAX_ITER)
+
+            std::cout << "Converged after " << iter << " iterations."
+                      << std::endl;
+        else
+            std::cout << "Did not converge after " << iter << " iterations."
+                      << std::endl;
+    }
+
+    int activation(double x) { return x > 0 ? 1 : -1; }
+
+ private:
+    /**
+     * convenience function to check if input feature vector size matches the
+     * model weights size
+     * \param[in] x feature vector to check
+     * \returns `true` if size matches
+     * \returns `false` if size does not match
+     */
+    bool check_size_match(const std::vector<double> &x) {
+        if (x.size() != (weights.size() - 1)) {
+            std::cerr << __func__ << ": "
+                      << "Number of features in x does not match the feature "
+                         "dimension in model!"
+                      << std::endl;
+            return false;
+        }
+        return true;
+    }
+
+    const double eta;             ///< learning rate of the algorithm
+    const double accuracy;        ///< model fit convergence accuracy
+    std::vector<double> weights;  ///< weights of the neural network
+};
+
+}  // namespace machine_learning
+
+using machine_learning::adaline;
+
+/** @} */
+
+/**
+ * test function to predict points in a 2D coordinate system above the line
+ * \f$x=y\f$ as +1 and others as -1.
+ * Note that each point is defined by 2 values or 2 features.
+ * \param[in] eta learning rate (optional, default=0.01)
+ */
+void test1(double eta = 0.01) {
+    adaline ada(2, eta);  // 2 features
+
+    const int N = 10;  // number of sample points
+
+    std::vector<double> X[N] = {{0, 1},  {1, -2},   {2, 3},   {3, -1},
+                                {4, 1},  {6, -5},   {-7, -3}, {-8, 5},
+                                {-9, 2}, {-10, -15}};
+    int y[] = {1, -1, 1, -1, -1, -1, 1, 1, 1, -1};  // corresponding y-values
+
+    std::cout << "------- Test 1 -------" << std::endl;
+    std::cout << "Model before fit: " << ada << std::endl;
+
+    ada.fit(X, y);
+    std::cout << "Model after fit: " << ada << std::endl;
+
+    int predict = ada.predict({5, -3});
+    std::cout << "Predict for x=(5,-3): " << predict;
+    assert(predict == -1);
+    std::cout << " ...passed" << std::endl;
+
+    predict = ada.predict({5, 8});
+    std::cout << "Predict for x=(5,8): " << predict;
+    assert(predict == 1);
+    std::cout << " ...passed" << std::endl;
+}
+
+/**
+ * test function to predict points in a 2D coordinate system above the line
+ * \f$x+3y=-1\f$ as +1 and others as -1.
+ * Note that each point is defined by 2 values or 2 features.
+ * The function will create random sample points for training and test purposes.
+ * \param[in] eta learning rate (optional, default=0.01)
+ */
+void test2(double eta = 0.01) {
+    adaline ada(2, eta);  // 2 features
+
+    const int N = 50;  // number of sample points
+
+    std::vector<double> X[N];
+    int Y[N];  // corresponding y-values
+
+    // generate sample points in the interval
+    // [-range2/100 , (range2-1)/100]
+    int range = 500;          // sample points full-range
+    int range2 = range >> 1;  // sample points half-range
+    for (int i = 0; i < N; i++) {
+        double x0 = ((std::rand() % range) - range2) / 100.f;
+        double x1 = ((std::rand() % range) - range2) / 100.f;
+        X[i] = {x0, x1};
+        Y[i] = (x0 + 3. * x1) > -1 ? 1 : -1;
+    }
+
+    std::cout << "------- Test 2 -------" << std::endl;
+    std::cout << "Model before fit: " << ada << std::endl;
+
+    ada.fit(X, Y);
+    std::cout << "Model after fit: " << ada << std::endl;
+
+    int N_test_cases = 5;
+    for (int i = 0; i < N_test_cases; i++) {
+        double x0 = ((std::rand() % range) - range2) / 100.f;
+        double x1 = ((std::rand() % range) - range2) / 100.f;
+
+        int predict = ada.predict({x0, x1});
+
+        std::cout << "Predict for x=(" << x0 << "," << x1 << "): " << predict;
+
+        int expected_val = (x0 + 3. * x1) > -1 ? 1 : -1;
+        assert(predict == expected_val);
+        std::cout << " ...passed" << std::endl;
+    }
+}
+
+/**
+ * test function to predict points in a 3D coordinate system lying within the
+ * sphere of radius 1 and centre at origin as +1 and others as -1. Note that
+ * each point is defined by 3 values but we use 6 features. The function will
+ * create random sample points for training and test purposes.
+ * The sphere centred at origin and radius 1 is defined as:
+ * \f$x^2+y^2+z^2=r^2=1\f$. If \f$x^2+y^2+z^2\le1\f$, the point lies within
+ * the sphere; otherwise, it lies outside.
+ *
+ * \param[in] eta learning rate (optional, default=0.01)
+ */
+void test3(double eta = 0.01) {
+    adaline ada(6, eta);  // 6 features
+
+    const int N = 100;  // number of sample points
+
+    std::vector<double> X[N];
+    int Y[N];  // corresponding y-values
+
+    // generate sample points in the interval
+    // [-range2/100 , (range2-1)/100]
+    int range = 200;          // sample points full-range
+    int range2 = range >> 1;  // sample points half-range
+    for (int i = 0; i < N; i++) {
+        double x0 = ((std::rand() % range) - range2) / 100.f;
+        double x1 = ((std::rand() % range) - range2) / 100.f;
+        double x2 = ((std::rand() % range) - range2) / 100.f;
+        X[i] = {x0, x1, x2, x0 * x0, x1 * x1, x2 * x2};
+        Y[i] = ((x0 * x0) + (x1 * x1) + (x2 * x2)) <= 1.f ? 1 : -1;
+    }
+
+    std::cout << "------- Test 3 -------" << std::endl;
+    std::cout << "Model before fit: " << ada << std::endl;
+
+    ada.fit(X, Y);
+    std::cout << "Model after fit: " << ada << std::endl;
+
+    int N_test_cases = 5;
+    for (int i = 0; i < N_test_cases; i++) {
+        double x0 = ((std::rand() % range) - range2) / 100.f;
+        double x1 = ((std::rand() % range) - range2) / 100.f;
+        double x2 = ((std::rand() % range) - range2) / 100.f;
+
+        int predict = ada.predict({x0, x1, x2, x0 * x0, x1 * x1, x2 * x2});
+
+        std::cout << "Predict for x=(" << x0 << "," << x1 << "," << x2
+                  << "): " << predict;
+
+        int expected_val = ((x0 * x0) + (x1 * x1) + (x2 * x2)) <= 1.f ? 1 : -1;
+        assert(predict == expected_val);
+        std::cout << " ...passed" << std::endl;
+    }
+}
+
+/** Main function */
+int main(int argc, char **argv) {
+    std::srand(std::time(nullptr));  // initialize random number generator
+
+    double eta = 0.1;  // default value of eta
+    if (argc == 2)     // read eta value from commandline argument if present
+        eta = std::strtod(argv[1], nullptr);
+
+    test1(eta);
+
+    std::cout << "Press ENTER to continue..." << std::endl;
+    std::cin.get();
+
+    test2(eta);
+
+    std::cout << "Press ENTER to continue..." << std::endl;
+    std::cin.get();
+
+    test3(eta);
+
+    return 0;
+}
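The file header above describes training by stochastic gradient descent, while `fit` in this diff derives its correction from the thresholded prediction. For comparison, the following is a minimal sketch of a single Widrow-Hoff (LMS) step, where the error is taken on the linear output before thresholding. It is not code from this commit: the names `linear_output`, `lms_step` and `quantize` are hypothetical, and the only convention borrowed from the diff is that the bias is stored as the last weight.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the commit): linear neuron output
// sum_j x_j * w_j + bias, with the bias stored as the last element of w.
double linear_output(const std::vector<double> &w,
                     const std::vector<double> &x) {
    return std::inner_product(x.begin(), x.end(), w.begin(), w.back());
}

// One Widrow-Hoff (LMS) step for a single sample. The error is measured on
// the linear output, i.e. before any thresholding is applied.
// Returns the error that was observed before the weights were updated.
double lms_step(std::vector<double> *w, const std::vector<double> &x,
                double target, double eta) {
    assert(w->size() == x.size() + 1);  // one weight per feature + bias

    double error = target - linear_output(*w, x);
    for (std::size_t j = 0; j < x.size(); j++) {
        (*w)[j] += eta * error * x[j];
    }
    w->back() += eta * error;  // the bias sees a constant input of 1
    return error;
}

// Threshold applied only when a class label is needed.
int quantize(double y) { return y > 0 ? 1 : -1; }
```

Repeating `lms_step` on the same sample shrinks the error by a factor of `1 - eta * (|x|^2 + 1)` per step, so the step is only stable when `eta * (|x|^2 + 1)` stays below 2. This is one reason a small learning rate matters for samples with large feature values.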
--- a/machine_learning/kohonen_som_topology.cpp
+++ b/machine_learning/kohonen_som_topology.cpp
@@ -0,0 +1,595 @@
+/**
+ * \addtogroup machine_learning Machine Learning Algorithms
+ * @{
+ * \file
+ * \author [Krishna Vedala](https://github.com/kvedala)
+ * \brief [Kohonen self organizing
+ * map](https://en.wikipedia.org/wiki/Self-organizing_map) (topological map)
+ *
+ * This example implements a powerful unsupervised learning algorithm called a
+ * self organizing map. The algorithm creates a connected network of weights
+ * that closely follows the given data points. This thus creates a topological
+ * map of the given data i.e., it maintains the relationship between various
+ * data points in a much higher dimensional space by creating an equivalent in a
+ * 2-dimensional space.
+ * <img alt="Trained topological maps for the test cases in the program"
+ * src="https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/docs/images/machine_learning/2D_Kohonen_SOM.svg"
+ * />
+ * \note This C++ version of the program is considerably slower than its [C
+ * counterpart](https://github.com/kvedala/C/blob/master/machine_learning/kohonen_som_trace.c)
+ * \note The compiled code is much slower when compiled with MS Visual C++ 2019
+ * than with GCC on Windows
+ * \see kohonen_som_trace.cpp
+ */
+#define _USE_MATH_DEFINES  // required for MS Visual C++
+#include <algorithm>
+#include <cmath>
+#include <cstdlib>
+#include <ctime>
+#include <fstream>
+#include <iostream>
+#include <valarray>
+#include <vector>
+#ifdef _OPENMP  // check if OpenMP based parallelization is available
+#include <omp.h>
+#endif
+
+/**
+ * Helper function to generate a random number in a given interval.
+ * \n Steps:
+ * 1. `r1 = rand() % 100` gets a random number between 0 and 99
+ * 2. `r2 = r1 / 100` converts random number to be between 0 and 0.99
+ * 3. scale and offset the random number to the given range \f$[a,b)\f$
+ *
+ * \param[in] a lower limit
+ * \param[in] b upper limit (exclusive)
+ * \returns random number in the range \f$[a,b)\f$
+ */
+double _random(double a, double b) {
+    return ((b - a) * (std::rand() % 100) / 100.f) + a;
+}
+
+/**
+ * Save a given n-dimensional data matrix to file.
+ *
+ * \param[in] fname filename to save in (gets overwritten without confirmation)
+ * \param[in] X matrix to save
+ * \returns 0 if all ok
+ * \returns -1 if file creation failed
+ */
+int save_2d_data(const char *fname,
+                 const std::vector<std::valarray<double>> &X) {
+    size_t num_points = X.size();       // number of rows
+    size_t num_features = X[0].size();  // number of columns
+
+    std::ofstream fp;
+    fp.open(fname);
+    if (!fp.is_open()) {
+        // error with opening file to write
+        std::cerr << "Error opening file " << fname << "\n";
+        return -1;
+    }
+
+    // for each point in the array
+    for (int i = 0; i < num_points; i++) {
+        // for each feature in the array
+        for (int j = 0; j < num_features; j++) {
+            fp << X[i][j];             // print the feature value
+            if (j < num_features - 1)  // if not the last feature
+                fp << ",";             // suffix comma
+        }
+        if (i < num_points - 1)  // if not the last row
+            fp << "\n";          // start a new line
+    }
+
+    fp.close();
+    return 0;
+}
+
+/**
+ * Get the minimum value in a matrix and the indices at which it occurs.
+ *
+ * \param[in] X matrix to search
+ * \param[out] val minimum value found
+ * \param[out] x_idx x-index where minimum value was found
+ * \param[out] y_idx y-index where minimum value was found
+ */
+void get_min_2d(const std::vector<std::valarray<double>> &X, double *val,
+                int *x_idx, int *y_idx) {
+    val[0] = INFINITY;  // initial min value
+    int N = X.size();
+
+    for (int i = 0; i < N; i++) {  // traverse each x-index
+        auto result = std::min_element(std::begin(X[i]), std::end(X[i]));
+        double d_min = *result;
+        int j = std::distance(std::begin(X[i]), result);
+
+        if (d_min < val[0]) {  // if a lower value is found
+                               // save the value and its index
+            x_idx[0] = i;
+            y_idx[0] = j;
+            val[0] = d_min;
+        }
+    }
+}
+
+/** \namespace machine_learning
+ * \brief Machine learning algorithms
+ */
+namespace machine_learning {
+#define MIN_DISTANCE 1e-4  ///< Minimum average distance of image nodes
+
+/**
+ * Create the distance matrix or
+ * [U-matrix](https://en.wikipedia.org/wiki/U-matrix) from the trained
+ * 3D weights matrix and save to disk.
+ *
+ * \param [in] fname filename to save in (gets overwritten without
+ * confirmation)
+ * \param [in] W model matrix to save
+ * \returns 0 if all ok
+ * \returns -1 if file creation failed
+ */
+int save_u_matrix(const char *fname,
+                  const std::vector<std::vector<std::valarray<double>>> &W) {
+    std::ofstream fp(fname);
+    if (!fp) {  // error opening the file
+        char msg[120];
+        std::snprintf(msg, sizeof(msg), "File error (%s): ", fname);
+        std::perror(msg);
+        return -1;
+    }
+
+    // neighborhood range
+    int R = 1;
+
+    for (int i = 0; i < W.size(); i++) {         // for each x
+        for (int j = 0; j < W[0].size(); j++) {  // for each y
+            double distance = 0.f;
+
+            int from_x = std::max<int>(0, i - R);
+            int to_x = std::min<int>(W.size(), i + R + 1);
+            int from_y = std::max<int>(0, j - R);
+            int to_y = std::min<int>(W[0].size(), j + R + 1);
+            int l;
+#ifdef _OPENMP
+#pragma omp parallel for reduction(+ : distance)
+#endif
+            for (l = from_x; l < to_x; l++) {      // scan neighborhood in x
+                for (int m = from_y; m < to_y; m++) {  // scan neighborhood in y
+                    auto d = W[i][j] - W[l][m];
+                    double d2 = std::pow(d, 2).sum();
+                    distance += std::sqrt(d2);
+                    // distance += d2;
+                }
+            }
+
+            distance /= R * R;          // mean distance from neighbors
+            fp << distance;             // print the mean separation
+            if (j < W[0].size() - 1) {  // if not the last column
+                fp << ',';              // suffix comma
+            }
+        }
+        if (i < W.size() - 1)  // if not the last row
+            fp << '\n';        // start a new line
+    }
+
+    fp.close();
+    return 0;
+}
+
+/**
+ * Update weights of the SOM using Kohonen algorithm
+ *
+ * \param[in] X data point - N features
+ * \param[in,out] W weights matrix - PxQxN
+ * \param[in,out] D temporary vector to store distances PxQ
+ * \param[in] alpha learning rate \f$0<\alpha\le1\f$
+ * \param[in] R neighborhood range
+ * \returns minimum distance of sample and trained weights
+ */
+double update_weights(const std::valarray<double> &X,
+                      std::vector<std::vector<std::valarray<double>>> *W,
+                      std::vector<std::valarray<double>> *D, double alpha,
+                      int R) {
+    int x, y;
+    int num_out_x = static_cast<int>(W->size());       // output nodes - in X
+    int num_out_y = static_cast<int>((*W)[0].size());  // output nodes - in Y
+    int num_features = static_cast<int>((*W)[0][0].size());  //  features = in Z
+    double d_min = 0.f;
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    // step 1: for each output point
+    for (x = 0; x < num_out_x; x++) {
+        for (y = 0; y < num_out_y; y++) {
+            (*D)[x][y] = 0.f;
+            // compute Euclidean distance of each output
+            // point from the current sample
+            auto d = ((*W)[x][y] - X);
+            (*D)[x][y] = (d * d).sum();
+            (*D)[x][y] = std::sqrt((*D)[x][y]);
+        }
+    }
+
+    // step 2:  get closest node i.e., node with smallest Euclidean distance
+    // to the current pattern
+    int d_min_x, d_min_y;
+    get_min_2d(*D, &d_min, &d_min_x, &d_min_y);
+
+    // step 3a: get the neighborhood range
+    int from_x = std::max(0, d_min_x - R);
+    int to_x = std::min(num_out_x, d_min_x + R + 1);
+    int from_y = std::max(0, d_min_y - R);
+    int to_y = std::min(num_out_y, d_min_y + R + 1);
+
+    // step 3b: update the weights of nodes in the
+    // neighborhood
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (x = from_x; x < to_x; x++) {
+        for (y = from_y; y < to_y; y++) {
+            /* you can enable the following normalization if needed.
+               personally, I found it detrimental to convergence */
+            // const double s2pi = sqrt(2.f * M_PI);
+            // double normalize = 1.f / (alpha * s2pi);
+
+            /* apply scaling inversely proportional to distance from the
+               current node */
+            double d2 =
+                (d_min_x - x) * (d_min_x - x) + (d_min_y - y) * (d_min_y - y);
+            double scale_factor = std::exp(-d2 / (2.f * alpha * alpha));
+
+            (*W)[x][y] += (X - (*W)[x][y]) * alpha * scale_factor;
+        }
+    }
+    return d_min;
+}
+
+/**
+ * Apply incremental algorithm with updating neighborhood and learning
+ * rates on all samples in the given dataset.
+ *
+ * \param[in] X data set
+ * \param[in,out] W weights matrix
+ * \param[in] alpha_min terminal value of alpha
+ */
+void kohonen_som(const std::vector<std::valarray<double>> &X,
+                 std::vector<std::vector<std::valarray<double>>> *W,
+                 double alpha_min) {
+    int num_samples = X.size();      // number of rows
+    int num_features = X[0].size();  // number of columns
+    int num_out = W->size();         // output matrix size
+    int R = num_out >> 2, iter = 0;
+    double alpha = 1.f;
+
+    std::vector<std::valarray<double>> D(num_out);
+    for (int i = 0; i < num_out; i++) D[i] = std::valarray<double>(num_out);
+
+    double dmin = 1.f;        // average minimum distance of all samples
+    double past_dmin = 1.f;   // dmin from the previous iteration
+    double dmin_ratio = 1.f;  // change per step
+
+    // Loop alpha from 1 down to alpha_min
+    for (; alpha > alpha_min && dmin_ratio > 1e-5; alpha -= 1e-4, iter++) {
+        // Loop for each sample pattern in the data set
+        for (int sample = 0; sample < num_samples; sample++) {
+            // update weights for the current input pattern sample
+            dmin += update_weights(X[sample], W, &D, alpha, R);
+        }
+
+        // every 300th iteration, reduce the neighborhood range
+        if (iter % 300 == 0 && R > 1)
+            R--;
+
+        dmin /= num_samples;
+
+        // termination condition variable -> % change in minimum distance
+        dmin_ratio = (past_dmin - dmin) / past_dmin;
+        if (dmin_ratio < 0)
+            dmin_ratio = 1.f;
+        past_dmin = dmin;
+
+        std::cout << "iter: " << iter << "\t alpha: " << alpha << "\t R: " << R
+                  << "\t d_min: " << dmin_ratio << "\r";
+    }
+
+    std::cout << "\n";
+}
+
+}  // namespace machine_learning
+
+using machine_learning::kohonen_som;
+using machine_learning::save_u_matrix;
+
+/** @} */
+
+/** Creates a random set of points distributed in four clusters in
+ * 2D space with centroids at the points
+ * * \f$(0.5, 0.5)\f$
+ * * \f$(0.5, -0.5)\f$
+ * * \f$(-0.5, 0.5)\f$
+ * * \f$(-0.5, -0.5)\f$
+ *
+ * \param[out] data matrix to store data in
+ */
+void test_2d_classes(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double R = 0.3;  // radius of cluster
+    int i;
+    const int num_classes = 4;
+    const double centres[][2] = {
+        // centres of each class cluster
+        {.5, .5},   // centre of class 1
+        {.5, -.5},  // centre of class 2
+        {-.5, .5},  // centre of class 3
+        {-.5, -.5}  // centre of class 4
+    };
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        // select a random class for the point
+        int cls = std::rand() % num_classes;
+
+        // create random coordinates (x,y) around the centre of the class
+        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
+        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
+
+        /* The following can also be used
+        for (int j = 0; j < 2; j++)
+            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
+        */
+    }
+}
+
+/** Test that creates a random set of points distributed in four clusters in
+ * 2D space and trains an SOM that finds the topological pattern.
+ * The following [CSV](https://en.wikipedia.org/wiki/Comma-separated_values)
+ * files are created to validate the execution:
+ * * `test1.csv`: random test sample points in four 2D clusters
+ * * `w11.csv`: initial random map
+ * * `w12.csv`: trained SOM map
+ */
+void test1() {
+    int j, N = 300;
+    int features = 2;
+    int num_out = 30;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::vector<std::valarray<double>>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::vector<std::valarray<double>>(num_out);
+            for (int k = 0; k < num_out; k++) {
+                W[i][k] = std::valarray<double>(features);
+#ifdef _OPENMP
+#pragma omp for
+#endif
+                for (j = 0; j < features; j++)
+                    // preallocate with random initial weights
+                    W[i][k][j] = _random(-10, 10);
+            }
+        }
+    }
+
+    test_2d_classes(&X);           // create test data in four 2D clusters
+    save_2d_data("test1.csv", X);  // save test data points
+    save_u_matrix("w11.csv", W);   // save initial random weights
+    kohonen_som(X, &W, 1e-4);      // train the SOM
+    save_u_matrix("w12.csv", W);   // save the resultant weights
+}
+
+/** Creates a random set of points distributed in four clusters in
+ * 3D space with centroids at the points
+ * * \f$(0.5, 0.5, 0.5)\f$
+ * * \f$(0.5, -0.5, -0.5)\f$
+ * * \f$(-0.5, 0.5, 0.5)\f$
+ * * \f$(-0.5, -0.5, -0.5)\f$
+ *
+ * \param[out] data matrix to store data in
+ */
+void test_3d_classes1(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double R = 0.3;  // radius of cluster
+    int i;
+    const int num_classes = 4;
+    const double centres[][3] = {
+        // centres of each class cluster
+        {.5, .5, .5},    // centre of class 1
+        {.5, -.5, -.5},  // centre of class 2
+        {-.5, .5, .5},   // centre of class 3
+        {-.5, -.5, -.5}  // centre of class 4
+    };
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        // select a random class for the point
+        int cls = std::rand() % num_classes;
+
+        // create random coordinates (x,y,z) around the centre of the class
+        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
+        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
+        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);
+
+        /* The following can also be used
+        for (int j = 0; j < 3; j++)
+            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
+        */
+    }
+}
+
+/** Test that creates a random set of points distributed in 4 clusters in
+ * 3D space and trains an SOM that finds the topological pattern. The following
+ * [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) files are created
+ * to validate the execution:
+ * * `test2.csv`: random test sample points in four 3D clusters
+ * * `w21.csv`: initial random map
+ * * `w22.csv`: trained SOM map
+ */
+void test2() {
+    int j, N = 300;
+    int features = 3;
+    int num_out = 30;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::vector<std::valarray<double>>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::vector<std::valarray<double>>(num_out);
+            for (int k = 0; k < num_out; k++) {
+                W[i][k] = std::valarray<double>(features);
+#ifdef _OPENMP
+#pragma omp for
+#endif
+                for (j = 0; j < features; j++)
+                    // preallocate with random initial weights
+                    W[i][k][j] = _random(-10, 10);
+            }
+        }
+    }
+
+    test_3d_classes1(&X);          // create test data in four 3D clusters
+    save_2d_data("test2.csv", X);  // save test data points
+    save_u_matrix("w21.csv", W);   // save initial random weights
+    kohonen_som(X, &W, 1e-4);      // train the SOM
+    save_u_matrix("w22.csv", W);   // save the resultant weights
+}
+
+/** Creates a random set of points distributed in eight clusters in
+ * 3D space with centroids at the eight corners of a cube:
+ * * \f$(0.5, 0.5, \pm 0.5)\f$
+ * * \f$(0.5, -0.5, \pm 0.5)\f$
+ * * \f$(-0.5, 0.5, \pm 0.5)\f$
+ * * \f$(-0.5, -0.5, \pm 0.5)\f$
+ *
+ * \param[out] data matrix to store data in
+ */
+void test_3d_classes2(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double R = 0.2;  // radius of cluster
+    int i;
+    const int num_classes = 8;
+    const double centres[][3] = {
+        // centres of each class cluster
+        {.5, .5, .5},    // centre of class 1
+        {.5, .5, -.5},   // centre of class 2
+        {.5, -.5, .5},   // centre of class 3
+        {.5, -.5, -.5},  // centre of class 4
+        {-.5, .5, .5},   // centre of class 5
+        {-.5, .5, -.5},  // centre of class 6
+        {-.5, -.5, .5},  // centre of class 7
+        {-.5, -.5, -.5}  // centre of class 8
+    };
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        // select a random class for the point
+        int cls = std::rand() % num_classes;
+
+        // create random coordinates (x,y,z) around the centre of the class
+        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
+        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
+        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);
+
+        /* The following can also be used
+        for (int j = 0; j < 3; j++)
+            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
+        */
+    }
+}
+
+/** Test that creates a random set of points distributed in eight clusters in
+ * 3D space and trains an SOM that finds the topological pattern. The following
+ * [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) files are created
+ * to validate the execution:
+ * * `test3.csv`: random test sample points in eight 3D clusters
+ * * `w31.csv`: initial random map
+ * * `w32.csv`: trained SOM map
+ */
+void test3() {
+    int j, N = 500;
+    int features = 3;
+    int num_out = 30;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::vector<std::valarray<double>>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::vector<std::valarray<double>>(num_out);
+            for (int k = 0; k < num_out; k++) {
+                W[i][k] = std::valarray<double>(features);
+#ifdef _OPENMP
+#pragma omp for
+#endif
+                for (j = 0; j < features; j++)
+                    // preallocate with random initial weights
+                    W[i][k][j] = _random(-10, 10);
+            }
+        }
+    }
+
+    test_3d_classes2(&X);          // create test data in eight 3D clusters
+    save_2d_data("test3.csv", X);  // save test data points
+    save_u_matrix("w31.csv", W);   // save initial random weights
+    kohonen_som(X, &W, 1e-4);      // train the SOM
+    save_u_matrix("w32.csv", W);   // save the resultant weights
+}
+
+/**
+ * Convert clock cycle difference to time in seconds
+ *
+ * \param[in] start_t start clock
+ * \param[in] end_t end clock
+ * \returns time difference in seconds
+ */
+double get_clock_diff(clock_t start_t, clock_t end_t) {
+    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
+}
+
+/** Main function */
+int main(int argc, char **argv) {
+#ifdef _OPENMP
+    std::cout << "Using OpenMP based parallelization\n";
+#else
+    std::cout << "NOT using OpenMP based parallelization\n";
+#endif
+
+    std::srand(std::time(nullptr));
+
+    std::clock_t start_clk = std::clock();
+    test1();
+    auto end_clk = std::clock();
+    std::cout << "Test 1 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    start_clk = std::clock();
+    test2();
+    end_clk = std::clock();
+    std::cout << "Test 2 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    start_clk = std::clock();
+    test3();
+    end_clk = std::clock();
+    std::cout << "Test 3 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    std::cout
+        << "(Note: Calculated times include: creating test sets, training "
+           "model and writing files to disk.)\n\n";
+    return 0;
+}
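
Review note on step 3b of `update_weights` in the topology file: the neighborhood update scales each node's correction by a Gaussian of its grid distance from the winning node, with `alpha` acting as both the learning rate and the Gaussian width. The following is a minimal sketch of that arithmetic in isolation, not the patch's implementation. The helper names `neighborhood_scale` and `updated_weight` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the patch): Gaussian neighborhood factor
// for a node at grid offset (dx, dy) from the winning node. As in the patch,
// alpha is reused as the width of the Gaussian.
double neighborhood_scale(int dx, int dy, double alpha) {
    assert(alpha > 0.0);  // the factor is undefined for alpha == 0
    const double d2 = static_cast<double>(dx * dx + dy * dy);
    return std::exp(-d2 / (2.0 * alpha * alpha));
}

// Hypothetical helper: one weight component moved towards the sample value x
// by the fraction alpha * scale.
double updated_weight(double w, double x, double alpha, double scale) {
    return w + (x - w) * alpha * scale;
}
```

The winning node itself has offset (0, 0), so its factor is 1 and it moves by the full fraction `alpha`; nodes further away on the grid move progressively less.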
--- a/machine_learning/kohonen_som_trace.cpp
+++ b/machine_learning/kohonen_som_trace.cpp
@@ -0,0 +1,474 @@
+/**
+ * \addtogroup machine_learning Machine Learning Algorithms
+ * @{
+ * \file
+ * \brief [Kohonen self organizing
+ * map](https://en.wikipedia.org/wiki/Self-organizing_map) (data tracing)
+ *
+ * This example implements a powerful self organizing map algorithm.
+ * The algorithm creates a connected network of weights that closely
+ * follows the given data points. This creates a chain of nodes that
+ * resembles the given input shape.
+ *
+ * \author [Krishna Vedala](https://github.com/kvedala)
+ *
+ * \note This C++ version of the program is considerably slower than its [C
+ * counterpart](https://github.com/kvedala/C/blob/master/machine_learning/kohonen_som_trace.c)
+ * \note The compiled code is much slower when compiled with MS Visual C++ 2019
+ * than with GCC on Windows
+ * \see kohonen_som_topology.cpp
+ */
+#define _USE_MATH_DEFINES  // required for MS Visual C++
+#include <algorithm>
+#include <cmath>
+#include <cstdlib>
+#include <ctime>
+#include <fstream>
+#include <iostream>
+#include <valarray>
+#include <vector>
+#ifdef _OPENMP  // check if OpenMP based parallelization is available
+#include <omp.h>
+#endif
+
+/**
+ * Helper function to generate a random number in a given interval.
+ * \n Steps:
+ * 1. `r1 = rand() % 100` gets a random number between 0 and 99
+ * 2. `r2 = r1 / 100` converts random number to be between 0 and 0.99
+ * 3. scale and offset the random number to given range of \f$[a,b)\f$
+ *
+ * \param[in] a lower limit
+ * \param[in] b upper limit
+ * \returns random number in the range \f$[a,b)\f$
+ */
+double _random(double a, double b) {
+    return ((b - a) * (std::rand() % 100) / 100.f) + a;
+}
+
+/**
+ * Save a given n-dimensional data matrix to file.
+ *
+ * \param[in] fname filename to save in (gets overwritten without confirmation)
+ * \param[in] X matrix to save
+ * \returns 0 if all ok
+ * \returns -1 if file creation failed
+ */
+int save_nd_data(const char *fname,
+                 const std::vector<std::valarray<double>> &X) {
+    size_t num_points = X.size();       // number of rows
+    size_t num_features = X[0].size();  // number of columns
+
+    std::ofstream fp;
+    fp.open(fname);
+    if (!fp.is_open()) {
+        // error with opening file to write
+        std::cerr << "Error opening file " << fname << "\n";
+        return -1;
+    }
+
+    // for each point in the array
+    for (size_t i = 0; i < num_points; i++) {
+        // for each feature in the array
+        for (size_t j = 0; j < num_features; j++) {
+            fp << X[i][j];             // print the feature value
+            if (j < num_features - 1)  // if not the last feature
+                fp << ",";             // suffix comma
+        }
+        if (i < num_points - 1)  // if not the last row
+            fp << "\n";          // start a new line
+    }
+
+    fp.close();
+    return 0;
+}
+
+/** \namespace machine_learning
+ * \brief Machine learning algorithms
+ */
+namespace machine_learning {
+
+/**
+ * Update weights of the SOM using Kohonen algorithm
+ *
+ * \param[in] x data point
+ * \param[in,out] W weights matrix
+ * \param[in,out] D temporary vector to store distances
+ * \param[in] alpha learning rate \f$0<\alpha\le1\f$
+ * \param[in] R neighborhood range
+ */
+void update_weights(const std::valarray<double> &x,
+                    std::vector<std::valarray<double>> *W,
+                    std::valarray<double> *D, double alpha, int R) {
+    int j, k;
+    int num_out = W->size();      // number of SOM output nodes
+    int num_features = x.size();  // number of data features
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    // step 1: for each output point
+    for (j = 0; j < num_out; j++) {
+        // compute squared Euclidean distance of each output
+        // point from the current sample
+        (*D)[j] = (((*W)[j] - x) * ((*W)[j] - x)).sum();
+    }
+
+    // step 2:  get closest node i.e., node with smallest Euclidean distance to
+    // the current pattern
+    auto result = std::min_element(std::begin(*D), std::end(*D));
+    double d_min = *result;
+    int d_min_idx = std::distance(std::begin(*D), result);
+
+    // step 3a: get the neighborhood range
+    int from_node = std::max(0, d_min_idx - R);
+    int to_node = std::min(num_out, d_min_idx + R + 1);
+
+    // step 3b: update the weights of nodes in the
+    // neighborhood
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (j = from_node; j < to_node; j++)
+        // update weights of nodes in the neighborhood
+        (*W)[j] += alpha * (x - (*W)[j]);
+}
+
+/**
+ * Apply incremental algorithm with updating neighborhood and learning rates
+ * on all samples in the given dataset.
+ *
+ * \param[in] X data set
+ * \param[in,out] W weights matrix
+ * \param[in] alpha_min terminal value of alpha
+ */
+void kohonen_som_tracer(const std::vector<std::valarray<double>> &X,
+                        std::vector<std::valarray<double>> *W,
+                        double alpha_min) {
+    int num_samples = X.size();      // number of rows
+    int num_features = X[0].size();  // number of columns
+    int num_out = W->size();         // number of SOM output nodes
+    int R = num_out >> 2, iter = 0;
+    double alpha = 1.f;
+
+    std::valarray<double> D(num_out);
+
+    // Loop alpha from 1 down to alpha_min
+    for (; alpha > alpha_min; alpha -= 0.01, iter++) {
+        // Loop for each sample pattern in the data set
+        for (int sample = 0; sample < num_samples; sample++) {
+            // update weights for the current input pattern sample
+            update_weights(X[sample], W, &D, alpha, R);
+        }
+
+        // every 10th iteration, reduce the neighborhood range
+        if (iter % 10 == 0 && R > 1)
+            R--;
+    }
+}
+
+}  // namespace machine_learning
+
+/** @} */
+
+using machine_learning::kohonen_som_tracer;
+
+/** Creates a random set of points distributed *near* the circumference
+ * of a circle. With \f$R = 0.75\f$ and \f$\delta r = 0.3\f$, the
+ * generating function is
+ * \f{eqnarray*}{
+ * r &\in& [R-\delta r, R+\delta r)\\
+ * \theta &\in& [0, 2\pi)\\
+ * x &=& r\cos\theta\\
+ * y &=& r\sin\theta
+ * \f}
+ *
+ * \param[out] data matrix to store data in
+ */
+void test_circle(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double R = 0.75, dr = 0.3;
+    double a_t = 0., b_t = 2.f * M_PI;  // theta random between 0 and 2*pi
+    double a_r = R - dr, b_r = R + dr;  // radius random between R-dr and R+dr
+    int i;
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        double r = _random(a_r, b_r);      // random radius
+        double theta = _random(a_t, b_t);  // random theta
+        data[0][i][0] = r * std::cos(theta);  // convert from polar to cartesian
+        data[0][i][1] = r * std::sin(theta);
+    }
+}
+
+/** Test that creates a random set of points distributed *near* the
+ * circumference of a circle and trains an SOM that finds that circular pattern.
+ * The following [CSV](https://en.wikipedia.org/wiki/Comma-separated_values)
+ * files are created to validate the execution:
+ * * `test1.csv`: random test sample points with a circular pattern
+ * * `w11.csv`: initial random map
+ * * `w12.csv`: trained SOM map
+ *
+ * The outputs can be readily plotted in [gnuplot](https://gnuplot.info) using
+ * the following snippet
+ * ```gnuplot
+ * set datafile separator ','
+ * plot "test1.csv" title "original", \
+ *      "w11.csv" title "w1", \
+ *      "w12.csv" title "w2"
+ * ```
+ * ![Sample execution
+ * output](https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/docs/images/machine_learning/kohonen/test1.svg)
+ */
+void test1() {
+    int j, N = 500;
+    int features = 2;
+    int num_out = 50;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::valarray<double>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::valarray<double>(features);
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+            for (j = 0; j < features; j++)
+                // preallocate with random initial weights
+                W[i][j] = _random(-1, 1);
+        }
+    }
+
+    test_circle(&X);  // create test data around circumference of a circle
+    save_nd_data("test1.csv", X);    // save test data points
+    save_nd_data("w11.csv", W);      // save initial random weights
+    kohonen_som_tracer(X, &W, 0.1);  // train the SOM
+    save_nd_data("w12.csv", W);      // save the resultant weights
+}
+
+/** Creates a random set of points distributed *near* the locus
+ * of the [Lemniscate of
+ * Gerono](https://en.wikipedia.org/wiki/Lemniscate_of_Gerono).
+ * \f{eqnarray*}{
+ * \delta r &=& 0.2\\
+ * \delta x &\in& [-\delta r, \delta r)\\
+ * \delta y &\in& [-\delta r, \delta r)\\
+ * \theta &\in& [0, \pi)\\
+ * x &=& \delta x + \cos\theta\\
+ * y &=& \delta y + \frac{\sin(2\theta)}{2}
+ * \f}
+ * \param[out] data matrix to store data in
+ */
+void test_lamniscate(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double dr = 0.2;
+    int i;
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        double dx = _random(-dr, dr);     // random change in x
+        double dy = _random(-dr, dr);     // random change in y
+        double theta = _random(0, M_PI);  // random theta
+        data[0][i][0] = dx + std::cos(theta);  // parametric x coordinate
+        data[0][i][1] = dy + std::sin(2. * theta) / 2.f;
+    }
+}
+
+/** Test that creates a random set of points distributed *near* the locus
+ * of the [Lemniscate of
+ * Gerono](https://en.wikipedia.org/wiki/Lemniscate_of_Gerono) and trains an SOM
+ * that finds that pattern. The following
+ * [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) files are created
+ * to validate the execution:
+ * * `test2.csv`: random test sample points with a lemniscate pattern
+ * * `w21.csv`: initial random map
+ * * `w22.csv`: trained SOM map
+ *
+ * The outputs can be readily plotted in [gnuplot](https://gnuplot.info) using
+ * the following snippet
+ * ```gnuplot
+ * set datafile separator ','
+ * plot "test2.csv" title "original", \
+ *      "w21.csv" title "w1", \
+ *      "w22.csv" title "w2"
+ * ```
+ * ![Sample execution
+ * output](https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/docs/images/machine_learning/kohonen/test2.svg)
+ */
+void test2() {
+    int j, N = 500;
+    int features = 2;
+    int num_out = 20;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::valarray<double>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::valarray<double>(features);
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+            for (j = 0; j < features; j++)
+                // preallocate with random initial weights
+                W[i][j] = _random(-1, 1);
+        }
+    }
+
+    test_lamniscate(&X);              // create test data around the lemniscate
+    save_nd_data("test2.csv", X);     // save test data points
+    save_nd_data("w21.csv", W);       // save initial random weights
+    kohonen_som_tracer(X, &W, 0.01);  // train the SOM
+    save_nd_data("w22.csv", W);       // save the resultant weights
+}
+
+/** Creates a random set of points distributed in eight clusters in
+ * 3D space with centroids at the points
+ * * \f$(0.5, 0.5, 0.5)\f$
+ * * \f$(0.5, 0.5, -0.5)\f$
+ * * \f$(0.5, -0.5, 0.5)\f$
+ * * \f$(0.5, -0.5, -0.5)\f$
+ * * \f$(-0.5, 0.5, 0.5)\f$
+ * * \f$(-0.5, 0.5, -0.5)\f$
+ * * \f$(-0.5, -0.5, 0.5)\f$
+ * * \f$(-0.5, -0.5, -0.5)\f$
+ *
+ * \param[out] data matrix to store data in
+ */
+void test_3d_classes(std::vector<std::valarray<double>> *data) {
+    const int N = data->size();
+    const double R = 0.1;  // radius of cluster
+    int i;
+    const int num_classes = 8;
+    const double centres[][3] = {
+        // centres of each class cluster
+        {.5, .5, .5},    // centre of class 0
+        {.5, .5, -.5},   // centre of class 1
+        {.5, -.5, .5},   // centre of class 2
+        {.5, -.5, -.5},  // centre of class 3
+        {-.5, .5, .5},   // centre of class 4
+        {-.5, .5, -.5},  // centre of class 5
+        {-.5, -.5, .5},  // centre of class 6
+        {-.5, -.5, -.5}  // centre of class 7
+    };
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+    for (i = 0; i < N; i++) {
+        int cls =
+            std::rand() % num_classes;  // select a random class for the point
+
+        // create random coordinates (x,y,z) around the centre of the class
+        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
+        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
+        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);
+
+        /* The following can also be used
+        for (int j = 0; j < 3; j++)
+            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
+        */
+    }
+}
+
+/** Test that creates a random set of points distributed in eight clusters in
+ * 3D space. The following
+ * [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) files are created
+ * to validate the execution:
+ * * `test3.csv`: random test sample points in eight 3D clusters
+ * * `w31.csv`: initial random map
+ * * `w32.csv`: trained SOM map
+ *
+ * The outputs can be readily plotted in [gnuplot](https://gnuplot.info) using
+ * the following snippet
+ * ```gnuplot
+ * set datafile separator ','
+ * plot "test3.csv" title "original", \
+ *      "w31.csv" title "w1", \
+ *      "w32.csv" title "w2"
+ * ```
+ * ![Sample execution
+ * output](https://raw.githubusercontent.com/TheAlgorithms/C-Plus-Plus/docs/images/machine_learning/kohonen/test3.svg)
+ */
+void test3() {
+    int j, N = 200;
+    int features = 3;
+    int num_out = 20;
+    std::vector<std::valarray<double>> X(N);
+    std::vector<std::valarray<double>> W(num_out);
+    for (int i = 0; i < std::max(num_out, N); i++) {
+        // loop till max(N, num_out)
+        if (i < N)  // only add new arrays if i < N
+            X[i] = std::valarray<double>(features);
+        if (i < num_out) {  // only add new arrays if i < num_out
+            W[i] = std::valarray<double>(features);
+
+#ifdef _OPENMP
+#pragma omp for
+#endif
+            for (j = 0; j < features; j++)
+                // preallocate with random initial weights
+                W[i][j] = _random(-1, 1);
+        }
+    }
+
+    test_3d_classes(&X);              // create test data in eight 3D clusters
+    save_nd_data("test3.csv", X);     // save test data points
+    save_nd_data("w31.csv", W);       // save initial random weights
+    kohonen_som_tracer(X, &W, 0.01);  // train the SOM
+    save_nd_data("w32.csv", W);       // save the resultant weights
+}
+
+/**
+ * Convert clock cycle difference to time in seconds
+ *
+ * \param[in] start_t start clock
+ * \param[in] end_t end clock
+ * \returns time difference in seconds
+ */
+double get_clock_diff(clock_t start_t, clock_t end_t) {
+    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
+}
+
+/** Main function */
+int main(int argc, char **argv) {
+#ifdef _OPENMP
+    std::cout << "Using OpenMP based parallelization\n";
+#else
+    std::cout << "NOT using OpenMP based parallelization\n";
+#endif
+
+    std::srand(std::time(nullptr));
+
+    std::clock_t start_clk = std::clock();
+    test1();
+    auto end_clk = std::clock();
+    std::cout << "Test 1 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    start_clk = std::clock();
+    test2();
+    end_clk = std::clock();
+    std::cout << "Test 2 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    start_clk = std::clock();
+    test3();
+    end_clk = std::clock();
+    std::cout << "Test 3 completed in " << get_clock_diff(start_clk, end_clk)
+              << " sec\n";
+
+    std::cout
+        << "(Note: Calculated times include: creating test sets, training "
+           "model and writing files to disk.)\n\n";
+    return 0;
+}
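
Review note on `update_weights` in the trace file: each training step finds the node closest to the sample, clamps a window of `R` neighbours on either side to the ends of the chain, and pulls every node in that window towards the sample by the fraction `alpha`. The sketch below restates that step as two small functions so it can be exercised on its own. The names `best_matching_node` and `trace_step` are hypothetical, and intermediate `std::valarray` values are materialized explicitly, which is a portability choice of this sketch and not something the patch does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <valarray>
#include <vector>

// Hypothetical helper (not in the patch): index of the node whose weights are
// closest to sample x, by squared Euclidean distance.
int best_matching_node(const std::vector<std::valarray<double>> &W,
                       const std::valarray<double> &x) {
    std::vector<double> dist(W.size());
    for (std::size_t j = 0; j < W.size(); j++) {
        std::valarray<double> diff = W[j] - x;
        diff *= diff;  // element-wise square
        dist[j] = diff.sum();
    }
    auto it = std::min_element(dist.begin(), dist.end());
    return static_cast<int>(std::distance(dist.begin(), it));
}

// Hypothetical helper: one training step on a chain of nodes. Nodes within R
// positions of the winner move towards x by the fraction alpha.
// Returns the index of the winning node.
int trace_step(std::vector<std::valarray<double>> *W,
               const std::valarray<double> &x, double alpha, int R) {
    assert(W != nullptr);
    const int num_out = static_cast<int>(W->size());
    const int winner = best_matching_node(*W, x);
    const int from_node = std::max(0, winner - R);            // clamp at start
    const int to_node = std::min(num_out, winner + R + 1);    // clamp at end
    for (int j = from_node; j < to_node; j++) {
        std::valarray<double> delta = x - (*W)[j];
        delta *= alpha;
        (*W)[j] += delta;
    }
    return winner;
}
```

With `R = 0` only the winner moves. With `alpha = 1` every node in the window lands on the sample, which is why the training loop shrinks both values over time.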