Algorithms_in_C++  1.0.0
Set of algorithms implemented in C++.
kohonen_som_trace.cpp File Reference

Kohonen self organizing map (data tracing)

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <valarray>
#include <vector>

Namespaces

 machine_learning
 Machine learning algorithms.
 

Functions

double _random (double a, double b)
 
int save_nd_data (const char *fname, const std::vector< std::valarray< double >> &X)
 
void machine_learning::update_weights (const std::valarray< double > &x, std::vector< std::valarray< double >> *W, std::valarray< double > *D, double alpha, int R)
 
void machine_learning::kohonen_som_tracer (const std::vector< std::valarray< double >> &X, std::vector< std::valarray< double >> *W, double alpha_min)
 
void test_circle (std::vector< std::valarray< double >> *data)
 
void test1 ()
 
void test_lamniscate (std::vector< std::valarray< double >> *data)
 
void test2 ()
 
void test_3d_classes (std::vector< std::valarray< double >> *data)
 
void test3 ()
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 
int main (int argc, char **argv)
 

Detailed Description

Kohonen self organizing map (data tracing)

This example implements a self organizing map (SOM) algorithm that creates a connected network of weights closely following the given data points. The result is a chain of nodes that resembles the shape of the input data.
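
The idea described above can be illustrated with a single update step for one sample. This is a minimal sketch, assuming a 1D chain of nodes, squared Euclidean distance and a fixed neighbourhood radius. The name `sketch_update_chain` and its parameters are hypothetical and are not claimed to match `machine_learning::update_weights` line for line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <valarray>
#include <vector>

// Hypothetical helper (not part of kohonen_som_trace.cpp): performs one
// update of a chain of nodes W for a single sample x and returns the index
// of the node that was closest to x before the update.
std::size_t sketch_update_chain(const std::valarray<double> &x,
                                std::vector<std::valarray<double>> *W,
                                double alpha, std::size_t R) {
    assert(W != nullptr && !W->empty());

    // step 1: find the node with the smallest squared Euclidean distance
    std::size_t best = 0;
    double best_d = -1.0;
    for (std::size_t j = 0; j < W->size(); j++) {
        const std::valarray<double> diff = (*W)[j] - x;
        const double d = (diff * diff).sum();
        if (best_d < 0.0 || d < best_d) {
            best_d = d;
            best = j;
        }
    }

    // step 2: neighbourhood [best - R, best + R], clipped to the chain ends
    const std::size_t from = best > R ? best - R : 0;
    const std::size_t to = std::min(W->size(), best + R + 1);

    // step 3: pull every node in the neighbourhood towards the sample
    for (std::size_t j = from; j < to; j++) {
        (*W)[j] += alpha * (x - (*W)[j]);
    }
    return best;
}
```

Because neighbouring nodes in the chain are moved together, repeating this step over many samples tends to lay the chain out along the shape of the data.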

Author
Krishna Vedala
Note
This C++ version of the program is considerably slower than its C counterpart
The compiled code is much slower when compiled with MS Visual C++ 2019 than with GCC on Windows
See also
kohonen_som_topology.cpp

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds

Parameters
[in] start_t  start clock
[in] end_t  end clock
Returns
time difference in seconds
{
    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
}
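
`std::clock` reports processor time used by the program, so the value returned above may differ from elapsed wall-clock time, for example when several OpenMP threads are running. As a point of comparison, the following is a minimal sketch of a wall-clock equivalent based on `std::chrono::steady_clock`. The name `sketch_wall_diff` is hypothetical and does not exist in this file.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of kohonen_som_trace.cpp): difference in
// seconds between two steady_clock time points.
double sketch_wall_diff(std::chrono::steady_clock::time_point start_t,
                        std::chrono::steady_clock::time_point end_t) {
    assert(end_t >= start_t);  // steady_clock never goes backwards
    return std::chrono::duration<double>(end_t - start_t).count();
}
```

Usage would mirror `get_clock_diff`: take `std::chrono::steady_clock::now()` before and after the code being timed and pass both values to the helper.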

◆ main()

int main ( int  argc,
char **  argv 
)

Main function

{
#ifdef _OPENMP
    std::cout << "Using OpenMP based parallelization\n";
#else
    std::cout << "NOT using OpenMP based parallelization\n";
#endif

    std::srand(std::time(nullptr));

    std::clock_t start_clk = std::clock();
    test1();
    auto end_clk = std::clock();
    std::cout << "Test 1 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test2();
    end_clk = std::clock();
    std::cout << "Test 2 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test3();
    end_clk = std::clock();
    std::cout << "Test 3 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    std::cout
        << "(Note: Calculated times include: creating test sets, training "
           "model and writing files to disk.)\n\n";
    return 0;
}

◆ test1()

void test1 ( )

Test that creates a random set of points distributed near the circumference of a circle and trains an SOM that finds that circular pattern. The following CSV files are created to validate the execution:

  • test1.csv: random test sample points with a circular pattern
  • w11.csv: initial random map
  • w12.csv: trained SOM map
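
The CSV layout these files are expected to have (one sample per line, features separated by commas) can be sketched as follows. This is a minimal sketch under that assumption. `sketch_write_csv` and `sketch_csv_string` are hypothetical names and are not claimed to reproduce `save_nd_data`, which writes to a named file.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <valarray>
#include <vector>

// Hypothetical helper: writes one sample per line with comma separated
// features to any output stream.
void sketch_write_csv(std::ostream &out,
                      const std::vector<std::valarray<double>> &X) {
    for (const auto &row : X) {
        // every sample is assumed to have the same number of features
        assert(row.size() == X.front().size());
        for (std::size_t j = 0; j < row.size(); j++) {
            if (j > 0) {
                out << ',';
            }
            out << row[j];
        }
        out << '\n';
    }
}

// Hypothetical convenience wrapper returning the CSV text as a string.
std::string sketch_csv_string(const std::vector<std::valarray<double>> &X) {
    std::ostringstream buf;
    sketch_write_csv(buf, X);
    return buf.str();
}
```

Writing to a `std::ostream` rather than a file name keeps the sketch easy to check in memory; a `std::ofstream` can be passed in the same way.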

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test1.csv" title "original", \
"w11.csv" title "w1", \
"w12.csv" title "w2"


{
    int j, N = 500;
    int features = 2;
    int num_out = 50;
    std::vector<std::valarray<double>> X(N);        // test samples
    std::vector<std::valarray<double>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::valarray<double>(features);

#ifdef _OPENMP
#pragma omp for
#endif
            for (j = 0; j < features; j++)
                // preallocate with random initial weights
                W[i][j] = _random(-1, 1);
        }
    }

    test_circle(&X);                 // create test data around circumference of a circle
    save_nd_data("test1.csv", X);    // save test data points
    save_nd_data("w11.csv", W);      // save initial random weights
    kohonen_som_tracer(X, &W, 0.1);  // train the SOM
    save_nd_data("w12.csv", W);      // save the resultant weights
}

◆ test2()

void test2 ( )

Test that creates a random set of points distributed near the locus of the Lemniscate of Gerono and trains an SOM that finds that lemniscate pattern. The following CSV files are created to validate the execution:

  • test2.csv: random test sample points with a lemniscate pattern
  • w21.csv: initial random map
  • w22.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test2.csv" title "original", \
"w21.csv" title "w1", \
"w22.csv" title "w2"


{
    int j, N = 500;
    int features = 2;
    int num_out = 20;
    std::vector<std::valarray<double>> X(N);        // test samples
    std::vector<std::valarray<double>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::valarray<double>(features);

#ifdef _OPENMP
#pragma omp for
#endif
            for (j = 0; j < features; j++)
                // preallocate with random initial weights
                W[i][j] = _random(-1, 1);
        }
    }

    test_lamniscate(&X);              // create test data around the lemniscate
    save_nd_data("test2.csv", X);     // save test data points
    save_nd_data("w21.csv", W);       // save initial random weights
    kohonen_som_tracer(X, &W, 0.01);  // train the SOM
    save_nd_data("w22.csv", W);       // save the resultant weights
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in eight clusters in 3D space. The following CSV files are created to validate the execution:

  • test3.csv: random test sample points distributed in eight clusters
  • w31.csv: initial random map
  • w32.csv: trained SOM map

The outputs can be readily plotted in gnuplot using the following snippet

set datafile separator ','
plot "test3.csv" title "original", \
"w31.csv" title "w1", \
"w32.csv" title "w2"


{
    int j, N = 200;
    int features = 3;
    int num_out = 20;
    std::vector<std::valarray<double>> X(N);        // test samples
    std::vector<std::valarray<double>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::valarray<double>(features);

#ifdef _OPENMP
#pragma omp for
#endif
            for (j = 0; j < features; j++)
                // preallocate with random initial weights
                W[i][j] = _random(-1, 1);
        }
    }

    test_3d_classes(&X);              // create test data in 3D clusters
    save_nd_data("test3.csv", X);     // save test data points
    save_nd_data("w31.csv", W);       // save initial random weights
    kohonen_som_tracer(X, &W, 0.01);  // train the SOM
    save_nd_data("w32.csv", W);       // save the resultant weights
}

◆ test_3d_classes()

void test_3d_classes ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in eight clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, 0.5, -0.5)\)
  • \((0.5, -0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, 0.5, -0.5)\)
  • \((-0.5, -0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.1;  // radius of cluster
    int i;
    const int num_classes = 8;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 0
        {.5, .5, -.5},   // centre of class 1
        {.5, -.5, .5},   // centre of class 2
        {.5, -.5, -.5},  // centre of class 3
        {-.5, .5, .5},   // centre of class 4
        {-.5, .5, -.5},  // centre of class 5
        {-.5, -.5, .5},  // centre of class 6
        {-.5, -.5, -.5}  // centre of class 7
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        int cls =
            std::rand() % num_classes;  // select a random class for the point

        // create random coordinates (x,y,z) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
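
The eight centres in the table above are all sign combinations of 0.5 along the three axes, ordered so that the class index read as a 3-bit number selects the signs. The following is a minimal sketch of that observation. `sketch_centroid` is a hypothetical helper for illustration and is not used by `test_3d_classes`, which stores the centres in an explicit table.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: centre of class cls (0..7), derived from the bits of
// the class index instead of a lookup table.
std::array<double, 3> sketch_centroid(int cls) {
    assert(cls >= 0 && cls < 8);
    return {{
        (cls & 4) ? -0.5 : 0.5,  // bit 2 selects the sign of x
        (cls & 2) ? -0.5 : 0.5,  // bit 1 selects the sign of y
        (cls & 1) ? -0.5 : 0.5   // bit 0 selects the sign of z
    }};
}
```

An explicit table, as used in the function above, is easier to read and to change. The bit-based form only shows why there are exactly eight classes.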

◆ test_circle()

void test_circle ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed near the circumference of a circle. The generating function is

\begin{eqnarray*} R &=& 0.75\\ \delta r &=& 0.3\\ r &\in& [R-\delta r, R+\delta r)\\ \theta &\in& [0, 2\pi)\\ x &=& r\cos\theta\\ y &=& r\sin\theta \end{eqnarray*}

Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.75, dr = 0.3;
    double a_t = 0., b_t = 2.f * M_PI;  // theta random between 0 and 2*pi
    double a_r = R - dr, b_r = R + dr;  // radius random between R-dr and R+dr
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        double r = _random(a_r, b_r);      // random radius
        double theta = _random(a_t, b_t);  // random theta
        data[0][i][0] = r * cos(theta);    // convert from polar to cartesian
        data[0][i][1] = r * sin(theta);
    }
}
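
For comparison, the same generating function can be written with the `<random>` header instead of the `_random` helper. This is a minimal sketch, assuming a seeded `std::mt19937` engine and uniform distributions for the radius and the angle. `sketch_circle_points` and its parameters are hypothetical names and are not part of this file.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <valarray>
#include <vector>

// Hypothetical helper: returns N points (x, y) with radius drawn uniformly
// from [R - dr, R + dr) and angle drawn uniformly from [0, 2*pi).
std::vector<std::valarray<double>> sketch_circle_points(std::size_t N,
                                                        double R, double dr,
                                                        unsigned seed) {
    assert(dr > 0.0 && R > dr);
    const double two_pi = 2.0 * std::acos(-1.0);  // avoids relying on M_PI

    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> radius(R - dr, R + dr);
    std::uniform_real_distribution<double> angle(0.0, two_pi);

    std::vector<std::valarray<double>> data(N, std::valarray<double>(2));
    for (auto &p : data) {
        const double r = radius(gen);
        const double theta = angle(gen);
        p[0] = r * std::cos(theta);  // convert from polar to cartesian
        p[1] = r * std::sin(theta);
    }
    return data;
}
```

Calling `sketch_circle_points(500, 0.75, 0.3, seed)` would produce a data set with the same size and ring dimensions as the one used in `test1`. A fixed seed makes the output reproducible between runs.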

◆ test_lamniscate()

void test_lamniscate ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed near the locus of the Lemniscate of Gerono.

\begin{eqnarray*} \delta r &=& 0.2\\ \delta x &\in& [-\delta r, \delta r)\\ \delta y &\in& [-\delta r, \delta r)\\ \theta &\in& [0, \pi)\\ x &=& \delta x + \cos\theta\\ y &=& \delta y + \frac{\sin(2\theta)}{2} \end{eqnarray*}

Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double dr = 0.2;
    int i;

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        double dx = _random(-dr, dr);     // random change in x
        double dy = _random(-dr, dr);     // random change in y
        double theta = _random(0, M_PI);  // random theta
        data[0][i][0] = dx + cos(theta);  // convert from polar to cartesian
        data[0][i][1] = dy + sin(2. * theta) / 2.f;
    }
}