Algorithms_in_C++  1.0.0
Set of algorithms implemented in C++.
kohonen_som_topology.cpp File Reference

Kohonen self organizing map (topological map)

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <valarray>
#include <vector>

Namespaces

 machine_learning
 Machine learning algorithms.
 

Macros

#define _USE_MATH_DEFINES
 
#define MIN_DISTANCE   1e-4
 Minimum average distance of image nodes.
 

Functions

double _random (double a, double b)
 
int save_2d_data (const char *fname, const std::vector< std::valarray< double >> &X)
 
void get_min_2d (const std::vector< std::valarray< double >> &X, double *val, int *x_idx, int *y_idx)
 
int machine_learning::save_u_matrix (const char *fname, const std::vector< std::vector< std::valarray< double >>> &W)
 
double machine_learning::update_weights (const std::valarray< double > &X, std::vector< std::vector< std::valarray< double >>> *W, std::vector< std::valarray< double >> *D, double alpha, int R)
 
void machine_learning::kohonen_som (const std::vector< std::valarray< double >> &X, std::vector< std::vector< std::valarray< double >>> *W, double alpha_min)
 
void test_2d_classes (std::vector< std::valarray< double >> *data)
 
void test1 ()
 
void test_3d_classes1 (std::vector< std::valarray< double >> *data)
 
void test2 ()
 
void test_3d_classes2 (std::vector< std::valarray< double >> *data)
 
void test3 ()
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 
int main (int argc, char **argv)
 

Detailed Description

Kohonen self organizing map (topological map)

Author
Krishna Vedala

This example implements a powerful unsupervised learning algorithm called a self-organizing map. The algorithm creates a connected network of weights that closely follows the given data points. This creates a topological map of the given data, i.e., it preserves the relationships between data points in a much higher-dimensional space by creating an equivalent in a 2-dimensional space.

[Figure: trained topological maps for the test cases in the program]

Note
This C++ version of the program is considerably slower than its C counterpart.
The compiled code is much slower when compiled with MS Visual C++ 2019 than with GCC on Windows.
See also
kohonen_som_trace.cpp
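
To make the description above concrete, the following is a minimal sketch of a single training step on a 2D grid of weight vectors. It is not the implementation documented on this page: the name som_step, the Grid alias and the uniform square neighbourhood are assumptions made for illustration (a real implementation may scale the update with distance from the winning node). The grid is assumed to be non-empty and rectangular.

```cpp
#include <algorithm>
#include <valarray>
#include <vector>

// Assumed alias for a 2D grid of weight vectors (not part of the file).
using Grid = std::vector<std::vector<std::valarray<double>>>;

// Hypothetical helper: finds the node closest to sample x, then moves
// every node within R grid cells of it towards x by a factor alpha.
// Returns the squared distance from x to the closest node before the
// update. Assumes W is non-empty and rectangular.
double som_step(const std::valarray<double> &x, Grid *W, double alpha,
                int R) {
    const int rows = static_cast<int>(W->size());
    const int cols = static_cast<int>((*W)[0].size());

    // 1. search for the best matching node
    int best_r = 0, best_c = 0;
    double best_d = -1.0;
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            std::valarray<double> diff = (*W)[r][c] - x;
            std::valarray<double> sq = diff * diff;
            const double d = sq.sum();
            if (best_d < 0.0 || d < best_d) {
                best_d = d;
                best_r = r;
                best_c = c;
            }
        }
    }

    // 2. pull the square neighbourhood (clipped at the grid edges)
    //    towards the sample
    const int r0 = std::max(0, best_r - R);
    const int r1 = std::min(rows - 1, best_r + R);
    const int c0 = std::max(0, best_c - R);
    const int c1 = std::min(cols - 1, best_c + R);
    for (int r = r0; r <= r1; r++) {
        for (int c = c0; c <= c1; c++) {
            std::valarray<double> delta = x - (*W)[r][c];
            delta *= alpha;
            (*W)[r][c] += delta;
        }
    }
    return best_d;
}
```

Repeating such a step over all samples while shrinking alpha and R is the usual way a map of this kind is made to settle onto the data.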

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds

Parameters
[in] start_t  start clock
[in] end_t    end clock
Returns
time difference in seconds

{
    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
}

◆ main()

int main ( int  argc,
char **  argv 
)

Main function

{
#ifdef _OPENMP
    std::cout << "Using OpenMP based parallelization\n";
#else
    std::cout << "NOT using OpenMP based parallelization\n";
#endif

    std::srand(std::time(nullptr));

    std::clock_t start_clk = std::clock();
    test1();
    auto end_clk = std::clock();
    std::cout << "Test 1 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test2();
    end_clk = std::clock();
    std::cout << "Test 2 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test3();
    end_clk = std::clock();
    std::cout << "Test 3 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    std::cout
        << "(Note: Calculated times include: creating test sets, training "
           "model and writing files to disk.)\n\n";
    return 0;
}

◆ test1()

void test1 ( )

Test that creates a random set of points distributed in four clusters around the circumference of a circle and trains an SOM that finds that circular pattern. The following CSV files are created to validate the execution:

  • test1.csv: random test sample points with a circular pattern
  • w11.csv: initial random map
  • w12.csv: trained SOM map
{
    int j, N = 300;
    int features = 2;
    int num_out = 30;
    // data matrix (N samples) and weight grid (num_out rows)
    std::vector<std::valarray<double>> X(N);
    std::vector<std::vector<std::valarray<double>>> W(num_out);
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_2d_classes(&X);  // create test data around circumference of a circle
    save_2d_data("test1.csv", X);  // save test data points
    save_u_matrix("w11.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w12.csv", W);   // save the resultant weights
}
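
The allocation loop in the listing above builds a num_out × num_out grid of weight vectors and fills it with random values. As an illustration only, the same shape can be produced with nested container constructors. The helper names random_between and make_random_grid are hypothetical, and random_between is merely a stand-in for the file's _random, whose body is not shown on this page.

```cpp
#include <cstdlib>
#include <valarray>
#include <vector>

// Assumed alias for a 2D grid of weight vectors (not part of the file).
using Grid = std::vector<std::vector<std::valarray<double>>>;

// Hypothetical stand-in for _random(a, b): a value in [a, b] from std::rand.
double random_between(double a, double b) {
    return a + (b - a) * (static_cast<double>(std::rand()) / RAND_MAX);
}

// Hypothetical helper: allocate a num_out x num_out grid whose nodes each
// hold `features` weights drawn from [lo, hi].
Grid make_random_grid(int num_out, int features, double lo, double hi) {
    Grid W(num_out, std::vector<std::valarray<double>>(
                        num_out, std::valarray<double>(features)));
    for (auto &row : W)
        for (auto &node : row)
            for (auto &w : node) w = random_between(lo, hi);
    return W;
}
```

With such a helper, the setup for this test would reduce to a call like make_random_grid(30, 2, -10, 10).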

◆ test2()

void test2 ( )

Test that creates a random set of points distributed in four clusters in 3D space and trains an SOM that finds the topological pattern. The following CSV files are created to validate the execution:

  • test2.csv: random test sample points in four 3D clusters
  • w21.csv: initial random map
  • w22.csv: trained SOM map
{
    int j, N = 300;
    int features = 3;
    int num_out = 30;
    // data matrix (N samples) and weight grid (num_out rows)
    std::vector<std::valarray<double>> X(N);
    std::vector<std::vector<std::valarray<double>>> W(num_out);
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_3d_classes1(&X);  // create test data in four 3D clusters
    save_2d_data("test2.csv", X);  // save test data points
    save_u_matrix("w21.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w22.csv", W);   // save the resultant weights
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in eight clusters in 3D space and trains an SOM that finds the topological pattern. The following CSV files are created to validate the execution:

  • test3.csv: random test sample points in eight 3D clusters
  • w31.csv: initial random map
  • w32.csv: trained SOM map
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;
    // data matrix (N samples) and weight grid (num_out rows)
    std::vector<std::valarray<double>> X(N);
    std::vector<std::vector<std::valarray<double>>> W(num_out);
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_3d_classes2(&X);  // create test data in eight 3D clusters
    save_2d_data("test3.csv", X);  // save test data points
    save_u_matrix("w31.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w32.csv", W);   // save the resultant weights
}

◆ test_2d_classes()

void test_2d_classes ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in four clusters in 2D space with centroids at the points

  • \((0.5, 0.5)\)
  • \((0.5, -0.5)\)
  • \((-0.5, 0.5)\)
  • \((-0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][2] = {
        // centres of each class cluster
        {.5, .5},   // centre of class 1
        {.5, -.5},  // centre of class 2
        {-.5, .5},  // centre of class 3
        {-.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);

        /* The following can also be used
        for (int j = 0; j < 2; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
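
The listing above draws both the class and the coordinates from std::rand, whose thread safety is implementation-defined. As a hedged alternative, the sketch below generates the same kind of four-cluster data with a locally owned std::mt19937 engine, which also makes runs reproducible for a fixed seed. The function name fill_2d_clusters and its seed parameter are assumptions for illustration and do not exist in this file.

```cpp
#include <random>
#include <valarray>
#include <vector>

// Hypothetical variant of the generator: fills every entry of `data` with
// a 2D point drawn uniformly from a square of half-width R around one of
// four randomly chosen centres.
void fill_2d_clusters(std::vector<std::valarray<double>> *data, double R,
                      unsigned seed) {
    static const double centres[4][2] = {
        {.5, .5}, {.5, -.5}, {-.5, .5}, {-.5, -.5}};

    std::mt19937 gen(seed);  // engine owned by this call, not shared
    std::uniform_int_distribution<int> pick(0, 3);
    std::uniform_real_distribution<double> offset(-R, R);

    for (auto &point : *data) {
        const int cls = pick(gen);  // select a random class for the point
        point.resize(2);
        point[0] = centres[cls][0] + offset(gen);
        point[1] = centres[cls][1] + offset(gen);
    }
}
```

A caller would size the vector first, for example to 300 entries, and then pass its address together with R = 0.3 and any seed.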

◆ test_3d_classes1()

void test_3d_classes1 ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in four clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y,z) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
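
One pitfall worth checking in a centre table like the one for class 4 is a missing comma between two initializers. The row still compiles, but the two values are subtracted and the last element is value-initialized to zero, so the centre silently moves. The sketch below shows the effect with std::array; the variable names are illustrative only.

```cpp
#include <array>

// Comma missing between the last two values: this is a two-element
// initializer {-.5, (-.5 - .5)}, so the row becomes (-0.5, -1.0, 0.0).
constexpr std::array<double, 3> row_missing_comma = {-.5, -.5 - .5};

// Intended row: three separate initializers give (-0.5, -0.5, -0.5).
constexpr std::array<double, 3> row_intended = {-.5, -.5, -.5};
```

Because neither form produces a compiler error, comparing the table against the documented centroids is the only reliable check.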

◆ test_3d_classes2()

void test_3d_classes2 ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in eight clusters in 3D space with centroids at the eight points

  • \((\pm 0.5, \pm 0.5, \pm 0.5)\), taking every combination of signs
Parameters
[out] data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 8;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, .5, -.5},   // centre of class 2
        {.5, -.5, .5},   // centre of class 3
        {.5, -.5, -.5},  // centre of class 4
        {-.5, .5, .5},   // centre of class 5
        {-.5, .5, -.5},  // centre of class 6
        {-.5, -.5, .5},  // centre of class 7
        {-.5, -.5, -.5}  // centre of class 8
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y,z) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
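
The eight centres in the table above follow a regular pattern: reading the zero-based class index as a 3-bit number, each set bit flips the sign of one coordinate. The following sketch, with the hypothetical helper name cube_centre, computes a centre from the index instead of looking it up. It is an illustration of the pattern, not code from this file.

```cpp
#include <array>

// Hypothetical helper: centre of the cluster with zero-based index cls
// (0..7). Bit 2 selects the sign of x, bit 1 of y and bit 0 of z; a set
// bit means -0.5, a clear bit means +0.5.
std::array<double, 3> cube_centre(int cls) {
    return {{
        (cls & 4) ? -0.5 : 0.5,
        (cls & 2) ? -0.5 : 0.5,
        (cls & 1) ? -0.5 : 0.5,
    }};
}
```

For instance, index 0 gives (0.5, 0.5, 0.5) and index 7 gives (-0.5, -0.5, -0.5), matching the first and last rows of the table.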