Algorithms_in_C++  1.0.0
Set of algorithms implemented in C++.
kohonen_som_topology.cpp File Reference

Kohonen self organizing map (topological map)

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <valarray>
#include <vector>

Namespaces

 machine_learning
 Machine learning algorithms.
 

Macros

#define _USE_MATH_DEFINES
 
#define MIN_DISTANCE   1e-4
 Minimum average distance of image nodes.
 

Functions

double _random (double a, double b)
 
int save_2d_data (const char *fname, const std::vector< std::valarray< double >> &X)
 
void get_min_2d (const std::vector< std::valarray< double >> &X, double *val, int *x_idx, int *y_idx)
 
int machine_learning::save_u_matrix (const char *fname, const std::vector< std::vector< std::valarray< double >>> &W)
 
double machine_learning::update_weights (const std::valarray< double > &X, std::vector< std::vector< std::valarray< double >>> *W, std::vector< std::valarray< double >> *D, double alpha, int R)
 
void machine_learning::kohonen_som (const std::vector< std::valarray< double >> &X, std::vector< std::vector< std::valarray< double >>> *W, double alpha_min)
 
void test_2d_classes (std::vector< std::valarray< double >> *data)
 
void test1 ()
 
void test_3d_classes1 (std::vector< std::valarray< double >> *data)
 
void test2 ()
 
void test_3d_classes2 (std::vector< std::valarray< double >> *data)
 
void test3 ()
 
double get_clock_diff (clock_t start_t, clock_t end_t)
 
int main (int argc, char **argv)
 

Detailed Description

Kohonen self organizing map (topological map)

Author
Krishna Vedala

This example implements a powerful unsupervised learning algorithm called a self organizing map (SOM). The algorithm creates a connected network of weights that closely follows the given data points. It thereby creates a topological map of the data, i.e., it preserves the relationships between the various data points in a much higher-dimensional space by creating an equivalent in a 2-dimensional space.

[Figure: Trained topological maps for the test cases in the program]
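
As a rough illustration of the idea described above, the following minimal sketch performs one training step on a 2D grid of weight vectors: it finds the best matching unit (the node closest to the sample) and pulls that node and its grid neighbours toward the sample. The names `Grid` and `som_step`, the square neighbourhood, and the uniform learning rate are assumptions made for this illustration; this is not the `update_weights` implementation of this file.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <valarray>
#include <vector>

// Hypothetical alias for illustration: a 2D grid of weight vectors.
using Grid = std::vector<std::vector<std::valarray<double>>>;

// One simplified SOM training step (illustrative only).
// Returns the squared distance between the sample and its best matching
// unit, measured before the weights are updated.
double som_step(const std::valarray<double> &x, Grid *W, double alpha, int R) {
    assert(W != nullptr && !W->empty() && !(*W)[0].empty());
    const int rows = static_cast<int>(W->size());
    const int cols = static_cast<int>((*W)[0].size());

    // 1. Find the best matching unit: the node closest to the sample.
    double best = std::numeric_limits<double>::max();
    int bi = 0, bj = 0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            std::valarray<double> diff = (*W)[i][j] - x;
            diff *= diff;  // element-wise square
            const double d = diff.sum();
            if (d < best) {
                best = d;
                bi = i;
                bj = j;
            }
        }
    }

    // 2. Move every node within R grid cells of that unit toward the sample.
    for (int i = std::max(0, bi - R); i <= std::min(rows - 1, bi + R); i++) {
        for (int j = std::max(0, bj - R); j <= std::min(cols - 1, bj + R); j++) {
            std::valarray<double> delta = x - (*W)[i][j];
            delta *= alpha;
            (*W)[i][j] += delta;
        }
    }
    return best;
}
```

Calling `som_step` repeatedly over all samples while shrinking `alpha` and `R` is the usual shape of SOM training; the schedule used for those two values is a design choice that this sketch leaves open.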

Note
This C++ version of the program is considerably slower than its C counterpart.
The compiled code is much slower when compiled with MS Visual C++ 2019 than with GCC on Windows.
See also
kohonen_som_trace.cpp

Function Documentation

◆ get_clock_diff()

double get_clock_diff ( clock_t  start_t,
clock_t  end_t 
)

Convert clock cycle difference to time in seconds

Parameters
[in]  start_t  start clock
[in]  end_t    end clock
Returns
time difference in seconds
{
    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
}
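
A minimal usage sketch: time an arbitrary piece of work with `std::clock` and convert the tick difference to seconds. The helper `time_call` is a hypothetical name introduced for this example. `std::clock` is specified to report processor time rather than wall-clock time, so the result may differ from the elapsed real time.

```cpp
#include <cassert>
#include <ctime>

// Same conversion as documented above: clock ticks to seconds.
double get_clock_diff(std::clock_t start_t, std::clock_t end_t) {
    return static_cast<double>(end_t - start_t) / CLOCKS_PER_SEC;
}

// Hypothetical helper: run `work` once and return the time it took, in seconds.
template <typename F>
double time_call(F work) {
    const std::clock_t start = std::clock();
    // std::clock returns (std::clock_t)(-1) when processor time is unavailable.
    assert(start != static_cast<std::clock_t>(-1));
    work();
    const std::clock_t end = std::clock();
    return get_clock_diff(start, end);
}
```

For example, `time_call([] { test1(); })` would measure the first test in the same way `main()` does.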

◆ main()

int main ( int  argc,
char **  argv 
)

Main function

{
#ifdef _OPENMP
    std::cout << "Using OpenMP based parallelization\n";
#else
    std::cout << "NOT using OpenMP based parallelization\n";
#endif

    std::srand(std::time(nullptr));

    std::clock_t start_clk = std::clock();
    test1();
    auto end_clk = std::clock();
    std::cout << "Test 1 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test2();
    end_clk = std::clock();
    std::cout << "Test 2 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    start_clk = std::clock();
    test3();
    end_clk = std::clock();
    std::cout << "Test 3 completed in " << get_clock_diff(start_clk, end_clk)
              << " sec\n";

    std::cout
        << "(Note: Calculated times include: creating test sets, training "
           "model and writing files to disk.)\n\n";
    return 0;
}

◆ test1()

void test1 ( )

Test that creates a random set of points distributed in four clusters along the circumference of a circle and trains a SOM that finds that circular pattern. The following CSV files are created to validate the execution:

  • test1.csv: random test sample points with a circular pattern
  • w11.csv: initial random map
  • w12.csv: trained SOM map
{
    int j, N = 300;
    int features = 2;
    int num_out = 30;
    std::vector<std::valarray<double>> X(N);                     // test samples
    std::vector<std::vector<std::valarray<double>>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_2d_classes(&X);  // create test data around circumference of a circle
    save_2d_data("test1.csv", X);  // save test data points
    save_u_matrix("w11.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w12.csv", W);   // save the resultant weights
}
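
The allocation loop above is repeated in each of the three tests. One possible way to factor it out is sketched below. Both names are assumptions: `make_random_weights` is a hypothetical helper, and `random_between` is a stand-in for this file's `_random(a, b)`, whose body is not shown in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <valarray>
#include <vector>

// Stand-in for _random(a, b); assumed here to return a value in [a, b].
double random_between(double a, double b) {
    return a + (b - a) * (std::rand() / static_cast<double>(RAND_MAX));
}

// Hypothetical helper: build a num_out x num_out grid of weight vectors,
// each with `features` components drawn from [lo, hi].
std::vector<std::vector<std::valarray<double>>> make_random_weights(
    int num_out, int features, double lo, double hi) {
    assert(num_out > 0 && features > 0 && lo <= hi);
    std::vector<std::vector<std::valarray<double>>> W(
        num_out, std::vector<std::valarray<double>>(
                     num_out, std::valarray<double>(features)));
    for (auto &row : W) {
        for (auto &node : row) {
            for (std::size_t j = 0; j < node.size(); j++) {
                node[j] = random_between(lo, hi);
            }
        }
    }
    return W;
}
```

With such a helper, each test could obtain its initial map with a single call such as `make_random_weights(30, 2, -10, 10)`.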

◆ test2()

void test2 ( )

Test that creates a random set of points distributed in four clusters in 3D space and trains a SOM that finds the topological pattern. The following CSV files are created to validate the execution:

  • test2.csv: random test sample points in four 3D clusters
  • w21.csv: initial random map
  • w22.csv: trained SOM map
{
    int j, N = 300;
    int features = 3;
    int num_out = 30;
    std::vector<std::valarray<double>> X(N);                     // test samples
    std::vector<std::vector<std::valarray<double>>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_3d_classes1(&X);  // create test data in four clusters in 3D space
    save_2d_data("test2.csv", X);  // save test data points
    save_u_matrix("w21.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w22.csv", W);   // save the resultant weights
}

◆ test3()

void test3 ( )

Test that creates a random set of points distributed in eight clusters in 3D space and trains a SOM that finds the topological pattern. The following CSV files are created to validate the execution:

  • test3.csv: random test sample points in eight 3D clusters
  • w31.csv: initial random map
  • w32.csv: trained SOM map
{
    int j, N = 500;
    int features = 3;
    int num_out = 30;
    std::vector<std::valarray<double>> X(N);                     // test samples
    std::vector<std::vector<std::valarray<double>>> W(num_out);  // SOM weights
    for (int i = 0; i < std::max(num_out, N); i++) {
        // loop till max(N, num_out)
        if (i < N)  // only add new arrays if i < N
            X[i] = std::valarray<double>(features);
        if (i < num_out) {  // only add new arrays if i < num_out
            W[i] = std::vector<std::valarray<double>>(num_out);
            for (int k = 0; k < num_out; k++) {
                W[i][k] = std::valarray<double>(features);
#ifdef _OPENMP
#pragma omp for
#endif
                for (j = 0; j < features; j++)
                    // preallocate with random initial weights
                    W[i][k][j] = _random(-10, 10);
            }
        }
    }

    test_3d_classes2(&X);  // create test data in eight clusters in 3D space
    save_2d_data("test3.csv", X);  // save test data points
    save_u_matrix("w31.csv", W);   // save initial random weights
    kohonen_som(X, &W, 1e-4);      // train the SOM
    save_u_matrix("w32.csv", W);   // save the resultant weights
}

◆ test_2d_classes()

void test_2d_classes ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in four clusters in 2D space with centroids at the points

  • \((0.5, 0.5)\)
  • \((0.5, -0.5)\)
  • \((-0.5, 0.5)\)
  • \((-0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][2] = {
        // centres of each class cluster
        {.5, .5},   // centre of class 1
        {.5, -.5},  // centre of class 2
        {-.5, .5},  // centre of class 3
        {-.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);

        /* The following can also be used
        for (int j = 0; j < 2; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
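
The same pattern generalizes to any number of centres and dimensions. The sketch below is one way to write it, assuming a hypothetical function `fill_clusters` and using `std::mt19937` with the standard distributions in place of `std::rand` and `_random`. Because each coordinate is drawn independently from the interval around the centre, every cluster fills an axis-aligned box of half-width R rather than a circle or sphere.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <valarray>
#include <vector>

// Hypothetical generalization of the test data generators: every point in
// `data` is placed within +/- R (per coordinate) of a randomly chosen centre.
void fill_clusters(const std::vector<std::valarray<double>> &centres, double R,
                   std::vector<std::valarray<double>> *data,
                   std::mt19937 *rng) {
    assert(!centres.empty() && R > 0 && data != nullptr && rng != nullptr);
    std::uniform_int_distribution<std::size_t> pick(0, centres.size() - 1);
    std::uniform_real_distribution<double> offset(-R, R);
    for (auto &point : *data) {
        // select a random class for the point
        const std::valarray<double> &c = centres[pick(*rng)];
        point.resize(c.size());
        // create random coordinates around the centre of the class
        for (std::size_t j = 0; j < c.size(); j++) {
            point[j] = c[j] + offset(*rng);
        }
    }
}
```

Passing the generator explicitly keeps the function reproducible for a fixed seed, which can make the generated CSV files easier to compare between runs.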

◆ test_3d_classes1()

void test_3d_classes1 ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in four clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.3;  // radius of cluster
    int i;
    const int num_classes = 4;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, -.5, -.5},  // centre of class 2
        {-.5, .5, .5},   // centre of class 3
        {-.5, -.5, -.5}  // centre of class 4
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y,z) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}
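
When writing centre tables like the one above, a dropped comma is easy to miss: `-.5 - .5` is a valid expression, so such a row still compiles, and the omitted trailing element is zero-initialized. The sketch below shows the effect and one way to guard against it. The names `row_typo`, `row_ok`, `is_half_unit_corner`, and `validate_centres` are hypothetical.

```cpp
#include <cassert>

// Row written with a missing comma: only two initializers for three slots.
const double row_typo[3] = {-.5, -.5 - .5};  // becomes {-0.5, -1.0, 0.0}
// Row as intended.
const double row_ok[3] = {-.5, -.5, -.5};

// Returns true if every coordinate of the row is exactly +0.5 or -0.5.
bool is_half_unit_corner(const double (&row)[3]) {
    for (double v : row) {
        if (v != 0.5 && v != -0.5) return false;
    }
    return true;
}

// Debug-time check over a whole table of n centres.
void validate_centres(const double (*centres)[3], int n) {
    for (int i = 0; i < n; i++) {
        assert(is_half_unit_corner(centres[i]));
    }
}
```

A call such as `validate_centres(centres, num_classes)` before the generation loop would catch a malformed row in debug builds.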

◆ test_3d_classes2()

void test_3d_classes2 ( std::vector< std::valarray< double >> *  data)

Creates a random set of points distributed in eight clusters in 3D space with centroids at the points

  • \((0.5, 0.5, 0.5)\)
  • \((0.5, 0.5, -0.5)\)
  • \((0.5, -0.5, 0.5)\)
  • \((0.5, -0.5, -0.5)\)
  • \((-0.5, 0.5, 0.5)\)
  • \((-0.5, 0.5, -0.5)\)
  • \((-0.5, -0.5, 0.5)\)
  • \((-0.5, -0.5, -0.5)\)
Parameters
[out]  data  matrix to store data in
{
    const int N = data->size();
    const double R = 0.2;  // radius of cluster
    int i;
    const int num_classes = 8;
    const double centres[][3] = {
        // centres of each class cluster
        {.5, .5, .5},    // centre of class 1
        {.5, .5, -.5},   // centre of class 2
        {.5, -.5, .5},   // centre of class 3
        {.5, -.5, -.5},  // centre of class 4
        {-.5, .5, .5},   // centre of class 5
        {-.5, .5, -.5},  // centre of class 6
        {-.5, -.5, .5},  // centre of class 7
        {-.5, -.5, -.5}  // centre of class 8
    };

#ifdef _OPENMP
#pragma omp for
#endif
    for (i = 0; i < N; i++) {
        // select a random class for the point
        int cls = std::rand() % num_classes;

        // create random coordinates (x,y,z) around the centre of the class
        data[0][i][0] = _random(centres[cls][0] - R, centres[cls][0] + R);
        data[0][i][1] = _random(centres[cls][1] - R, centres[cls][1] + R);
        data[0][i][2] = _random(centres[cls][2] - R, centres[cls][2] + R);

        /* The following can also be used
        for (int j = 0; j < 3; j++)
            data[0][i][j] = _random(centres[cls][j] - R, centres[cls][j] + R);
        */
    }
}