Algorithms_in_C++  1.0.0
Set of algorithms implemented in C++.
minimum_edit_distance.cpp File Reference

Implementation of Minimum Edit Distance using Dynamic Programing. More...

#include <cassert>
#include <iostream>
#include <vector>
Include dependency graph for minimum_edit_distance.cpp:

Namespaces

 dynamic_programming
 Dynamic Programming algorithms.
 
 Minimum
 Implementation of Minimum Edit Distance algorithm.
 

Functions

uint64_t dynamic_programming::minimum_edit_distance::min (uint64_t x, uint64_t y, uint64_t z)
 Takes input of the cost of three operations: Insert, Replace and Delete and return the minimum cost among them. More...
 
uint64_t dynamic_programming::minimum_edit_distance::editDistDP (std::string str1, std::string str2, uint64_t m, uint64_t n)
 Calculates and stores the result of all the sub-problems, so that we don't have to recur to compute the minimum cost of a particular operation if it is already computed and stored in the dp vector. More...
 
static void test ()
 Self-test implementations. More...
 
int main (int argc, char *argv[])
 main function More...
 

Detailed Description

Implementation of Minimum Edit Distance using Dynamic Programing.

Given two strings str1 & str2 and we have to calculate the minimum number of operations (Insert, Remove, Replace) required to convert str1 to str2.

Algorithm

We will solve this problem using Naive recursion. But as we are approaching with a DP solution. So, we will take a DP array to store the solution of all sub-problems so that we don't have to perform recursion again and again. Now to solve the problem, We can traverse all characters from either right side of the strings or left side. Suppose we will do it from the right side. So, there are two possibilities for every pair of characters being traversed.

  1. If the last characters of two strings are the same, Ignore the characters and get the count for the remaining string. So, we get the solution for lengths m-1 and n-1 in a DP array.
  2. Else, (If last characters are not the same), we will consider all three operations (Insert, Remove, Replace) on the last character of the first string and compute the minimum cost for all three operations and take the minimum of three values in the DP array. For Insert: Recur for m and n-1 For Remove: Recur for for m-1 and n For Replace: Recur for for m-1 and n-1
Author
Nirjas Jakilim

Function Documentation

◆ editDistDP()

uint64_t dynamic_programming::minimum_edit_distance::editDistDP ( std::string  str1,
std::string  str2,
uint64_t  m,
uint64_t  n 
)

Calculates and stores the result of all the sub-problems, so that we don't have to recur to compute the minimum cost of a particular operation if it is already computed and stored in the dp vector.

Parameters
dpvector to store the computed minimum costs
str1to pass the 1st string
str2to pass the 2nd string
mthe length of str1
nthe length of str2
Returns
dp[m][n] the minimum cost of operations needed to convert str1 to str2

Create a table to store results of subproblems

creasting 2D vector dp to store the results of subproblems

Fill d[][] in bottom up manner

If first string is empty, only option is to insert all characters of second string

Minimum operations = j

If second string is empty, only option is to remove all characters of second string

Minimum operations = i

If last characters are same, ignore last char and recur for remaining string

If the last character is different, consider all possibilities and find the minimum

returning the minimum cost of operations needed to convert str1 to str2

88  {
89  /// Create a table to store results of subproblems
90  std::vector<std::vector<uint64_t>>dp(m+1, std::vector<uint64_t>(n+1)); /// creasting 2D vector dp to store the results of subproblems
91 
92  /// Fill d[][] in bottom up manner
93  for (uint64_t i = 0; i <= m; i++) {
94  for (uint64_t j = 0; j <= n; j++) {
95  /// If first string is empty, only option is to
96  /// insert all characters of second string
97  if (i == 0) {
98  dp[i][j] = j; /// Minimum operations = j
99  }
100 
101  /// If second string is empty, only option is to
102  /// remove all characters of second string
103  else if (j == 0) {
104  dp[i][j] = i; /// Minimum operations = i
105  }
106 
107  /// If last characters are same, ignore last char
108  /// and recur for remaining string
109  else if (str1[i - 1] == str2[j - 1]) {
110  dp[i][j] = dp[i - 1][j - 1];
111  }
112 
113  /// If the last character is different, consider all
114  /// possibilities and find the minimum
115  else {
116  dp[i][j] = 1 + min(dp[i][j - 1], // Insert
117  dp[i - 1][j], // Remove
118  dp[i - 1][j - 1]); // Replace
119  }
120  }
121  }
122 
123  return dp[m][n]; /// returning the minimum cost of operations needed to convert str1 to str2
124 }
T min(T... args)

◆ main()

int main ( int  argc,
char *  argv[] 
)

main function

Parameters
argccommandline argument count (ignored)
argvcommandline array of arguments (ignored)
Returns
0 on exit
160  {
161  test(); // run self-test implementations
162  return 0;
163 }
static void test()
Self-test implementations.
Definition: minimum_edit_distance.cpp:132
Here is the call graph for this function:

◆ min()

uint64_t dynamic_programming::minimum_edit_distance::min ( uint64_t  x,
uint64_t  y,
uint64_t  z 
)

Takes input of the cost of three operations: Insert, Replace and Delete and return the minimum cost among them.

Parameters
xused to pass minimum cost of Insert operations
yused to pass minimum cost of Replace operations
zused to pass minimum cost of Delete operations
Returns
x if x is the minimum value
y if y is the minimum value
z if z is the minimum value

returns x, if x is the minimum value

returns y, if y is the minimum value

returns z if z is the minimum value

63  {
64  if (x <= y && x <= z) {
65  return x; /// returns x, if x is the minimum value
66  }
67  if (y <= x && y <= z) {
68  return y; /// returns y, if y is the minimum value
69  }
70  else {
71  return z; /// returns z if z is the minimum value
72  }
73 }

◆ test()

static void test ( )
static

Self-test implementations.

Returns
void
132  {
133  // 1st test
134  std::string str1 = "INTENTION"; // Sample input of 1st string
135  std::string str2 = "EXECUTION"; // Sample input of 2nd string
136  uint64_t expected_output1 = 5; // Expected minimum cost
138  str1, str2, str1.length(), str2.length()); // calling the editDistDP function and storing the result on output1
139  assert(output1 == expected_output1); // comparing the output with the expected output
140  std::cout << "Minimum Number of Operations Required: " << output1
141  << std::endl;
142 
143  // 2nd test
144  std::string str3 = "SATURDAY";
145  std::string str4 = "SUNDAY";
146  uint64_t expected_output2 = 3;
148  str3, str4, str3.length(), str4.length());
149  assert(output2 == expected_output2);
150  std::cout << "Minimum Number of Operations Required: " << output2
151  << std::endl;
152 }
T endl(T... args)
uint64_t editDistDP(std::string str1, std::string str2, uint64_t m, uint64_t n)
Calculates and stores the result of all the sub-problems, so that we don't have to recur to compute t...
Definition: minimum_edit_distance.cpp:88
T length(T... args)
Here is the call graph for this function: