9 September 2013

Algorithmathon Day 18: Longest Common Substring Revisited

We previously found the Longest Common Substring (LCS) using a trie on Day 13. Today our goal is to solve this problem a different way. This solution is O(n2) because it involves a loop within a loop. We will scan through the first string and for every character we will scan every character in the second string for a match. If a match is found we will then scan the next characters in both strings for a match. This means that the worst case scenario is when neither string has a matching sub-string. The solution should work something like this.

As always we begin with our RSpec tests.

algorithm_string_spec.rb

require 'spec_helper'

describe AlgorithmString do

  describe "Class" do
    it "should have method 'longest_common_substring'" do
      AlgorithmString.method_defined?(:longest_common_substring).should be_true
    end    
  end

  describe "Instance" do
    describe "Longest common substring method tests" do
      it "should return 'bcd' when ABCD.longest_common_substring(BCDE)  is called" do
        a_string= AlgorithmString.new('ABCD')
        a_string.longest_common_substring('BCDE').should == 'BCD'
      end

      it "should return 'try' when TRY.longest_common_substring(RETRY)  is called" do
        a_string= AlgorithmString.new('TRY')
        a_string.longest_common_substring('RETRY').should == 'TRY'
      end 

       it "should return 'FGH' when ABCDFGHXYZ.longest_common_substring(TSVPFGHARS)  is called" do
        a_string= AlgorithmString.new('ABCDFGHXYZ')
        a_string.longest_common_substring('TSVPFGHARS').should == 'FGH'
      end     
    end
  end
end

Now for our solution.

algorithm_string_spec.rb

class AlgorithmString < String

  #Find the longest common substring with the 
  #string_to_compare parameter.
  # * *Args*
  #   - +string_to_compare+ -> A string which is compared to this string instance to find the longest common substring
  # * *Returns*
  #   - A string that is the first longest common substring found. 
  def longest_common_substring(string_to_compare)
    @longest_begin = -1
    @longest_length = 0    
    unless self.length < 1 || string_to_compare.nil? || string_to_compare.length < 1
      @current_outer = 0
      while (@current_outer + @longest_length) < self.length
        @current_inner = 0
        while @current_inner < string_to_compare.length
          if self[@current_outer] == string_to_compare[@current_inner]
            process_first_character_match(string_to_compare)
            break
          end
          @current_inner = @current_inner + 1
        end
        @current_outer = @current_outer + 1
      end
    end
    self.slice(@longest_begin,@longest_length)
  end


  private

    #  Called from longest_common_substring when a first character match is found.
    #  Advances to the next character in both strings to look for a match.
    #
    #   - +string_to_compare+ -> A string which is compared to this string instance to find the longest common substring
    #
    # * *Returns*
    #   - A string that is the matching characters found. 
    def process_first_character_match(string_to_compare)
      current_begin=@current_outer
      current_length=1
      @current_outer = @current_outer + 1
      @current_inner = @current_inner + 1
      while @current_outer < self.length && @current_inner < string_to_compare.length
        if self[@current_outer] == string_to_compare[@current_inner]
            current_length=current_length + 1
        else
          break
        end
        @current_outer = @current_outer + 1
        @current_inner = @current_inner + 1
      end      
      if current_length > @longest_length
         @longest_begin=current_begin
         @longest_length = current_length
      end      
    end
end
tags: Algorithm - Algorithmathon - Longest Common Substring - RSpec - Ruby

Less Is More