Longest Consecutive Sequence

Longest Consecutive Sequence

December 27, 2023
medium
dynamic programming

Problem #

Given two sequences, find the length of their longest common subsequence (LCS). A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing the order of the remaining elements.

Solution #

Here is the Python code for finding the length of the longest common subsequence (LCS) between two sequences:

def lcs_length(X, Y):
    m = len(X)
    n = len(Y)

    # Create a 2D array to store lengths of LCS of subproblems
    L = [[0] * (n + 1) for i in range(m + 1)]

    # Fill L[m+1][n+1] in bottom up fashion
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])

    # L[m][n] contains the length of LCS for X[0..m-1] and Y[0..n-1]
    return L[m][n]

# Example usage
X = "AGGTAB"
Y = "GXTXAYB"
lcs_length = lcs_length(X, Y)

When applied to the sequences “AGGTAB” and “GXTXAYB”, the function returns 4, which is the length of the longest common subsequence between these two sequences. The LCS in this case includes characters “GTAB”.

Solution Analysis #

Sure, I’ll explain the code for finding the length of the longest common subsequence (LCS) between two sequences:

  1. Function Definition:

    def lcs_length(X, Y):
    

    This defines a function lcs_length which takes two strings X and Y as arguments. These strings represent the sequences for which we want to find the LCS.

  2. Determining the Lengths of the Sequences:

    m = len(X)
    n = len(Y)
    

    The lengths of the two sequences are stored in m and n. This is used to determine the size of the table for dynamic programming.

  3. Initializing the Table for Dynamic Programming:

    L = [[0] * (n + 1) for i in range(m + 1)]
    

    A 2D array L is created with dimensions (m+1) x (n+1). Each element of this table L[i][j] will eventually hold the length of the LCS between the first i characters of X and the first j characters of Y.

  4. Filling the Table:

    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i - 1] == Y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    

    The table L is filled using a nested loop.

    • If either i or j is 0, it means one of the sequences has length 0, so the LCS length is 0.
    • If X[i - 1] is equal to Y[j - 1], the characters at these positions are part of the LCS, so 1 is added to the LCS length found so far (L[i - 1][j - 1]).
    • If they are not equal, the algorithm checks both possibilities (excluding the current character from either X or Y) and takes the maximum of the two.
  5. Returning the Result:

    return L[m][n]
    

    After filling the table, the length of the LCS is found in L[m][n]. This is the length of the LCS for the entire sequences X and Y.

This dynamic programming approach ensures that all subproblems are solved only once, storing their solutions in the table L, which significantly improves efficiency, especially for larger sequences. The time complexity of this algorithm is O(mn), where m and n are the lengths of the two sequences.