📅  Last modified: 2023-12-03 15:12:42.464000             🧑  Author: Mango
This question is related to computer programming and algorithm design. It requires a good understanding of dynamic programming and graph algorithms.
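Given a set of strings, the task is to find the longest subsequence that occurs in all of the strings. A subsequence of a string is obtained by deleting zero or more characters without changing the order of the remaining ones. For example, let's consider the following set of strings: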
{"abcd", "abbd", "bcdc", "bd"}
The longest subsequence that is present in all of these strings is "bd".
The input is a set of strings and the output should be the longest common subsequence.
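To make the problem statement concrete, here is a minimal brute-force sketch that is only suitable for tiny inputs. The helper names `is_subsequence` and `brute_force_lcs` are illustrative and not part of the original solution. It tries the subsequences of the shortest string, longest first, and returns the first one that occurs in every string.

```python
from itertools import combinations

def is_subsequence(sub, s):
    """Return True if `sub` can be obtained from `s` by deleting characters."""
    it = iter(s)
    # `ch in it` advances the iterator, so matches must occur in order.
    return all(ch in it for ch in sub)

def brute_force_lcs(strings):
    """Exponential-time reference: try subsequences of the shortest string."""
    strings = list(strings)
    if not strings:
        return ""
    shortest = min(strings, key=len)
    # Any common subsequence is a subsequence of the shortest string.
    for length in range(len(shortest), 0, -1):
        for positions in combinations(range(len(shortest)), length):
            candidate = "".join(shortest[p] for p in positions)
            if all(is_subsequence(candidate, s) for s in strings):
                return candidate
    return ""

print(brute_force_lcs(["abcd", "abbd", "bcdc", "bd"]))  # prints "bd"
```

Because the running time is exponential in the length of the shortest string, this is useful mainly as a cross-check for a faster implementation.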
To solve this problem, we can use dynamic programming. Let's define our subproblems as follows:
LCS(s1, s2, ..., sn; i, j, ..., k) = longest common subsequence of the prefix strings s1[1:i], s2[1:j], ..., sn[1:k]
Our goal is to compute LCS(s1, s2, ..., sn; len(s1), len(s2), ..., len(sn)). To do this, we first define the base case:
LCS(s1, s2, ..., sn; i, j, ..., k) = "" whenever at least one of i, j, ..., k is 0
That is, as soon as one of the prefixes is empty, the longest common subsequence is the empty string. Using these definitions, we can define the recursive formula:
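If s1[i], s2[j], ..., sn[k] are all the same character c:
LCS(s1, s2, ..., sn; i, j, ..., k) = LCS(s1, s2, ..., sn; i-1, j-1, ..., k-1) + c # Match case
Otherwise:
LCS(s1, s2, ..., sn; i, j, ..., k) = LCS(s1, s2, ..., sn; i-1, j, ..., k) or # Case 1
LCS(s1, s2, ..., sn; i, j-1, ..., k) or # Case 2
...
LCS(s1, s2, ..., sn; i, j, ..., k-1) # Case n
where the 'or' operator selects the longer of its operands, and there is one case per string, each shortening a single prefix by one character. The match case is the only step that makes the result grow; without it, every entry would reduce to the empty base case.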
Evaluated naively, this recursion is expensive because the same subproblems are recomputed many times. To avoid this, we can use a dynamic programming table to store the result of each subproblem once. The table should have dimensions (len(s1)+1) x (len(s2)+1) x ... x (len(sn)+1), so both time and memory grow with the product of the string lengths.
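The idea of storing each subproblem's result can be sketched top-down with memoization. This is an illustrative sketch rather than the final solution: the name `lcs_memo` is hypothetical, and `functools.lru_cache` plays the role of the table. It also handles the case where all prefixes end in the same character, which a complete recurrence needs in order to build a non-empty result.

```python
from functools import lru_cache

def lcs_memo(strings):
    """Top-down LCS of several strings; each index tuple is solved once."""
    strings = tuple(strings)
    if not strings:
        return ""

    @lru_cache(maxsize=None)
    def solve(idx):
        # idx[d] is the length of the prefix taken from strings[d].
        if 0 in idx:
            return ""  # some prefix is empty
        last = {s[i - 1] for s, i in zip(strings, idx)}
        if len(last) == 1:
            # All prefixes end in the same character: keep it.
            return solve(tuple(i - 1 for i in idx)) + last.pop()
        # Otherwise shorten one prefix at a time and keep the longest result.
        return max(
            (solve(idx[:d] + (idx[d] - 1,) + idx[d + 1:]) for d in range(len(idx))),
            key=len,
        )

    return solve(tuple(len(s) for s in strings))

print(lcs_memo(["abcd", "abbd", "bcdc", "bd"]))  # prints "bd"
```

The recursion depth is bounded by the sum of the string lengths, so for long inputs an iterative table fill is the safer choice in Python.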
To populate the table, we iterate over all possible values of i, j, ..., k in increasing order and fill in each entry using the recursive formula above. We then return the value of LCS(s1, s2, ..., sn; len(s1), len(s2), ..., len(sn)) as the longest common subsequence of all the strings.
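As a small illustration of the table shape and the iteration order, the sketch below enumerates every index tuple for the example strings. Using `itertools.product` for this is an assumption of the sketch, chosen as one convenient way to visit the entries so that each entry's predecessors come first.

```python
from itertools import product
from math import prod

strings = ["abcd", "abbd", "bcdc", "bd"]

# One table axis per string, with len(s) + 1 positions (prefix lengths 0..len(s)).
shape = [len(s) + 1 for s in strings]
cell_count = prod(shape)

# product() yields index tuples in lexicographic order, so any tuple with one
# coordinate decreased by 1 is visited before the tuple itself.
order = list(product(*(range(size) for size in shape)))

print(shape)                # [5, 5, 5, 3]
print(cell_count)           # 375
print(order[0], order[-1])  # (0, 0, 0, 0) (4, 4, 4, 2)
```

Even for these four short strings the table has 375 entries, which shows how quickly the approach becomes costly as strings are added or lengthened.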
from itertools import product

def LCS(strings):
    """Return one longest common subsequence of all strings in `strings`."""
    strings = list(strings)
    if not strings:
        return ""
    n = len(strings)
    # dp[idx] holds an LCS of the prefixes strings[d][:idx[d]] for every d.
    dp = {}
    # product() yields index tuples in lexicographic order, so every entry
    # that idx depends on has already been filled in.
    for idx in product(*(range(len(s) + 1) for s in strings)):
        if 0 in idx:
            dp[idx] = ""  # base case: at least one prefix is empty
            continue
        last = {s[i - 1] for s, i in zip(strings, idx)}
        if len(last) == 1:
            # All prefixes end in the same character: extend the diagonal entry.
            prev = tuple(i - 1 for i in idx)
            dp[idx] = dp[prev] + strings[0][idx[0] - 1]
        else:
            # Otherwise shorten one prefix at a time and keep the longest result.
            candidates = (dp[idx[:d] + (idx[d] - 1,) + idx[d + 1:]] for d in range(n))
            dp[idx] = max(candidates, key=len)
    return dp[tuple(len(s) for s in strings)]
Note: The code above is written in Python. Instead of a nested list with one dimension per string, it stores the table in a dictionary keyed by the tuple of prefix lengths (i, j, ..., k), which works for any number of strings. Each entry holds the subsequence itself, so no separate recovery step is needed, at the cost of extra memory. The LCS function takes a list of strings and returns a longest common subsequence of them; for the example above, LCS(["abcd", "abbd", "bcdc", "bd"]) returns "bd".