Given strings "ABAB" and "BABA"...
A dynamic programming style algorithm constructs a matrix of the substring lengths.
| | | A | B | A | B | | | O | O | O | O | O | | B | O | O | 1 | O | 1 | | A | O | 1 | 0 | 2 | 0 | | B | O | 0 | 2 | 0 | 3 | | A | O | 1 | 0 | 3 | 0 |
In this matrix, the 5th column tests the substring ABA of the ABAB sequence. So cell [5, 5] compares ABA to BAB ... the lcs is BA, length 2.
The other approach is to construct a generalized suffix tree for the strings, then find the deepest internal nodes which have leaf nodes from all the strings in the subtree below it.
A Simple Suffix Tree Implementation:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
"""http://goo-apple.appspot.com/article/2e8d3c6a-2c38-48b9-96c6-240b4ded253a""" | |
class Node: | |
def __init__(self, start, substr): | |
self.start = start | |
self.substr = substr | |
self.branches = {} | |
def insert_into_tree(subroot, suffix, start): | |
prefix_len = len(subroot.substr) | |
new_suffix = str(suffix[prefix_len:]) | |
if(len(subroot.branches) == 0): | |
left_child = Node(subroot.start, "") | |
right_child = Node(start, new_suffix) | |
subroot.branches[""] = left_child | |
subroot.branches[new_suffix] = right_child | |
else: | |
for (substr, node) in subroot.branches.items(): | |
if len(substr) > 0 and new_suffix.startswith(substr): | |
insert_into_tree(node, new_suffix, start) | |
break | |
else: | |
new_child = Node(start, new_suffix) | |
subroot.branches[new_suffix] = new_child | |
def build_suffix_tree(t_str): | |
len_str = len(t_str) | |
i = len_str - 1 | |
root = Node(len_str, "") | |
while i >= 0: | |
insert_into_tree(root, str(t_str[i:]), i) | |
i -= 1 | |
return root | |
def display_all_suffix(subroot, suffix_s_prefix, level = 0): | |
if len(subroot.branches) == 0: | |
print suffix_s_prefix, level | |
return | |
for (substr, node) in subroot.branches.items(): | |
display_all_suffix(node, suffix_s_prefix + substr, level + 1) | |
root = build_suffix_tree("BCABC") | |
display_all_suffix(root, "") |
Dijkstras Algorithm for shortest path:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
Dijkstra's algorithm for shortest paths. | |
http://code.activestate.com/recipes/577343-dijkstras-algorithm-for-shortest-paths/) | |
- dijkstra(G, s) finds all shortest paths from s to each other vertex. | |
- shortest_path(G, s, t) uses dijkstra to find the shortest path from s to t. | |
The input graph G is assumed to have the following representation: | |
- A vertex can be any object that can be used as an index into a dictionary. | |
- G is a dictionary, indexed by vertices. | |
- For any vertex v, G[v] is itself a dictionary, indexed by neighbors of v. | |
- For any edge v->w, G[v][w] is the length of the edge. | |
""" | |
from priodict import priorityDictionary | |
def dijkstra(graph, start, end=None): | |
final_distances = {} | |
predecessors = {} | |
estimated_distances = priorityDictionary() | |
estimated_distances[start] = 0 | |
for vertex in estimated_distances: | |
final_distances[vertex] = estimated_distances[vertex] | |
if vertex == end: | |
break | |
for edge in graph[vertex]: | |
path_distance = final_distances[vertex] + graph[vertex][edge] | |
if edge in final_distances: | |
if path_distance < final_distances[edge]: | |
raise ValueError("Better path to already-final vertex") | |
elif edge not in estimated_distances or path_distance < estimated_distances[edge]: | |
estimated_distances[edge] = path_distance | |
predecessors[edge] = vertex | |
return final_distances, predecessors | |
def shortest_path(graph, start, end): | |
""" | |
Find a single shortest path from the given start vertex to the given end, | |
output as list of vertices in order along the shortest path. | |
""" | |
final_distances, predecessors = dijkstra(graph, start, end) | |
path = [] | |
while 1: | |
path.append(end) | |
if end == start: | |
break | |
end = predecessors[end] | |
path.reverse() | |
return path | |
if __name__ == "____main__": | |
pass | |
# G = {'s':{'u':10, 'x':5}, 'u':{'v':1, 'x':2}, 'v':{'y':4}, | |
# 'x':{'u':3, 'v':9, 'y':2}, 'y':{'s':7, 'v':6}} | |
# The shortest path from s to v is ['s', 'x', 'u', 'v'] and has length 9. |
Or, here's a more complete implementation.
No comments:
Post a Comment