There are a lot of methods for fining all locations of a substring substr in a string s, and if we ever find ourselves teaching again Python, I would first emphasize the simplest algorithm to do so.
We repeatedly use the basic string object find function. If s is a string and substr is a string pattern (no wild cards!) then s.find(substr) will -1 if substr is not found in s, otherwise it will return the starting index of substr in s.
Here is our function for findind all occurence of substr in a string:
""" def findall(s, substr): """ Find all locations of substr in s. """ out = [] oldpos = 0 n = len(substr) while (1): pos = s[oldpos:].find(substr) if pos >=0: out.append(oldpos + pos) oldpos += (pos+n+1) #remove +n if you want to find occurences starting within substr also! else: break return out if __name__ == "__main__": s = 'the quick brown fox jumps over the lazy dog near the bank of the river.' sub = 'the' x= findall(s, sub) print x for v in x: print s[v:v+len(sub)]
The string find function can actually perform a search using a sprcified starting and ending indices. Thus s[oldpos:].find(substr) may be rewritten as s.find(substr, oldpos). We ask the reader to experiment by modifying the above code.
Help on built-in function find: find(...) S.find(sub [,start [,end]]) -> int Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation. Return -1 on failure.
Other methods may be searched in Google. Here are some links to get you started in learning alternative and complicated approaches:
No comments:
Post a Comment