There are many methods for finding all locations of a substring substr in a string s, and if we ever find ourselves teaching Python again, we would first emphasize the simplest algorithm for doing so.
We repeatedly use the basic find method of string objects. If s is a string and substr is a string pattern (no wildcards!), then s.find(substr) returns -1 if substr is not found in s; otherwise it returns the starting index of the first occurrence of substr in s.
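As a quick illustration of the return values described above, here is a minimal sketch. The sample string and the name text are our own choices for the demonstration.

```python
text = 'the quick brown fox'

print(text.find('quick'))  # 4: index where 'quick' starts
print(text.find('the'))    # 0: a match at the very beginning
print(text.find('cat'))    # -1: 'cat' does not occur in text
print(text.find('o'))      # 12: only the first 'o' (in 'brown') is reported
```

Since find reports only the first occurrence, finding every occurrence means calling it repeatedly, each time starting just past the previous match.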
Here is our function for finding all occurrences of substr in a string:
def findall(s, substr):
    """
    Find all locations of substr in s.
    """
    out = []
    if not substr:
        return out  # an empty pattern matches everywhere and would never advance
    oldpos = 0
    n = len(substr)
    while True:
        # search only the part of s that has not been scanned yet
        pos = s[oldpos:].find(substr)
        if pos >= 0:
            out.append(oldpos + pos)
            # skip past this match; use pos + 1 to also find occurrences starting within substr
            oldpos += pos + n
        else:
            break
    return out

if __name__ == "__main__":
    s = 'the quick brown fox jumps over the lazy dog near the bank of the river.'
    sub = 'the'
    x = findall(s, sub)
    print(x)
    for v in x:
        print(s[v:v+len(sub)])
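The comment in the loop about occurrences starting within substr can be made concrete with a small sketch. The function findall_step and its overlapping flag are hypothetical names introduced only for this illustration; the only difference from the function above is how far the search position advances after each match.

```python
def findall_step(s, substr, overlapping=False):
    """Hypothetical variant of findall with a switch for overlapping matches."""
    out = []
    if not substr:
        return out  # an empty pattern would never advance the search
    # advance by 1 to allow overlaps, by len(substr) to forbid them
    step = 1 if overlapping else len(substr)
    oldpos = 0
    while True:
        pos = s[oldpos:].find(substr)
        if pos < 0:
            break
        out.append(oldpos + pos)
        oldpos += pos + step
    return out

print(findall_step('aaaa', 'aa'))                    # [0, 2]
print(findall_step('aaaa', 'aa', overlapping=True))  # [0, 1, 2]
```

Which behavior is wanted depends on the application: counting words usually calls for non-overlapping matches, while scanning for repeated motifs often calls for overlapping ones.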
The string find method can also restrict its search to specified starting and ending indices. Thus s[oldpos:].find(substr) may be rewritten as s.find(substr, oldpos), which returns an index into s itself rather than into the slice, so oldpos no longer needs to be added back to the result. We ask the reader to experiment by modifying the above code.
Help on built-in function find:
find(...)
    S.find(sub [,start [,end]]) -> int
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end]. Optional
    arguments start and end are interpreted as in slice notation.
    Return -1 on failure.
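One possible result of that experiment is sketched below. The name findall_from is hypothetical, and this is only one way to do the rewrite: because find is given a start index, it avoids building a new slice on every pass and returns positions in s directly.

```python
def findall_from(s, substr):
    """Hypothetical rewrite of findall that passes a start index to find."""
    out = []
    if not substr:
        return out  # an empty pattern would never advance the search
    pos = s.find(substr)
    while pos >= 0:
        out.append(pos)
        # resume just past the current match; pos is already an index into s
        pos = s.find(substr, pos + len(substr))
    return out

s = 'the quick brown fox jumps over the lazy dog near the bank of the river.'
print(findall_from(s, 'the'))  # [0, 31, 49, 61]

# start and end act like slice bounds: the match must fit inside s[start:end]
print(s.find('the', 1, 34))    # 31
print(s.find('the', 1, 33))    # -1
```

The last two calls show the slice-style meaning of end: the occurrence at index 31 covers indices 31 to 33, so it is found when end is 34 but not when end is 33.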
Other methods can be found by searching Google. Here are some links to get you started on alternative, more complicated approaches: