Finding the nth Instance of a Substring
Problem
Given two strings source and pattern, you want to find the nth occurrence of pattern in source.
Solution
Use the find member function to locate successive instances of the substring you are looking for. Example 4-17 contains a simple nthSubstr function.
Example 4-17. Locate the nth instance of a substring
#include <string>
#include <iostream>
using namespace std;
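// Return the index of the nth occurrence of p in s, or -1 if there is none
int nthSubstr(int n, const string& s,
              const string& p) {
   string::size_type i = s.find(p);     // Find the first occurrence
   int j;
   for (j = 1; j < n && i != string::npos; ++j)
      i = s.find(p, i+1);               // Find the next occurrence
   if (j == n && i != string::npos)
      return(static_cast<int>(i));      // Convert the index explicitly
   else
      return(-1);
}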
int main( ) {
   string s = "the wind, the sea, the sky, the trees";
   string p = "the";
   cout << nthSubstr(1, s, p) << '\n';
   cout << nthSubstr(2, s, p) << '\n';
   cout << nthSubstr(5, s, p) << '\n';
}
Discussion
There are a couple of improvements you can make to nthSubstr as it is presented in Example 4-17. First, you can make it generic by making it a function template instead of an ordinary function. Second, you can add a parameter to account for substrings that may or may not overlap with themselves. By "overlap," I mean that the beginning of the string matches part of the end of the same string, as in the word "abracadabra," where the last four characters are the same as the first four. Example 4-18 demonstrates this.
Example 4-18. An improved version of nthSubstr
#include <string>
#include <iostream>
using namespace std;
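template<typename T>
int nthSubstrg(int n, const basic_string<T>& s,
               const basic_string<T>& p,
               bool repeats = false) {
   // Use the size_type of basic_string<T>, not of string
   typename basic_string<T>::size_type i = s.find(p);
   // Step by one character if matches may overlap, else by a whole pattern
   typename basic_string<T>::size_type adv = (repeats) ? 1 : p.length( );
   int j;
   for (j = 1; j < n && i != basic_string<T>::npos; ++j)
      i = s.find(p, i+adv);
   if (j == n && i != basic_string<T>::npos)
      return(static_cast<int>(i));
   else
      return(-1);
}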
int main( ) {
   string s = "AGATGCCATATATATACGATATCCTTA";
   string p = "ATAT";
   cout << p << " as non-repeating occurs at "
        << nthSubstrg(3, s, p) << '\n';
   cout << p << " as repeating occurs at "
        << nthSubstrg(3, s, p, true) << '\n';
}
The output for the strings in Example 4-18 is as follows:
ATAT as non-repeating occurs at 18
ATAT as repeating occurs at 11