Z算法(线性时间模式搜索算法)

该算法在线性时间内查找文本中所有出现的模式。假设文本长度为n,模式长度为m,则所花费的总时间为O(m+n),具有线性空间复杂度。现在我们可以看到,时间和空间复杂度都和KMP算法一样,但这个算法更容易理解。 在该算法中,我们构造了一个Z数组。

null

什么是Z阵列? 对于字符串str[0..n-1],Z数组的长度与字符串相同。Z数组的元素Z[i]存储从str[i]开始的最长子串的长度,str[i]也是str[0..n-1]的前缀。Z数组的第一个条目意义较小,因为完整字符串总是其自身的前缀。

Example:Index            0   1   2   3   4   5   6   7   8   9  10  11 Text             a   a   b   c   a   a   b   x   a   a   a   zZ values         X   1   0   0   3   1   0   0   2   2   1   0 
More Examples:str  = "aaaaaa"Z[]  = {x, 5, 4, 3, 2, 1}str = "aabaacd"Z[] = {x, 1, 0, 2, 1, 0, 0}str = "abababab"Z[] = {x, 0, 6, 0, 4, 0, 2, 0} 

Z数组如何有助于在线性时间内搜索模式? 其思想是连接模式和文本,并创建一个字符串“P$T”,其中P是模式,$是一个特殊字符,不应出现在模式和文本中,而T是文本。为连接的字符串构建Z数组。在Z数组中,若任意点的Z值等于图案长度,则该点上存在图案。

Example:Pattern P = "aab",  Text T = "baabaa"The concatenated string is = "aab$baabaa"Z array for above concatenated string is {x, 1, 0, 0, 0,                                           3, 1, 0, 2, 1}.Since length of pattern is 3, the value 3 in Z array indicates presence of pattern. 

如何构造Z数组? 一个简单的解决方案是运行两个嵌套循环,外部循环指向每个索引,内部循环查找与从当前索引开始的子字符串匹配的最长前缀的长度。该解的时间复杂度为O(n) 2. ). 我们可以在线性时间内构造Z阵列。

The idea is to maintain an interval [L, R] which is the interval with max Rsuch that [L,R] is prefix substring (substring which is also prefix). Steps for maintaining this interval are as follows – 1) If i > R then there is no prefix substring that starts before i and    ends after i, so we reset L and R and compute new [L,R] by comparing    str[0..] to str[i..] and get Z[i] (= R-L+1).2) If i <= R then let K = i-L,  now Z[i] >= min(Z[K], R-i+1)  because    str[i..] matches with str[K..] for atleast R-i+1 characters (they are in   [L,R] interval which we know is a prefix substring).        Now two sub cases arise –       a) If Z[K] < R-i+1  then there is no prefix substring starting at          str[i] (otherwise Z[K] would be larger)  so  Z[i] = Z[K]  and          interval [L,R] remains same.      b) If Z[K] >= R-i+1 then it is possible to extend the [L,R] interval         thus we will set L as i and start matching from str[R]  onwards  and         get new R then we will update interval [L,R] and calculate Z[i] (=R-L+1).

为了更好地理解上述步骤,请查看此动画—— http://www.utdallas.edu/~besp/demo/John2010/z-algorithm。htm 该算法在线性时间内运行,因为我们从不比较小于R的字符,而通过匹配,我们将R增加1,因此最多有T个比较。在不匹配的情况下,每个i只发生一次不匹配(因为R停止),这是另一个最多T的比较,使得整体线性复杂度增加。

下面是用于模式搜索的Z算法的实现。

C++

// A C++ program that implements Z algorithm for pattern searching
#include<iostream>
using namespace std;
void getZarr(string str, int Z[]);
// prints all occurrences of pattern in text using Z algo
void search(string text, string pattern)
{
// Create concatenated string "P$T"
string concat = pattern + "$" + text;
int l = concat.length();
// Construct Z array
int Z[l];
getZarr(concat, Z);
// now looping through Z array for matching condition
for ( int i = 0; i < l; ++i)
{
// if Z[i] (matched region) is equal to pattern
// length we got the pattern
if (Z[i] == pattern.length())
cout << "Pattern found at index "
<< i - pattern.length() -1 << endl;
}
}
// Fills Z array for given string str[]
void getZarr(string str, int Z[])
{
int n = str.length();
int L, R, k;
// [L,R] make a window which matches with prefix of s
L = R = 0;
for ( int i = 1; i < n; ++i)
{
// if i>R nothing matches so we will calculate.
// Z[i] using naive way.
if (i > R)
{
L = R = i;
// R-L = 0 in starting, so it will start
// checking from 0'th index. For example,
// for "ababab" and i = 1, the value of R
// remains 0 and Z[i] becomes 0. For string
// "aaaaaa" and i = 1, Z[i] and R become 5
while (R<n && str[R-L] == str[R])
R++;
Z[i] = R-L;
R--;
}
else
{
// k = i-L so k corresponds to number which
// matches in [L,R] interval.
k = i-L;
// if Z[k] is less than remaining interval
// then Z[i] will be equal to Z[k].
// For example, str = "ababab", i = 3, R = 5
// and L = 2
if (Z[k] < R-i+1)
Z[i] = Z[k];
// For example str = "aaaaaa" and i = 2, R is 5,
// L is 0
else
{
// else start from R and check manually
L = i;
while (R<n && str[R-L] == str[R])
R++;
Z[i] = R-L;
R--;
}
}
}
}
// Driver program
int main()
{
string text = "GEEKS FOR GEEKS" ;
string pattern = "GEEK" ;
search(text, pattern);
return 0;
}


JAVA

// A Java program that implements Z algorithm for pattern
// searching
class GFG {
//  prints all occurrences of pattern in text using
// Z algo
public static void search(String text, String pattern)
{
// Create concatenated string "P$T"
String concat = pattern + "$" + text;
int l = concat.length();
int Z[] = new int [l];
// Construct Z array
getZarr(concat, Z);
// now looping through Z array for matching condition
for ( int i = 0 ; i < l; ++i){
// if Z[i] (matched region) is equal to pattern
// length we got the pattern
if (Z[i] == pattern.length()){
System.out.println( "Pattern found at index "
+ (i - pattern.length() - 1 ));
}
}
}
// Fills Z array for given string str[]
private static void getZarr(String str, int [] Z) {
int n = str.length();
// [L,R] make a window which matches with
// prefix of s
int L = 0 , R = 0 ;
for ( int i = 1 ; i < n; ++i) {
// if i>R nothing matches so we will calculate.
// Z[i] using naive way.
if (i > R){
L = R = i;
// R-L = 0 in starting, so it will start
// checking from 0'th index. For example,
// for "ababab" and i = 1, the value of R
// remains 0 and Z[i] becomes 0. For string
// "aaaaaa" and i = 1, Z[i] and R become 5
while (R < n && str.charAt(R - L) == str.charAt(R))
R++;
Z[i] = R - L;
R--;
}
else {
// k = i-L so k corresponds to number which
// matches in [L,R] interval.
int k = i - L;
// if Z[k] is less than remaining interval
// then Z[i] will be equal to Z[k].
// For example, str = "ababab", i = 3, R = 5
// and L = 2
if (Z[k] < R - i + 1 )
Z[i] = Z[k];
// For example str = "aaaaaa" and i = 2, R is 5,
// L is 0
else {
// else start from R and check manually
L = i;
while (R < n && str.charAt(R - L) == str.charAt(R))
R++;
Z[i] = R - L;
R--;
}
}
}
}
// Driver program
public static void main(String[] args)
{
String text = "GEEKS FOR GEEKS" ;
String pattern = "GEEK" ;
search(text, pattern);
}
}
// This code is contributed by PavanKoli.


Python3

# Python3 program that implements Z algorithm
# for pattern searching
# Fills Z array for given string str[]
def getZarr(string, z):
n = len (string)
# [L,R] make a window which matches
# with prefix of s
l, r, k = 0 , 0 , 0
for i in range ( 1 , n):
# if i>R nothing matches so we will calculate.
# Z[i] using naive way.
if i > r:
l, r = i, i
# R-L = 0 in starting, so it will start
# checking from 0'th index. For example,
# for "ababab" and i = 1, the value of R
# remains 0 and Z[i] becomes 0. For string
# "aaaaaa" and i = 1, Z[i] and R become 5
while r < n and string[r - l] = = string[r]:
r + = 1
z[i] = r - l
r - = 1
else :
# k = i-L so k corresponds to number which
# matches in [L,R] interval.
k = i - l
# if Z[k] is less than remaining interval
# then Z[i] will be equal to Z[k].
# For example, str = "ababab", i = 3, R = 5
# and L = 2
if z[k] < r - i + 1 :
z[i] = z[k]
# For example str = "aaaaaa" and i = 2,
# R is 5, L is 0
else :
# else start from R and check manually
l = i
while r < n and string[r - l] = = string[r]:
r + = 1
z[i] = r - l
r - = 1
# prints all occurrences of pattern
# in text using Z algo
def search(text, pattern):
# Create concatenated string "P$T"
concat = pattern + "$" + text
l = len (concat)
# Construct Z array
z = [ 0 ] * l
getZarr(concat, z)
# now looping through Z array for matching condition
for i in range (l):
# if Z[i] (matched region) is equal to pattern
# length we got the pattern
if z[i] = = len (pattern):
print ( "Pattern found at index" ,
i - len (pattern) - 1 )
# Driver Code
if __name__ = = "__main__" :
text = "GEEKS FOR GEEKS"
pattern = "GEEK"
search(text, pattern)
# This code is contributed by
# sanjeev2552


C#

// A C# program that implements Z
// algorithm for pattern searching
using System;
class GFG
{
// prints all occurrences of
// pattern in text using Z algo
public static void search( string text,
string pattern)
{
// Create concatenated string "P$T"
string concat = pattern + "$" + text;
int l = concat.Length;
int [] Z = new int [l];
// Construct Z array
getZarr(concat, Z);
// now looping through Z array
// for matching condition
for ( int i = 0; i < l; ++i)
{
// if Z[i] (matched region) is equal
// to pattern length we got the pattern
if (Z[i] == pattern.Length)
{
Console.WriteLine( "Pattern found at index " +
(i - pattern.Length - 1));
}
}
}
// Fills Z array for given string str[]
private static void getZarr( string str,
int [] Z)
{
int n = str.Length;
// [L,R] make a window which
// matches with prefix of s
int L = 0, R = 0;
for ( int i = 1; i < n; ++i)
{
// if i>R nothing matches so we will
// calculate. Z[i] using naive way.
if (i > R)
{
L = R = i;
// R-L = 0 in starting, so it will start
// checking from 0'th index. For example,
// for "ababab" and i = 1, the value of R
// remains 0 and Z[i] becomes 0. For string
// "aaaaaa" and i = 1, Z[i] and R become 5
while (R < n && str[R - L] == str[R])
{
R++;
}
Z[i] = R - L;
R--;
}
else
{
// k = i-L so k corresponds to number
// which matches in [L,R] interval.
int k = i - L;
// if Z[k] is less than remaining interval
// then Z[i] will be equal to Z[k].
// For example, str = "ababab", i = 3,
// R = 5 and L = 2
if (Z[k] < R - i + 1)
{
Z[i] = Z[k];
}
// For example str = "aaaaaa" and
// i = 2, R is 5, L is 0
else
{
// else start from R and
// check manually
L = i;
while (R < n && str[R - L] == str[R])
{
R++;
}
Z[i] = R - L;
R--;
}
}
}
}
// Driver Code
public static void Main( string [] args)
{
string text = "GEEKS FOR GEEKS" ;
string pattern = "GEEK" ;
search(text, pattern);
}
}
// This code is contributed by Shrikant13


Javascript

<script>
// A JavaScript program that implements Z algorithm for pattern
// searching
//  prints all occurrences of pattern in text using
// Z algo
function search(text,pattern)
{
// Create concatenated string "P$T"
let concat = pattern + "$" + text;
let l = concat.length;
let Z = new Array(l);
// Construct Z array
getZarr(concat, Z);
// now looping through Z array for matching condition
for (let i = 0; i < l; ++i){
// if Z[i] (matched region) is equal to pattern
// length we got the pattern
if (Z[i] == pattern.length){
document.write( "Pattern found at index "
+ (i - pattern.length - 1)+ "<br>" );
}
}
}
// Fills Z array for given string str[]
function getZarr(str,Z)
{
let n = str.length;
// [L,R] make a window which matches with
// prefix of s
let L = 0, R = 0;
for (let i = 1; i < n; ++i) {
// if i>R nothing matches so we will calculate.
// Z[i] using naive way.
if (i > R){
L = R = i;
// R-L = 0 in starting, so it will start
// checking from 0'th index. For example,
// for "ababab" and i = 1, the value of R
// remains 0 and Z[i] becomes 0. For string
// "aaaaaa" and i = 1, Z[i] and R become 5
while (R < n && str[R - L] == str[R])
R++;
Z[i] = R - L;
R--;
}
else {
// k = i-L so k corresponds to number which
// matches in [L,R] interval.
let k = i - L;
// if Z[k] is less than remaining interval
// then Z[i] will be equal to Z[k].
// For example, str = "ababab", i = 3, R = 5
// and L = 2
if (Z[k] < R - i + 1)
Z[i] = Z[k];
// For example str = "aaaaaa" and i = 2, R is 5,
// L is 0
else {
// else start from R and check manually
L = i;
while (R < n && str[R - L] == str[R])
R++;
Z[i] = R - L;
R--;
}
}
}
}
// Driver program
let text = "GEEKS FOR GEEKS" ;
let pattern = "GEEK" ;
search(text, pattern);
// This code is contributed by rag2127
</script>


输出:

Pattern found at index 0Pattern found at index 10

本文由 乌特卡什·特里维迪 。如果您发现任何不正确的地方,或者您想分享有关上述主题的更多信息,请发表评论

© 版权声明
THE END
喜欢就支持一下吧
点赞15 分享