2014-02-23

# Match Maker

In computer science, pattern matching is the act of checking whether a given sequence conforms to (matches) a given pattern. Patterns are usually specified using a language based on regular expressions. In this problem, we'll use a simple regular expression to express patterns on sequences of decimal digits. A pattern is a sequence of one or more decimal digits '0'…'9', asterisks '*', and hash signs '#'. A '*' denotes a sequence of an even number of digits, whereas a '#' denotes a sequence of an odd number of digits. For example, the pattern "129" only matches the sequence 129. The pattern "1*3" matches all sequences beginning with 1, ending with 3, and having an even number of decimal digits between the first and last digits. As another example, the pattern "#55" matches the sequences 155, 12355, 1234555, but none of the sequences 55, 1255, 123455. Your task is to write a program to find if a given sequence matches a given pattern.
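
To make the matching rules concrete, here is a minimal reference matcher. It is only a sketch for checking small cases, not the solution used further down: it runs in time proportional to pattern length times sequence length, which is too slow for 100,000-character lines. The function name `matchesPattern` is hypothetical, and the sketch assumes that '*' may stand for zero digits (zero being even).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `seq` (decimal digits only) matches `pattern`.
// reach[j] is true when the pattern elements processed so far can consume
// exactly the first j characters of `seq`.
bool matchesPattern(const std::string& pattern, const std::string& seq) {
    const std::size_t n = seq.size();
    std::vector<char> reach(n + 1, 0);
    reach[0] = 1;
    for (char c : pattern) {
        std::vector<char> next(n + 1, 0);
        if (c == '*' || c == '#') {
            // seen[q] records whether some reachable position j <= k has parity q.
            bool seen[2] = {false, false};
            const std::size_t shift = (c == '#') ? 1 : 0;
            for (std::size_t k = 0; k <= n; ++k) {
                if (reach[k]) seen[k % 2] = true;
                // '*': same parity as k (even gap, possibly empty).
                // '#': opposite parity (odd gap, so at least one digit).
                next[k] = seen[(k + shift) % 2];
            }
        } else {
            // A literal digit must equal the next character of the sequence.
            for (std::size_t j = 0; j < n; ++j) {
                if (reach[j] && seq[j] == c) next[j + 1] = 1;
            }
        }
        reach.swap(next);
    }
    return reach[n] != 0;
}
```

Calling `matchesPattern("#55", "12355")` should report a match, while `matchesPattern("#55", "1255")` should not, in line with the examples in the statement.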

Your program will be tested on one or more data sets. Each data set contains a single pattern and one or more sequences to match. The first line of each data set specifies the pattern, and the remaining lines specify the sequences to match against that pattern. The end of a data set (except the last) is identified by the word "END" (without the double quotes). The end of the last data set is identified by the word "QUIT". All lines are 100,000 characters long or shorter.

129
1299
129
1129
END
1*3
123
1223
END
#55
155
12355
55
1255
QUIT

1.1. not
1.2. match
1.3. not
2.1. not
2.2. match
3.1. match
3.2. match
3.3. not
3.4. not

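#include <cstdio>
#include <cstring>
using namespace std;

// The hashes rely on wrap-around modulo 2^64, so an unsigned type is used
// (signed overflow is undefined behaviour in C++).
typedef unsigned long long ull;

#define N 105000

// One parsed pattern element:
//   sign ==  1 : '#' (odd number of digits)
//   sign ==  0 : '*' (even number of digits)
//   sign == -1 : a maximal run of literal digits, with its length and hash
struct Obj
{
    int sign, len;
    ull val;
} obj[N];

ull p[N], ha[N]; // p[i] = 133^i, ha[i] = hash of the first i characters
int tot;         // number of parsed pattern elements

// Split the pattern into wildcards and maximal runs of literal digits.
void init(char *s)
{
    int len = strlen(s);
    tot = 0;
    for (int i = 0, nxt = 0; i < len; i = nxt)
    {
        if (s[i] == '#') obj[tot++].sign = 1, nxt = i + 1;
        else if (s[i] == '*') obj[tot++].sign = 0, nxt = i + 1;
        else
        {
            ull t = 0;
            while (nxt < len && s[nxt] != '#' && s[nxt] != '*')
            {
                t = t * 133 + s[nxt];
                nxt++;
            }
            obj[tot].sign = -1;
            obj[tot].val = t;
            obj[tot++].len = nxt - i;
        }
    }
}

// The sequence is stored in s[1..len] (1-based, to simplify prefix hashes).
bool solve(char *s)
{
    int len = strlen(s + 1);
    ha[0] = 0;
    for (int i = 1; i <= len; ++i) ha[i] = ha[i - 1] * 133 + s[i];
    int cur = 1; // next unmatched position in the sequence
    for (int i = 0; i < tot && cur <= len + 1; ++i)
    {
        if (obj[i].sign == 1) cur++; // '#': consume its one mandatory digit
        else if (obj[i].sign < 0)
        {
            if (i == 0)
            {
                // A leading literal must be a prefix of the sequence.
                if (obj[i].len > len || ha[obj[i].len] != obj[i].val) return false;
                // A pattern that is a single literal must cover the whole sequence.
                if (tot == 1) return obj[i].len == len;
                cur += obj[i].len;
                continue;
            }
            if (i == tot - 1)
            {
                // A trailing literal must be a suffix, and the gap in front of
                // it must contain an even number of extra digits.
                if (cur + obj[i].len > len + 1) return false;
                if ((len + 1 - (cur + obj[i].len)) % 2) return false;
                cur = len + 1 - obj[i].len;
                return ha[cur + obj[i].len - 1] - ha[cur - 1] * p[obj[i].len] == obj[i].val;
            }
            // A middle literal: take the earliest occurrence, moving in steps
            // of 2 so that the parity required by the wildcards is preserved.
            while (cur <= len + 1 && cur + obj[i].len <= len + 1)
            {
                if (ha[cur + obj[i].len - 1] - ha[cur - 1] * p[obj[i].len] == obj[i].val) break;
                cur += 2;
            }
            if (cur > len + 1 || cur + obj[i].len > len + 1) return false;
            cur += obj[i].len;
        }
    }
    // The pattern ended with a wildcard: the leftover digits must be an even count.
    --cur;
    return cur <= len && (len - cur) % 2 == 0;
}

char s[N];

int main()
{
    p[0] = 1;
    for (int i = 1; i < N; ++i) p[i] = p[i - 1] * 133;
    int ncase1 = 0, ncase2 = 0; // data set number, sequence number within it
    while (scanf("%s", s) == 1 && s[0] != 'Q')
    {
        init(s);
        ncase1++; ncase2 = 0;
        while (scanf("%s", s + 1) == 1 && s[1] != 'E' && s[1] != 'Q')
        {
            printf("%d.%d. ", ncase1, ++ncase2);
            if (solve(s)) printf("match\n");
            else printf("not\n");
        }
        if (s[1] == 'Q') break;
    }
    return 0;
}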
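
The solution above compares substrings through prefix hashes rather than character by character. As a short derivation of the expression it uses, with the base 133 read from the code and assuming the arithmetic wraps modulo $2^{64}$ (which in C++ requires an unsigned 64-bit type):

```latex
H_0 = 0, \qquad H_i = 133\,H_{i-1} + s_i \pmod{2^{64}}

\operatorname{hash}(s_l \dots s_r) = H_r - H_{l-1}\cdot 133^{\,r-l+1} \pmod{2^{64}}
```

In the code, $H_i$ is `ha[i]` and $133^k$ is `p[k]`, so a literal run of length `obj[i].len` starting at `cur` is compared against `obj[i].val` in constant time. Equal hashes do not strictly prove equal substrings, so the check is probabilistic.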

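1. Here is a set of data for you: 29 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 1000. The input is still very small, yet the running time is not short. This method does work, and there may well be other optimizations, but an optimization can only target certain data; it is unlikely to produce the answer in acceptable time in every case.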