首页 > ACM题库 > HDU-杭电 > hdu 2914 Search of Concatenated Strings[解题报告]hoj
2014
02-21

hdu 2914 Search of Concatenated Strings[解题报告]hoj

Search of Concatenated Strings

问题描述 :

The amount of information on the World Wide Web is growing quite rapidly. In this information explosion age, we must survive by accessing only the Web pages containing information relevant to our own needs. One of the key technologies for this purpose is keyword search. By using well-known search engines, we can easily access those pages containing useful information about the topic we want to know.

There are many variations in keyword search problems. If a single string is searched in a given text, the problem is quite easy. If the pattern to be searched consists of multiple strings, or is given by some powerful notation such as regular expressions, the task requires elaborate algorithms to accomplish efficiently.

In our problem, a number of strings (element strings) are given, but they are not directly searched for. Concatenations of all the element strings in any order are the targets of the search here.

For example, consider three element strings aa, b and ccc are given. In this case, the following six concatenated strings are the targets of the search, i.e. they should be searched in the text.

aabccc
aacccb
baaccc
bcccaa
cccaab
cccbaa

The text may contain several occurrences of these strings. You are requested to count the number of occurrences of these strings, or speaking more precisely, the number of positions of occurrences in the text.

Two or more concatenated strings may be identical. In such cases, it is necessary to consider subtle aspects of the above problem statement. For example, if two element strings are x and xx, the string xxx is an occurrence of both the concatenation of x and xx and that of xx and x.

Since the number of positions of occurrences should be counted, this case is counted as one, not two.

Two occurrences may overlap. For example, the string xxxx has occurrences of the concatenation xxx in two different positions. This case is counted as two.

输入:

The input consists of a number of datasets, each giving a set of element strings and a text. The format of a dataset is as follows.
n m
e1
e2

en
t1
t2

tm
The first line contains two integers separated by a space. n is the number of element strings. m is the number of lines used to represent the text. n is between 1 and 12, inclusive.

Each of the following n lines gives an element string. The length (number of characters) of an element string is between 1 and 20, inclusive.

The last m lines as a whole give the text. Since it is not desirable to have a very long line, the text is separated into m lines by newlines, but these newlines should be ignored. They are not parts of the text. The length of each of these lines (not including the newline) is between 1 and 100, inclusive. The length of the text is between 1 and 5000, inclusive.

The element strings and the text do not contain characters other than lowercase letters.The end of the input is indicated by a line containing two zeros separated by a space.

CAUTION! Although the sample input contains only small datasets, note that 12! × 5000 is far larger than 2^31.

输出:

The input consists of a number of datasets, each giving a set of element strings and a text. The format of a dataset is as follows.
n m
e1
e2

en
t1
t2

tm
The first line contains two integers separated by a space. n is the number of element strings. m is the number of lines used to represent the text. n is between 1 and 12, inclusive.

Each of the following n lines gives an element string. The length (number of characters) of an element string is between 1 and 20, inclusive.

The last m lines as a whole give the text. Since it is not desirable to have a very long line, the text is separated into m lines by newlines, but these newlines should be ignored. They are not parts of the text. The length of each of these lines (not including the newline) is between 1 and 100, inclusive. The length of the text is between 1 and 5000, inclusive.

The element strings and the text do not contain characters other than lowercase letters.The end of the input is indicated by a line containing two zeros separated by a space.

CAUTION! Although the sample input contains only small datasets, note that 12! × 5000 is far larger than 2^31.

样例输入:

3 1
aa
b
ccc
aabccczbaacccbaazaabbcccaa
3 1
a
b
c
cbbcbcbabaacabccaccbaacbccbcaaaccccbcbcbbcacbaacccaccbbcaacbbabbabaccc
3 4
aaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
0 0

样例输出:

5
12
197

/* ***********************************************
Author        :kuangbin
Created Time  :2013-9-28 10:05:55
File Name     :E:\2013ACM练习\专题强化训练\图论一\LightOJ1291.cpp
************************************************ */

#include <stdio.h>
#include <string.h>
#include <iostream>
#include <algorithm>
#include <vector>
#include <queue>
#include <set>
#include <map>
#include <string>
#include <math.h>
#include <stdlib.h>
#include <time.h>
using namespace std;

const int MAXN = 10010;
const int MAXM = 40010;
struct Edge
{
	int to,next;
	bool cut;
}edge[MAXM];
int head[MAXN],tot;
int Low[MAXN], DFN[MAXN], Stack[MAXN], Belong[MAXN];
int Index ,top;
int block;
bool Instack[MAXN];
int bridge;
void addedge(int u,int v)
{
	edge[tot].to = v; edge[tot].next = head[u]; edge[tot].cut = false;
	head[u] = tot++;
}
void Tarjan(int u,int pre)
{
	int v;
	Low[u] = DFN[u] = ++Index;
	Stack[top++] = u;
	Instack[u] = true;
	int pre_num = 0;
	for(int i = head[u]; i != -1;i = edge[i].next)
	{
		v = edge[i].to;
		if(v == pre && pre_num == 0){pre_num++;continue;}
		if(!DFN[v])
		{
			Tarjan(v,u);
			if(Low[u] > Low[v])Low[u] = Low[v];
			if(Low[v] > DFN[u])
			{
				bridge++;
				edge[i].cut = true;
				edge[i^1].cut = true;
			}
		}
		else if(Instack[v] && Low[u] > DFN[v])
			Low[u] = DFN[v];
	}
	if(Low[u] == DFN[u])
	{
		block++;
		do
		{
			v = Stack[--top];
			Instack[v] = false;
			Belong[v] = block;
		}
		while( v != u);
	}
}
void init()
{
	tot = 0;
	memset(head,-1,sizeof(head));
}
int du[MAXN];
int solve(int n)
{
	memset(DFN,0,sizeof(DFN));
	memset(Instack,false,sizeof(Instack));
	Index = top = block = 0;
	Tarjan(0,0);
	int ans = 0;
	memset(du,0,sizeof(du));
	for(int i = 0;i < n;i++)
		for(int j = head[i];j != -1;j = edge[j].next)
			if(edge[j].cut)
				du[Belong[i]]++;
	for(int i = 1;i <= block;i++)
		if(du[i] == 1)
			ans++;
	return (ans+1)/2;
}

int main()
{
    //freopen("in.txt","r",stdin);
    //freopen("out.txt","w",stdout);
    int T;
	int iCase = 0;
	int n,m;
	scanf("%d",&T);
	int u,v;
	while(T--)
	{
		iCase++;
		init();
		scanf("%d%d",&n,&m);
		while(m--)
		{
			scanf("%d%d",&u,&v);
			addedge(u,v);
			addedge(v,u);
		}
		printf("Case %d: %d\n",iCase,solve(n));
	}
    return 0;
}

  1. 网站做得很好看,内容也多,全。前段时间在博客园里看到有人说:网页的好坏看字体。觉得微软雅黑的字体很好看,然后现在这个网站也用的这个字体!nice!

  2. 给你一组数据吧:29 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 1000。此时的数据量还是很小的,耗时却不短。这种方法确实可以,当然或许还有其他的优化方案,但是优化只能针对某些数据,不太可能在所有情况下都能在可接受的时间内求解出答案。