首页 > 搜索 > BFS搜索 > hdu 2222 Keywords Search-KMP-[解题报告]C++
2014
01-04

hdu 2222 Keywords Search-KMP-[解题报告]C++

Keywords Search

问题描述 :

In the modern time, Search engine came into the life of everybody like Google, Baidu, etc.
Wiskey also wants to bring this feature to his image retrieval system.
Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched.
To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match.

输入:

First line will contain one integer means how many cases will follow by.
Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000)
Each keyword will only contains characters ‘a’-'z’, and the length will be not longer than 50.
The last line is the description, and the length will be not longer than 1000000.

输出:

First line will contain one integer means how many cases will follow by.
Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000)
Each keyword will only contains characters ‘a’-'z’, and the length will be not longer than 50.
The last line is the description, and the length will be not longer than 1000000.

样例输入:

1
5
she
he
say
shr
her
yasherhs

样例输出:

3

昨晚开始想学AC自动机,然后尽早看了看算导的KMP。。。以前看过,忘差不多了。

今天看了看一个学习AC自动机的文章,磕磕绊绊理解了。

trie的建立很随意,关键是失败指针这个概念。稍微有点抽象。这个指针的建立,用到了BFS,引用那个文章的话

“  假设有一个节点k,他的失败指针指向j。那么k,j满足这个性质:设root到j的距离为n,则从k之上的第n个节点到k所组成的长度为n的单词,与从root到j所组成的单词相同。对于每个节点,我们可以这样处理:设这个节点上的字母为C,沿着他父亲的失败指针走,直到走到一个节点,他的儿子中也有字母为C的节点。然后把当前节点的失败指针指向那个字目也为C的儿子。如果一直走到了root都没找到,那就把失败指针指向root,最开始,我们把root加入队列(root的失败指针显然指向自己),这以后我们每处理一个点,就把它的所有儿子加入队列,直到搞完。”

查找匹配数目的时候,如果发现不匹配的就沿着失败指针寻找,直到到根部为止。

因为失败指针走一次后再走就没有用了,可以标记下,下次就不用走了。

#include <map>
#include <set>
#include <queue>
#include <stack>
#include <math.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <limits.h>
#include <string.h>
#include <string>
#include <algorithm>
#define MID(x,y) ( ( x + y ) >> 1 )
#define L(x) ( x << 1 )
#define R(x) ( x << 1 | 1 )
#define BUG puts("here!!!")
#define STOP system("pause")

using namespace std;

const int MAX_N = 26;
const int MAX_NODE = 500010;

struct NODE{
	int cnt;
	NODE *next[MAX_N], *fail;
	NODE()
	{
		cnt = 0;
		fail = NULL;
		memset(next, 0, sizeof(next));
	}
}*head;

void Build_trie(char *s,NODE *head)
{
	int len = strlen(s);
	for(int i=0; i<len; i++)
	{
		int k = s[i] - 'a';
		if( head->next[k] == NULL )
			head->next[k] = new NODE();
		head = head->next[k];
	}
	head->cnt++;
}

queue<NODE*> q;
void Build_fail(NODE *head)
{
	head->fail = NULL;
	q.push(head);
	while( !q.empty() )
	{
		NODE *now = q.front(); q.pop();
		for(int i=0; i<MAX_N; i++)
			if( now->next[i] )
			{
				NODE *p = now->fail;
				while( p )
				{
					if( p->next[i] )
					{
						now->next[i]->fail = p->next[i];
						break;
					}
					p = p->fail;
				}
				if( p == NULL )
					now->next[i]->fail = head;
				q.push(now->next[i]);
			}
						
	}
}

int AC_find(NODE *head, char *s)
{
	int len = strlen(s), sum = 0;
	NODE* p = head;
	for(int i=0; i<len; i++)
	{
		int k = s[i] - 'a';
		while( p->next[k] == NULL && p != head )
			p = p->fail;
		p = p->next[k] == NULL ? head : p->next[k];
		
		NODE *tmp = p;
		while( tmp != head && tmp->cnt != -1 )
		{
			sum += tmp->cnt;
			tmp->cnt = -1;
			tmp = tmp->fail;
		}
	}
	return sum;
}

char s[1000005];
char ss[100];
int main()
{
	int n, ncases;

	scanf("%d", &ncases);
	
	while( ncases-- )
	{
		head = new NODE();
		scanf("%d", &n);		
		while( n-- )
		{
			scanf("%s", ss);
			Build_trie(ss, head);
		}
		
		Build_fail( head );
		
		scanf("%s", s);
		int sum = AC_find( head, s);
		printf("%d\n", sum);
	}

return 0;
}

解题转自:http://blog.csdn.net/zxy_snow/article/details/6709255


, ,
  1. 有两个重复的话结果是正确的,但解法不够严谨,后面重复的覆盖掉前面的,由于题目数据限制也比较严,所以能提交通过。已更新算法

  2. [email protected]

  3. 约瑟夫也用说这么长……很成熟的一个问题了,分治的方法解起来o(n)就可以了,有兴趣可以看看具体数学的第一章,关于约瑟夫问题推导出了一系列的结论,很漂亮

  4. #include <cstdio>

    int main() {
    //answer must be odd
    int n, u, d;
    while(scanf("%d%d%d",&n,&u,&d)==3 && n>0) {
    if(n<=u) { puts("1"); continue; }
    n-=u; u-=d; n+=u-1; n/=u;
    n<<=1, ++n;
    printf("%dn",n);
    }
    return 0;
    }

  5. 第二个方法挺不错。NewHead代表新的头节点,通过递归找到最后一个节点之后,就把这个节点赋给NewHead,然后一直返回返回,中途这个值是没有变化的,一边返回一边把相应的指针方向颠倒,最后结束时返回新的头节点到主函数。