首页 > ACM题库 > HDU-杭电 > HDU 3247-Resource Archiver-动态规划-[解题报告]HOJ
2014
03-09

HDU 3247-Resource Archiver-动态规划-[解题报告]HOJ

Resource Archiver

问题描述 :

Great! Your new software is almost finished! The only thing left to do is archiving all your n resource files into a big one.
Wait a minute… you realized that it isn’t as easy as you thought. Think about the virus killers. They’ll find your software suspicious, if your software contains one of the m predefined virus codes. You absolutely don’t want this to happen.
Technically, resource files and virus codes are merely 01 strings. You’ve already convinced yourself that none of the resource strings contain a virus code, but if you make the archive arbitrarily, virus codes can still be found somewhere.
Here comes your task (formally): design a 01 string that contains all your resources (their occurrences can overlap), but none of the virus codes. To make your software smaller in size, the string should be as short as possible.

输入:

There will be at most 10 test cases, each begins with two integers in a single line: n and m (2 <= n <= 10, 1 <= m <= 1000). The next n lines contain the resources, one in each line. The next m lines contain the virus codes, one in each line. The resources and virus codes are all non-empty 01 strings without spaces inside. Each resource is at most 1000 characters long. The total length of all virus codes is at most 50000. The input ends with n = m = 0.

输出:

There will be at most 10 test cases, each begins with two integers in a single line: n and m (2 <= n <= 10, 1 <= m <= 1000). The next n lines contain the resources, one in each line. The next m lines contain the virus codes, one in each line. The resources and virus codes are all non-empty 01 strings without spaces inside. Each resource is at most 1000 characters long. The total length of all virus codes is at most 50000. The input ends with n = m = 0.

样例输入:

2 2
1110
0111
101
1001
0 0

样例输出:

5

题目链接: http://acm.zju.edu.cn/onlinejudge/showProblem.do?problemId=3280  (写这篇文章的时候航掉挂了就贴个浙大的链接)


题目大意: 给定n个代码串,给定m个病毒串,让我们将n个代码串连起来,中间可以重叠,连接起来的代码串不能含病毒串,问最短的的连接字符串长度。n<=10,m<=1000,单代码串长度<=1000,病毒穿总长度<=5万


解题思路:综合题,需要用到AC自动机+状态压缩DP+Spfa。

    总体思路是利用代码串和病毒串建立自动机,他们在自动机上的差别是一个末节点标记,一个不标记。然后将每个代码串尾节点看做图上的一个节点,利用自动机计算每个串其他所有串的不重叠的最短长度即两两节点间的最短距离。最后转变成TSP问题,状态压缩DP解之

    第一眼看到那个n,小等于10,soga,状态压缩,稍微思考下就能将问题转换成这样一个模型:n个串必须都选且选一次,求这n个串的排列使得组合成的串不包含病毒串并且长度最小。啊哈,这不是TSP问题吗?是的,你没有看错,转换成了TSP问题。

   转换成TSP问题之后,我们想的是怎么让长度尽量小,考虑将两个代码串重叠起来。两个代码串a,b的前缀和后缀可能相等,他们组成的最短不包含病毒串的字符串c,前面部分为a,后面部分为b,这时候再来个d代码串要和前面两个合体,那么就成c和d的重叠问题了

   接下来我们要做的怎么让代码串a和代码串b组成的串c长度最小且不包含病毒串呢?我一开始用kmp来找两个串的相等前缀、后缀,然后组成串去ac自动机中匹配。然后一瞬间我就觉得我自己脑残了,这不是让ac自动机退化成kmp和字典树了吗!因为ac自动机上的一个节点到根的路径代表一个字符串,假设串a的末尾节是p,b的末尾节点是q,接着我们要做是在p点利用next数组转移到q,我们得到一个结论:从p到q所走的路径便是b除开与a重叠部分的那个后缀,如a为aaabb,b为bbaaa,那么路经就代表串b的aaa子串。我们怎么保证从p点走到q点,中间走过的路径表示的串一定是串b的后缀呢?两种情况:1、a是b的子串,这时候我们不会用到fail指针,显然可以
2、我们需要用到fail指针,每次用fail指针找到下一个匹配的位置假设是failx,failx节点到根节点所表示的串便是我们走过路径的最长后缀,这样一直找找到节点q,点q到根节点所表示的串遍是我们走过路径的最长后缀,然后上面的结论便得证

    总而言之,我们在ac自动机上走过的路径可以表示一个串,设为S,到达点p,那么点p到根节点这条路径所表示的串s,s为S的后缀。为用路径代表一个串是ac自动机优美之处

    我们从上面说的p点走到q点会有很多路径,要保证走过的路径长度最小即b串于a串的不重叠部分最短,要用到spfa,其实本题就退化成普通的Bfs,因为没有松弛操作。这样得到就可以得到各串相互之间的最短距离,然后就变成了很普通的TSP。

    据说这题的数据很弱,很多代码能ac但都是错的,貌似那些hack数据我的代码都能过…

测试数据:

Input:
3 1
00000
00000
11111
01
3 2
101
010
1111
001
011
2 2
1110
0111
101
1001
3 3
0001
0000
10000
010
101
111
3 3
00000
00000
00000
101
101
101

OutPut:
10
7
5
6
5

C艹代码:

#include <stdio.h>
#include <string.h>
#include <string>
#include <iostream>
#include <algorithm>
using namespace std;
#define MAXNODE 2
#define MAXSIZE 300000
#define INF (1<<29)
#define min(a,b) ((a)<(b)?(a):(b))


int n,m,issub[20],len;
string str[20],virus;


struct ACnode {
	
	int flag,in;
	ACnode *next[MAXNODE],*fail,*father;
};
struct Auto_AC {
	
	int Index[20],ans;
	int dist[100][100],spfaqu[MAXSIZE];
	int temp[MAXSIZE],in[MAXSIZE];
	int total,head,tail,dp[20][2000];
	ACnode *p,*q,*root;
	ACnode *qu[MAXSIZE],tree[MAXSIZE];
	
	
	ACnode *CreateNode() {
		
		tree[total].flag = 0;
		tree[total].in = total;
		tree[total].next[0] = NULL;
		tree[total].next[1] = NULL;
		return &tree[total++];
	}
	void Initial() {
		
		total = 0,ans = INF;
		root = CreateNode();
		root->fail = root;
		for (int i = 0; i < n; ++i) {
		
			for (int j = 0; j < (1<<n); ++j)
				dp[i][j] = INF;
			dp[i][1<<i] = str[i].size();
		}
	}
	int GetHash(char c) {
		
		return c - '0';
	}
	void Insert(string str,int in,int kind) {
		
		p = root;
		for (int i = 0; str[i]; ++i) {
			
			int k = GetHash(str[i]);
			if (p->next[k] == NULL)
				p->next[k] = CreateNode();
			p = p->next[k];
		}
		
		
		if (kind == 1) p->flag = 1;
		else Index[in] = p->in;
	}
	void Build_AC() {
		
		qu[head++] = root;
		while (tail < head) {
			
			p = qu[tail++];
			for (int k = 0; k < MAXNODE; ++k)
				if (p->next[k]) {
					
					if (p == root) p->next[k]->fail = root;
					else p->next[k]->fail = p->fail->next[k];
					qu[head++] = p->next[k];
					p->next[k]->flag |= p->next[k]->fail->flag;
				}
				else {
					
					if (p == root) p->next[k] = root;
					else p->next[k] = p->fail->next[k];
				}
		}
		
	}
	void Spfa_ForDist(int s) {

		int i,j,k,u,v,ndis;
		int head = 0,tail = 0;


		temp[s] = 0;
		spfaqu[head++] = s,in[s] = 1;


		while (tail < head) {

			u = spfaqu[tail++]; in[u] = 0;
			for ( k = 0; k < MAXNODE; ++k)
				if (!tree[u].next[k]->flag) {

					ndis = temp[u] + 1;
					v = tree[u].next[k]->in;
					if (ndis < temp[v]) {

						temp[v] = ndis;
						if (in[v] == 0) 
							spfaqu[head++] = v,in[v] = 1;
					}
				}
		}
	}
	void CountDist() {
			
		int i,j;
		for (i = 0; i < n; ++i) {
		
			for (j = 0; j <= total; ++j)
				in[j] = 0,temp[j] = INF;
			Spfa_ForDist(Index[i]);
			for (j = 0; j < n; ++j)
				dist[i][j] = temp[Index[j]];
		}
	}
	int Solve_DP() {

		int i,j,k,nst,ndis;
		

		for (i = 1; i < (1<<n); ++i)
			for (j = 0; j < n; ++j)
				if (i & (1<<j)) for (k = 0; k < n; ++k)
					if (!(i & (1<<k)) && dist[j][k] != INF) {

						nst = i | (1<<k);
						ndis = dp[j][i] + dist[j][k];
						dp[k][nst] = min(dp[k][nst],ndis);
					}


		for (i = 0; i < n; ++i)
			ans = min(ans,dp[i][(1<<n)-1]);
		return ans;
	}
}AC;
int cmp(string s1,string s2) {
	
	return s1.size() > s2.size();
}
void Solve_Sub() {
	
	int i,j,k = n;
	
	
	sort(str,str+n,cmp);
	memset(issub,0,sizeof(issub));
	for (i = 0; i < n; ++i)
		for (j = i + 1; j <= n; ++j)
			if (str[i].find(str[j]) != -1) 
				issub[j] = 1;
	for (n = 0,i = 0; i < k; ++i)
		if (issub[i] == 0) str[n++] = str[i];
}


int main()
{
	int i,j,k;
	
	
	while (cin>>n>>m,n + m) { 
		
		for (i = 0; i < n; ++i) 
			cin>>str[i];
		Solve_Sub(),AC.Initial();
		for (i = 0; i < n; ++i) 
			AC.Insert(str[i],i,0);
		for (i = 0; i < m; ++i)
			cin>>virus,AC.Insert(virus,i,1);
		
		
		AC.Build_AC();
		AC.CountDist();
		printf("%d\n",AC.Solve_DP());
	}
}

本文ZeroClock原创,但可以转载,因为我们是兄弟。

参考:http://blog.csdn.net/woshi250hua/article/details/8021283


  1. Often We don’t set up on weblogs, but I would like to condition that this established up really forced me individually to do this! considerably outstanding publish