2013-12-12

HDU 1583 DNA Assembly - Strings - [Solution Report] C++

DNA Assembly

Problem description:

Farmer John has performed DNA sequencing on his prize milk-producing cow, Bessie. DNA sequences are ordered lists (strings) containing the letters ‘A’, ‘C’, ‘G’, and ‘T’.

As is usual for DNA sequencing, the results are a set of strings that are sequenced fragments of DNA, not entire DNA strings. A pair of strings like ‘GATTA’ and ‘TACA’ most probably represent the string ‘GATTACA’ as the overlapping characters are merged, since they were probably duplicated in the sequencing process.

Merging a pair of strings requires finding the greatest overlap between the two and then eliminating it as the two strings are concatenated together. Overlaps are between the end of one string and beginning of another string, NOT IN THE MIDDLE OF A STRING.

By way of example, the strings ‘GATTACA’ and ‘TTACA’ overlap completely. On the other hand, the strings ‘GATTACA’ and ‘TTA’ have no overlap at all, since the matching characters of one appear in the middle of the other, not at one end or the other. Here are some examples of merging strings, including those with no overlap:

  GATTA + TACA -> GATTACA
  TACA + GATTA -> TACAGATTA
  TACA + ACA -> TACA
  TAC + TACA -> TACA
  ATAC + TACA -> ATACA
  TACA + ACAT -> TACAT
Given a set of N (2 <= N <= 7) DNA sequences, all of whose lengths are in the range 1..7, find and print the length of the shortest possible sequence obtainable by repeatedly merging all N strings using the procedure described above. All strings must be merged into the resulting sequence.

Input:

The input consists of multiple test cases.
Each test case :
Line 1: A single integer N

Lines 2..N+1: Each line contains a single DNA subsequence
End of file.

Output:

For each test case, output the length of the shortest possible string obtained by merging the subsequences. It is always possible (and required) to merge all the input strings to obtain this string.

Sample input:

4
GATTA
TAGG
ATCGA
CGCAT

Sample output:

13

Hint
Explanation of the sample: Such string is "CGCATCGATTAGG".

/*
	Author: ACb0y
	Date: 2010-9-05
	Type: force
	ProblemId: hdu 1583 DNA Assembly
	Result: 2919263 2010-09-05 10:02:43 Accepted 1583 281MS 272K 1079 B C++ ACb0y 
*/
#include <cstdio>   // freopen
#include <cstring>  // memset
#include <iostream>
#include <string>
using namespace std;

int n;
int ans;
int d[10];
int vis[10];
string str[10];

// Merge b onto the end of a, removing the longest overlap
// (a suffix of a that is also a prefix of b).
string str_merge(string a, string b)
{
	if (a == "") 
	{
		return b;
	}
	int i;
	int flag = 0;
	int pos;
	int alen = a.length();
	int blen = b.length();
	
	for (i = 1; i <= alen; i++) 
	{
		if (b.substr(0, i) == a.substr(alen - i, i))
		{
			flag = 1;
			pos = i;
		} 
	}
	if (flag) 
	{
		return a + b.substr(pos, blen - pos);
	}
	else 
	{
		return a + b;
	}
}

// Backtracking: enumerate all N! orderings of the strings and
// keep the shortest merged length in ans.
void dfs(int pos)
{
	int i;
	if (pos == n) 
	{
		string temp = "";
		for (i = 0; i < n; i++) 
		{
			temp = str_merge(temp, str[d[i]]);
		}
		if ((int)temp.length() < ans) // cast avoids a signed/unsigned comparison
		{
			ans = temp.length();
		}
	}
	else 
	{
		for (i = 0; i < n; i++) if (!vis[i])
		{
			d[pos] = i;
			vis[i] = 1;
			dfs(pos + 1);
			vis[i] = 0;
		}
	}
}

int main()
{
	int i;
#ifndef ONLINE_JUDGE
	freopen("1583.txt", "r", stdin);
#endif
	while (cin >> n) 
	{
		for (i = 0; i < n; i++) 
		{
			cin >> str[i];
		}
		memset(vis, 0, sizeof(vis));
		ans = 10000;
		dfs(0);
		cout << ans << endl;
	}
	return 0;
}

Solution report reposted from: http://blog.csdn.net/acb0y/article/details/5864236


