2013-12-09

DNA Sorting

One measure of “unsortedness” in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence “DAABEC”, this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence “AACEDGG” has only one inversion (E and D)–it is nearly sorted–while the sequence “ZWQM” has 6 inversions (it is as unsorted as can be–exactly the reverse of sorted).
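The definition above can be checked directly by comparing every pair of positions. Here is a minimal sketch in C++; the function name `countInversions` is made up for illustration and does not come from the problem statement.

```cpp
#include <cstddef>
#include <string>

// Count the pairs (i, j) with i < j and s[i] > s[j].
// countInversions is an illustrative name, not part of the problem statement.
int countInversions(const std::string& s)
{
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = i + 1; j < s.size(); ++j)
            if (s[i] > s[j])
                ++inversions;
    return inversions;
}
```

Called on the strings from the paragraph above, it gives 5 for "DAABEC", 1 for "AACEDGG" and 6 for "ZWQM".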

You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of “sortedness”, from “most sorted” to “least sorted”. All the strings are of the same length.

This problem contains multiple test cases!

The first line of a multiple input is an integer N, then a blank line followed by N input blocks. Each input block is in the format indicated in the problem description. There is a blank line between input blocks.

The output format consists of N output blocks. There is a blank line between output blocks.

The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (1 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.

Output the list of input strings, arranged from “most sorted” to “least sorted”. If two or more strings are equally sorted, list them in the same order they are in the input file.

Sample Input

1

10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT

Sample Output

CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA

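The task: compute the number of inversions of each string and sort the strings from most sorted to least sorted (algebra). Strings with the same inversion count are output in their original order. I got WA (Wrong Answer) several times because I submitted without checking the output carefully, sigh... A plain sort cannot be used here; stable_sort is required. sort does not guarantee the relative order of elements that compare equal, so strings with the same inversion count may come out in any order.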
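The tie-breaking rule can be seen in a small sketch, separate from the full solution. It assumes the inversion counts are already computed. The helper name `sortBySortedness` and the two-letter strings in the usage note are made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Each entry pairs a string with its already computed inversion count.
// sortBySortedness is a hypothetical helper name used only in this sketch.
std::vector<std::string> sortBySortedness(std::vector<std::pair<std::string, int>> items)
{
    // std::stable_sort keeps entries with equal counts in their original
    // relative order; std::sort gives no such guarantee.
    std::stable_sort(items.begin(), items.end(),
                     [](const std::pair<std::string, int>& a,
                        const std::pair<std::string, int>& b) {
                         return a.second < b.second; // compare counts only
                     });
    std::vector<std::string> result;
    for (const auto& item : items)
        result.push_back(item.first);
    return result;
}
```

For the input order ("CA", 1), ("AC", 0), ("TG", 1), ("GT", 0), the result is AC, GT, CA, TG. Within each count the input order survives.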

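#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
struct DNA // one DNA string together with its inversion count
{
    string str; // convenient: std::string takes exactly as much space as it needs
    int count;  // number of inversions in str
} w[1001];
bool comp(DNA x, DNA y) // sort key: fewer inversions first
{
    return x.count < y.count;
}
int main()
{
    int t, s, n, i, j, k;
    cin >> t; // number of input blocks; >> skips the blank lines between them
    while (t--)
    {
        cin >> s >> n; // s = string length, n = number of strings
        for (i = 0; i < n; i++)
        {
            cin >> w[i].str; // C has no string type, so C++ is used
            w[i].count = 0;
            for (j = 0; j <= s - 2; j++) // count inversions by comparing every pair (j, k), j < k
            {
                for (k = j + 1; k <= s - 1; k++)
                {
                    if (w[i].str[j] > w[i].str[k]) w[i].count++;
                }
            }
        }
        stable_sort(w, w + n, comp); // stable: equal counts keep their input order
        for (i = 0; i < n; i++)
        {
            cout << w[i].str << endl;
        }
        if (t) cout << endl; // blank line between output blocks
    }
    return 0;
}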
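The double loop in the solution is quadratic in the string length, which is fine for n <= 50. As an aside, because the alphabet has only four letters, the same count can be obtained in a single pass by tracking how many of each letter have been seen so far. This is a different technique from the one used in the solution. The sketch assumes the input contains only A, C, G and T, and `countInversionsLinear` is a made-up name.

```cpp
#include <string>

// One-pass inversion count for strings over the alphabet {A, C, G, T}.
// Assumption: any character other than A, C or G is treated as T.
int countInversionsLinear(const std::string& s)
{
    int seen[4] = {0, 0, 0, 0}; // how many A, C, G, T have appeared so far
    int inversions = 0;
    for (char c : s)
    {
        int rank = (c == 'A') ? 0 : (c == 'C') ? 1 : (c == 'G') ? 2 : 3;
        // every earlier letter with a larger rank forms an inversion with c
        for (int r = rank + 1; r < 4; ++r)
            inversions += seen[r];
        ++seen[rank];
    }
    return inversions;
}
```

On the sample strings this agrees with the pairwise count: "CCCGGGGGGA" has 9 inversions, "AACATGAAGG" has 10 and "TTTGGCCAAA" has 37.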
