首页 > ACM题库 > HDU-杭电 > hdu 1958 Word Encoding-二分[解题报告]C++
2013
12-26

hdu 1958 Word Encoding-二分[解题报告]C++

Word Encoding

问题描述 :

In any language, certain combinations of letters do not appear (or at least appear so seldom that they can be considered non-existent). For instance, there are no English words containing the three letter combination buv as a substring. Given a list of letter combinations that do not exist, the number of possible “words” in a language can be reduced a lot (a “word” here means any combination of letters that doesn’t contain any of the given letter combinations as a substring). If we order all such words by increasing length, ordering words of the same length alphabetically, we can enumerate them starting from 1. Assume that the alphabet always consists of the lower case letters ’a’ to ’z’.

For instance, if the list only contains the combinations q, ab and aaa, the words would be
enumerated like this:
1. a
2. b

16. p
17. r

26. aa
27. ac

649. zz
650. aac
Given the list of letter combinations, write a program that for a given word outputs its number, and for a given number ouputs its word. You can assume that none of the words will exceed 20 characters and no number will be greater than 2 000 000 000 (for both input and output).

输入:

The input will contain several test cases. The number of test cases T appears on a line by itself. Then follow T test cases. Each test case starts with a line containing two integers, N (the number of letter combinations, non-negative, at most 1 000) and M (the number of queries for this list, positive, at most 100). Then follow N lines, each containing a lower case letter combination (between 1 and 3 letters, inclusive). After that follow M lines, each containing either a positive integer or a lower case word. If it’s a word, it will not contain any of the combinations of letters in the list for this test case. If it’s a number, it will not be greater than the number of words in the language.

输出:

The input will contain several test cases. The number of test cases T appears on a line by itself. Then follow T test cases. Each test case starts with a line containing two integers, N (the number of letter combinations, non-negative, at most 1 000) and M (the number of queries for this list, positive, at most 100). Then follow N lines, each containing a lower case letter combination (between 1 and 3 letters, inclusive). After that follow M lines, each containing either a positive integer or a lower case word. If it’s a word, it will not contain any of the combinations of letters in the list for this test case. If it’s a number, it will not be greater than the number of words in the language.

样例输入:

2
3 4
q
ab
aaa
16
r
27
aac
7 2
a
b
c
d
ef
ghi
ijk
102345678
ksvfuw

样例输出:

p
17
ac
650
xexgun
39174383


长度相同时按字典序从小到大排列,字典中的单词最多20 0000 0000。求 给出一个满足题目限制条件的单词,输出其在该字典中的位置,或者给出字典中的合法位置, 输出对应位置的单词。
        
      
算法: DP做预处理 + 二分
比赛的时候这道题目想了半个小时,主要的思路是在组合数学上,但是难点在于字典中的单词有一个巨恶心的限制条件: 不能包含所给出的任意单词。 半个小时未果,也就没有在深入思考了。
              
整理整理这道题目的思路:
其实输出只有一个,对于给定一个合法位置,如何正确的得到字典中该位置的单词?
接着分析得出: 对于已经给定的限制条件, 字典已经唯一确定了,里面的单词就位了。
如果限制单词中有 a, 那么字典中永远不会出现a.
f[k][i][j]: 表示以ij开始长度为k的单词的个数.
递推公式:
f[k][i][j] = sum{ f[k-1][j][z] (0<=z < 27) }.
Cnum[i][0] = 表示字典中的位置.
Cnum[i][1] = 表示该位置的单词的长度.
Cnum[i][2] = 表示第一个字母.
Cnum[i][3] = 表示第二个字母.
                
对于输出可以根据Cnum进行构造.

// 代码
void findS(int id, char *s){
int i = 0, j;
while (cnum[i][0] < id) ++i;
if (i) id -= cnum[i-1][0];
int a = cnum[i][2];
int b = cnum[i][3];
FORD (l, cnum[i][1], 1)
{
sum = 0;
for (j = 0; j < 27; ++j)
{
// abi 不是非法的
if (!g[a][b][j])
{
sum += f[l-1][b][j];
if (sum >= id)
{
id -= sum – f[l-1][b][j];
break;
}
}
}
s[cnum[i][1] – l] = a+’a'-1;
a = b;
b = j;
}
s[cnum[i][1]] = ‘\0′;
}
          
如果给出一个单词,求其在字典中的位置,二分位置,调用findS(mid, ss)得到该位置所对应的单词,在与目标单词进行对比,直到找到该单词的正确位置。


  1. #include <cstdio>
    #include <cstring>

    const int MAXSIZE=256;
    //char store[MAXSIZE];
    char str1[MAXSIZE];
    /*
    void init(char *store) {
    int i;
    store['A']=’V', store['B']=’W',store['C']=’X',store['D']=’Y',store['E']=’Z';
    for(i=’F';i<=’Z';++i) store =i-5;
    }
    */
    int main() {
    //freopen("input.txt","r",stdin);
    //init(store);
    char *p;
    while(fgets(str1,MAXSIZE,stdin) && strcmp(str1,"STARTn")==0) {
    if(p=fgets(str1,MAXSIZE,stdin)) {
    for(;*p;++p) {
    //*p=store[*p]
    if(*p<’A’ || *p>’Z') continue;
    if(*p>’E') *p=*p-5;
    else *p=*p+21;
    }
    printf("%s",str1);
    }
    fgets(str1,MAXSIZE,stdin);
    }
    return 0;
    }

  2. #!/usr/bin/env python
    def cou(n):
    arr =
    i = 1
    while(i<n):
    arr.append(arr[i-1]+selfcount(i))
    i+=1
    return arr[n-1]

    def selfcount(n):
    count = 0
    while(n):
    if n%10 == 1:
    count += 1
    n /= 10
    return count

  3. 5.1处,反了;“上一个操作符的优先级比操作符ch的优先级大,或栈是空的就入栈。”如代码所述,应为“上一个操作符的优先级比操作符ch的优先级小,或栈是空的就入栈。”

  4. 这道题目虽然简单,但是小编做的很到位,应该会给很多人启发吧!对于面试当中不给开辟额外空间的问题不是绝对的,实际上至少是允许少数变量存在的。之前遇到相似的问题也是恍然大悟,今天看到小编这篇文章相见恨晚。

  5. 学算法中的数据结构学到一定程度会乐此不疲的,比如其中的2-3树,类似的红黑树,我甚至可以自己写个逻辑文件系统结构来。

  6. Thanks for using the time to examine this, I truly feel strongly about it and enjoy finding out far more on this subject matter. If achievable, as you achieve knowledge