首页 > ACM题库 > HDU-杭电 > HDU 3724-Encoded Barcodes-字典树-[解题报告]HOJ
2015
02-21

HDU 3724-Encoded Barcodes-字典树-[解题报告]HOJ

Encoded Barcodes

问题描述 :

All the big malls need a powerful system for the products retrieval. Now you are employed design a sub-system: reading the barcodes and return the matching products.

A barcode is an optical machine-readable representation of data, which shows certain data on certain products. A barcode consists of a series of bars with different widths. In our system, the barcodes have been scanned and the widths have been recorded. Every consecutive eight bars are considered as representing the ASCII code of a character, each bar for each bit. Ideally, there should be only two kinds of widths in the eight bars, and the width of the wider bar is twice of the narrower. The wider bar indicates 1, while the narrower indicates 0. However, due to the inaccuracy of printing and scanning, there will be an error of at most 5%. That is, if the pretended exact width is x, you may get a value in the range [0.95x, 1.05x].

For example, the width sequence "10.0 20.0 10.0 10.0 10.0 10.0 10.0 20.0" is a valid barcode of our system, and it means (01000001)2, which is (65)10 and the corresponding character is "A". Note that "10.5 20.1 10.1 10.2 9.9 9.7 10.0 19.9" is also a valid barcode representing the same letter.

You are given the names of all the products and many queries. Every name contains lower-case letters only, and the length is no more than 30. The queries are represented as barcodes. For each query, you should decode it to a string S, and report the amount of products whose prefix is S. For the output may be very large, you only need to output the sum of all the queries for each case.

输入:

There are several test cases in the input. The first line of each case contains two integers N and M (1 <= N <= 10000, 1 <= M <= 2000), indicating the number of products and queries. Then N lines follow, indicating the names of the products. Note that the names may be duplicated. Then M query blocks follow. The first line of each query block is an integer K (0 < K <= 30) indicating the length of the query, then K lines follow, each line contains 8 positive float numbers, indicating the barcode for each character.

You can assume that the barcodes are always valid, and always represent lower-case letters.

输出:

There are several test cases in the input. The first line of each case contains two integers N and M (1 <= N <= 10000, 1 <= M <= 2000), indicating the number of products and queries. Then N lines follow, indicating the names of the products. Note that the names may be duplicated. Then M query blocks follow. The first line of each query block is an integer K (0 < K <= 30) indicating the length of the query, then K lines follow, each line contains 8 positive float numbers, indicating the barcode for each character.

You can assume that the barcodes are always valid, and always represent lower-case letters.

样例输入:

4 3
apple
apple
avatar
book
1
1 2 2 1 1 1 1 2
2
1 2 2 1 1 1 1 2
10.1 20.1 19.9 20.0 10.2 9.8 9.9 10.0
1
1 2 2 1 1 1 2 2

样例输出:

5

Hint
There is only one test case. The first query is "a", and the answer is 3. The second query is "ap", and the answer is 2. The third query is "c", and the answer is 0. So the total sum is 3+2+0 = 5.

         Trie模板。把所有单词都用字典树存起来,然后给每个节点赋权值表示前缀到该节点出现了几次。然后直接查询就可以了。

 

#include <algorithm>
#include <iostream>
#include <sstream>
#include <cstdlib>
#include <climits>
#include <cstring>
#include <cstdio>
#include <string>
#include <vector>
#include <cctype>
#include <queue>
#include <cmath>
#include <set>
#include <map>
#define CLR(a, b) memset(a, b, sizeof(a))
using namespace std;

const int MAX_NODE = 10100 * 66;
const int INF = 0x3f3f3f3f;
const int MOD = 20090717;
const int CHILD_NUM = 26;
const int N = 30;

class Trie
{
private:
    int chd[MAX_NODE][CHILD_NUM];
    int val[MAX_NODE];
    int Q[MAX_NODE];
    int ID[128];
    int sz;
public:
    void Initialize()
    {
        for(int i = 0; i < 26; i ++)
            ID[i + 'a'] = i;
    }
    void Reset()
    {
        CLR(chd[0] , 0);sz = 1;
        CLR(val, 0);
    }
    void Insert(char *a)
    {
        int p = 0;
        for ( ; *a ; a ++)
        {
            int c = ID[*a];
            if (!chd[p][c])
            {
                CLR(chd[sz] , 0);
                chd[p][c] = sz ++;
            }
            p = chd[p][c];
            val[p] ++;
        }
    }
    int Work()
    {
        int q, now, S = 0;
        double tmp[10], lg;
        scanf("%d", &q);bool flag = 0;
        for(int i = 0; i < q; i ++)
        {
            lg = 1e18;
            for(int j = 0; j < 8; j ++)
            {
                scanf("%lf", &tmp[j]);
                lg = min(lg, tmp[j]);
            }now = 0;
            for(int j = 0; j < 8; j ++)
            {
                now <<= 1;
                if(tmp[j] > lg * 1.105)
                    now += 1;
            }
            if(!flag)S = chd[S][ID[now]];
            if(!S) flag = 1;
        }
        if(!flag) return val[S];
        else return 0;
    }
} Tr;

char ch[N];

int main()
{
    //freopen("input.txt", "r", stdin);
    Tr.Initialize();
    int n, m, k, ans;
    while (scanf("%d%d", &n, &m) != EOF)
    {
        Tr.Reset();
        for (int i = 0 ; i < n ; i ++)
        {
            char temp[66];
            scanf("%s", temp);
            Tr.Insert(temp);
        }
        ans = 0;
        for(int i = 0; i < m; i ++)
            ans += Tr.Work();
        printf("%d\n", ans);
    }
    return 0;
}

 

 

参考:http://www.cnblogs.com/keanuyaoo/archive/2013/10/10/3362240.html


  1. 算法是程序的灵魂,算法分简单和复杂,如果不搞大数据类,程序员了解一下简单点的算法也是可以的,但是会算法的一定要会编程才行,程序员不一定要会算法,利于自己项目需要的可以简单了解。

  2. 第23行:
    hash = -1是否应该改成hash[s ] = -1

    因为是要把从字符串s的start位到当前位在hash中重置

    修改提交后能accept,但是不修改居然也能accept