首页 > ACM题库 > HDU-杭电 > Hdu 1483 Automatic Correction of Misspellings-字符串处理[解题报告] C++
2013
12-11

Hdu 1483 Automatic Correction of Misspellings-字符串处理[解题报告] C++

Automatic Correction of Misspellings

问题描述 :

Some text editors offer a feature to correct words which seem to be written incorrectly. In this problem you are asked to implement a simple Automatic Correction of Misspellings (ACM).

ACM takes care of the following misspellings of words:

1.One letter is missing (e.g., letter is written leter) or too much (e.g., letter is written lettter).
2.One letter is wrong (e.g., letter is written ketter)
3.The order of two adjacent letters is wrong (e.g., letter is written lettre)

ACM is based on a dictionary of known words. When a text contains a word which is not in the dictionary, ACM will try to replace it by a similar word of the dictionary. Two words are similar if we can transform one word into the other by doing exactly one of the misspellings listed above. An unknown word is left unchanged if there is no similar word in the dictionary.

输入:

The first line of the input file will give the number n of words in the dictionary (n ≤ 10000). The next n lines contain the dictionary words. The following line contains an integer q ≤ 1000, the number of query words. The next q lines contain the query words. You may assume that each word in the input consists of 1 to 25 lower case letters (‘a’ to ‘z’).

输出:

For each query word, print one line with the query word followed by one of the following possibilities:

1.is correct, if the word occurs in the dictionary.
2.is a misspelling of <x>, where <x> is a word of the dictionary similar to the query word, and the query word is not in the dictionary. In the case that there are several possibilities, select the word from the dictionary which appeared earlier in the input.
3.is unknown, if cases 1 and 2 do not apply.

样例输入:

10
this
is
a
dictionary
that
we
will
use
for
us
6
su
as
the
dictonary
us
willl

样例输出:

su is a misspelling of us
as is a misspelling of is
the is unknown
dictonary is a misspelling of dictionary
us is correct
willl is a misspelling of will


wrong了5次 最后一次太激动了.忘记改一个数字,就多wrong了一次.
题目意思是:
给你N个准确的单词
再判断M个单词中的每个单词s的状态;
1:如果s在N个单词中有,就是正确的
2:如果s没在N个单词里,但是与M个单词相近
相近的原则是: 在长度相等的时候只能有一个字母不一样如(letter ketter);
在长度相等的时候最多两个字母交换位置 如(letter lettre)
长度不等时,长度最多只能相差一,而且较短的那个必须是长的那个的子序列.如(letter lettr)
3 如果满足2中的多个条件,那么输入N个单词中最先出现的那个.
————–
下面是代码;
主要是 Cval这个函数,在满足1条件是 返回0,满足2条件是 返回1,再不行 就返回2.
时间是 468毫秒…不过有人写出46毫秒的= =很牛..
————-
给几个测试数据:
5
abcd
ab
pq
xp
y
6
abce
a
xq
x
xp
c
—————————

#include<iostream>
using namespace std;
struct NODE
{
 char s[26];
 char t[26];//ji de '?'-1
}; 
struct NODE2
{
 char s[26];
 char t[26];
 int val;
};
int main()
{
 bool mainn();
 while(mainn());
 return 0;
}
int N,M;
NODE node[100001];
NODE2 tcin;
int ptr;
int tval;
int la;
inline f(const char s[],char t[])
{
 int i;
 for(i=0;s[i]!='\0';i++)
 {
  t[s[i]-'a']++;
 }
}
inline Cval(NODE2 tcin,int id)
{
 int i,j;
 int t=0;
 if(!strcmp(tcin.s,node[id].s))return 0;//letter letter
 int lb=strlen(node[id].s);
 if(la==lb)
 {
  t=0;//lettre letter 
  if(!memcmp(tcin.t,node[id].t,sizeof(char)*26))
  {
   for(i=0;i<la;i++)
   {
    if(tcin.s[i]!=node[id].s[i])
    {
     t++;
     if(t>2)
      break;
    }
   }
   if(t==2)return 1;
  }
  t=0;//ketter letter 
  for(i=0;i<la;i++)
  {
   if(tcin.s[i]!=node[id].s[i])
   {
    t++;
    if(t>1)break;
   }
  }
  if(t==1)return 1;
 }
 
 if(abs(la-lb)==1)
 {
  if(la<lb)//leter letter 
  {
   t=0;
   for(i=0,j=0;i<la && j<lb;i++,j++)
   {
    if(tcin.s[i]==node[id].s[j])continue;
    i--;t++;
    if(t>2)
     break;
   }
   if(t<2)return 1;
  }
  else if(la>lb)//lettter letter
  {
   t=0;
   for(i=0,j=0;i<la && j<lb;i++,j++)
   {
    if(tcin.s[i]==node[id].s[j])continue;
    j--;t++;
    if(t>2)
     break;
   }
   if(t<2)return 1;
  }
 }
 return 2; 

}
bool mainn()
{
 int i,j,flag;
 if(EOF==scanf("%d",&N))return false;
 for(i=0;i<N;i++)
 {
  scanf("%s",node[i].s);
  memset(node[i].t,0,sizeof(char)*26);
  f(node[i].s,node[i].t);
 }
 scanf("%d",&M);
 for(i=0;i<M;i++)
 {
  scanf("%s",tcin.s);
  memset(tcin.t,0,sizeof(char)*26);
  tcin.val=2;
  ptr=-1;
  f(tcin.s,tcin.t);
  la=strlen(tcin.s);
  for(j=0;j<N;j++)
  {
   tval=Cval(tcin,j);
   if(tval<tcin.val)
   {
    tcin.val=tval;
    ptr=j;
    if(tcin.val==0)
    {
     break;
    }
   }
  }
  if(tcin.val==2)
  {
   printf("%s is unknown\n",tcin.s);
   continue;
  }
  if(tcin.val==1)
  {
   printf("%s is a misspelling of %s\n",tcin.s,node[ptr].s);
   continue;
  }
  if(tcin.val==0)
  {
   printf("%s is correct\n",tcin.s);
   continue;
  }
 }
 return true;
}

 


  1. if(j){
    int ans=a ;
    for(int x=j-1;x>=0;x–){
    if(!a ) break;
    ans=min(ans,a );
    sum+=ans;
    }
    }
    求解释,,dp的思路是什么呢?

  2. 可以参考算法导论中的时间戳。就是结束访问时间,最后结束的顶点肯定是入度为0的顶点,因为DFS要回溯

  3. 题目需要求解的是最小值,而且没有考虑可能存在环,比如
    0 0 0 0 0
    1 1 1 1 0
    1 0 0 0 0
    1 0 1 0 1
    1 0 0 0 0
    会陷入死循环