首页 > ACM题库 > LeetCode > LeetCode-Word Break
2014
07-21

LeetCode-Word Break

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

For example, given
s = "leetcode",
dict = ["leet", "code"].

Return true because "leetcode" can be segmented as "leet code".

题目链接:https://oj.leetcode.com/problems/word-break/

分析:可以使用回溯法求解,每次从dict中查找是否有匹配的单词(下面用java实现的,调用startsWith函数),有的话就将字符串s前面的部分截取掉。从而缩小s的范围,直到s为空说明匹配成功。但是这种方法效率太低,运行超时。

下面是超时的代码:

public class Solution {
    public static boolean wordBreak(String s, Set<String> dict) {
		 if(s==null || s.length()==0) return false;
		 return find(s,dict);
	  }
	 public static boolean find(String s, Set<String> set){
		 if(s.length() == 0) return true;
		 boolean find = false;
		 for(String str:set){
			 if(s.startsWith(str)){
				 find =  find(s.substring(str.length()), set);
			 }
			 if(find) return true;
		 }
		 return false;
	 }
}

可以利用记忆化存储的方式改进上面的算法,因为s会有很多重复的字符被重复计算。利用数组dp[][]记录下面字串的匹配结果为 true或false。初始化为0,表示没有计算过,dp[i][j] = 1表示 字串(i,j)的匹配结果为true,dp[i][j] = -1表示匹配结果为false。改进代码如下,accepted,其实也就是动态规划的思想,也可以用自底向上打表的方式进行优化。

public class WordBreak {
		static int dp[][];
		static String string;
	 public static boolean wordBreak(String s, Set<String> dict) {
		 if(s==null || s.length()==0) return false;
		 dp = new int[s.length()+1][s.length()+1];
		 string = s;
		 return find(0, s.length() ,dict);
	  }
	 public static boolean find(int s, int e, Set<String> set){
		 if(dp[s][e] == -1) return false;
		 if( s == e || dp[s][e] == 1) return true;
		 String tmpstr = string.substring(s,e);
		 boolean find = false;
		 for(String str:set){
			 if(tmpstr.startsWith(str)){
				 find =  find(s+str.length(), e, set);
			 }
			 if(find){
				dp[s][e] = 1;
				 return true;
			 }
		 }
		 dp[s][e] = -1;
		 return false;
	 }
	 public static void main(String[] args) {
		Set<String> dict = new HashSet<String>();
		String str[] = {"a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"};
		for(int i=0; i<str.length; i++) dict.add(str[i]);
		String s = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab";
		System.out.println(wordBreak(s, dict));
	}
}

上面的解法依然不够简洁,突然发现使用二维数组是没有必要的。可以使用一个数组 dp[i]表示 str[0.... i-1]能否成功分词。dp[0]初始化为true.

public class Solution {
    public boolean wordBreak(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length()+1];
        t[0] = true; //set first to be true, why?
        //Because we need initial state

        for(int i=0; i<s.length(); i++){
            //should continue from match position
            if(!t[i]) 
                continue;

            for(String a: dict){
                int len = a.length();
                int end = i + len;
                if(end > s.length())
                    continue;

                if(t[end]) continue;

                if(s.substring(i, end).equals(a)){
                    t[end] = true;
                }
            }
        }

        return t[s.length()];
    }
}

这个链接可以参考其它的解决方案:http://blog.csdn.net/doc_sgl/article/details/12323015

参考:http://www.programcreek.com/2012/12/leetcode-solution-word-break/


  1. [email protected]