Post 86

寒蝉凄切,对长亭晚,骤雨初歇。

[USACO14FEB]自动完成Auto-complete

题目描述

Bessie the cow has a new cell phone and enjoys sending text messages, although she keeps making spelling errors since she has trouble typing on such a small screen with her large hooves. Farmer John has agreed to help her by writing an auto-completion app that takes a partial word and suggests how to complete it.

The auto-completion app has access to a dictionary of W words, each consisting of lowercase letters in the range a..z, where the total number of letters among all words is at most 1,000,000. The app is given as input a list of N partial words (1 <= N <= 1000), each containing at most 1000 lowercase letters. Along with each partial word i, an integer K_i is also provided, such that the app must find the (K_i)th word in alphabetical order that has partial word i as a prefix. That is, if one ordered all of the valid completions of the ith partial word, the app should output the completion that is (K_i)th in this sequence.

贝西新买了手机,打字不方便,请设计一款应用,帮助她快速发消息。

字典里有W(W<=30000)个小写字母构成的单词,所有单词的字符总数量不超过1,000,000,这些单词是无序的。现在给出N(1 <= N <= 1000)个询问,每个询问i包含一个的字符串s_i(每个字符串最多包含1000个字符)和一个整数K_i,对于所有以s_i为前缀的单词,其中按字典序排序后的第K_i个单词,求该单词在原字典里的序号。

输入输出格式

输入格式:

* Line 1: Two integers: W and N.

* Lines 2..W+1: Line i+1: The ith word in the dictionary.

* Lines W+2..W+N+1: Line W+i+1: A single integer K_i followed by a partial word.

输出格式:

* Lines 1..N: Line i should contain the index within the dictionary

(an integer in the range 1..W) of the (K_i)th completion (in

alphabetical order) of the ith partial word, or -1 if there

are less than K_i completions.

输入输出样例

输入样例#1:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
10 3
dab
ba
ab
daa
aa
aaa
aab
abc
ac
dadba
4 a
2 da
4 da

输出样例#1:

1
2
3
3
1
-1

说明

OUTPUT DETAILS:

The completions of a are {aa,aaa,aab,ab,abc,ac}. The 4th is ab, which

is listed on line 3 of the dictionary. The completions of da are

{daa,dab,dadba}. The 2nd is dab, listed on line 1 of the dictionary.

There is no 4th completion of da.

Source: USACO 2014 Feburary Contest, Silver

题解

感觉可以广义SAM上暴力,鉴于实际上并不会,还是先想简单做法吧。

一个比较暴力的做法是将给定的所有字符串排序,对于查询的前缀,去分治找一个区间(有公共前缀的字符串在排序后必定是连续的,平凡性质略去证明),使得这个区间均以询问串为前缀。答案就是这个区间第$k$个。

时间复杂度$O(wlenlogw+nlenlogw)$

看上去很不错但实际上算一下发现由于那个$logw$超时了。。

鉴于练习暴力,先写一下。

好像想出一个后缀数组的优化方法?

对所有串用不同分隔符隔开建SA,SA上二分然后暴力比较当前的前缀情况,这样通过二分求出左右边界。

然而并没有优化后面一项的时间复杂度。。但是应该能过了。

用前面那种暴力方法直接A了。。数据水。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
#include <cstdio>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <cstring>
#define maxn 30005
int n , q;
std::string t;
struct Node
{
std::string s;
int len , num;
bool operator < (const Node& x)const{
return s < x.s;
}
}p[maxn];

inline bool ck(std::string& s , std::string& t , int op)
{
if(s < t) return true;
else
{
int ct = 0 ;
for(int i = 0 ; i < (int)s.length() && i < (int)t.length() ; ++i)
{
if(s[i] == t[i]) ++ ct;
else break;
}
if(ct == (int)t.length()) return op;
else return false;
}
}

inline int query(int k , std::string& t)
{
int nl = 1 , nr = n , l = 0 , r = n;
while(nl <= nr) //ch(s , t , 0) : s < t -> true , false otherwise
{
int mid = nl + nr >> 1;
if(ck(p[mid].s , t , 0)) l = mid , nl = mid + 1;
else nr = mid - 1;
}
nl = 1 , nr = n;
while(nl <= nr)
{
int mid = nl + nr >> 1;
if(ck(p[mid].s , t, 1)) r = mid , nl = mid + 1;
else nr = mid - 1;
}
// printf("QUERY INTERVENE : %d %d\n",l,r);
if(r-l < k) return -1;
else return p[l+k].num;
}

int main()
{
scanf("%d%d",&n,&q);
for(int i = 1 ;i <= n ; ++i)
std::cin>>p[i].s , p[i].num = i , p[i].len = p[i].s.length();
std::sort(p+1,p+n+1);
// for(int i = 1 ; i <= n ; ++i)
// {
// printf("RANK : %d NUM: %d ",i,p[i].num);
// std::cout<<' ' << p[i].s << std::endl;
// }
for(int i = 1 ; i <= q ; ++i)
{
int k;
scanf("%d",&k);
std::cin>>t;
printf("%d\n",query(k , t));
}

}

这么水的题还WA了一发(主要是看错了样例)


[USACO12NOV]平衡的奶牛品种Balanced Cow Breeds

题目背景

征求翻译。如果你能提供翻译或者题意简述,请直接发讨论,感谢你的贡献。

题目描述

Farmer John usually brands his cows with a circular mark, but his branding iron is broken, so he instead must settle for branding each cow with a mark in the shape of a parenthesis — (. He has two breeds of cows on his farm: Holsteins and Guernseys. He brands each of his cows with a

parenthesis-shaped mark. Depending on which direction the cow is facing, this might look like either a left parenthesis or a right parenthesis.

FJ’s N cows are all standing in a row, each facing an arbitrary direction, so the marks on the cows look like a string of parentheses of length N. Looking at this lineup, FJ sees a remarkable pattern: if he scans from left to right through just the Holsteins (in the order they appear in the sequence), this gives a balanced string of parentheses; moreover, the same is true for the Guernseys! To see if this is truly a rare event, please help FJ compute the number of possible ways he could assign breeds to his N cows so that this property holds.

There are several ways to define what it means for a string of parentheses to be “balanced”. Perhaps the simplest definition is that there must be the same total number of (‘s and )’s, and for any prefix of the string, there must be at least as many (‘s as )’s. For example, the following strings are all balanced:

() (()) ()(()())

while these are not:

)( ())( ((())))

给一个只有左右括号的字符串,然后用H,G两种字符来标记这个序列,所有标记H的括号可以组成一个正确的括号序列,所有G标记的括号也组成一个正确的括号序列,然后输出这种标记方案的总数mod2012的值。

输入输出格式

输入格式:

* Line 1: A string of parentheses of length N (1 <= N <= 1000).

输出格式:

* Line 1: A single integer, specifying the number of ways FJ can assign breeds to cows so that the Holsteins form a balanced subsequence of parentheses, and likewise for the Guernseys. Since the answer might be a very large number, please print the remainder of this number when divided by 2012 (i.e., print the number mod 2012). Breed assignments involving only one breed type are valid.

输入输出样例

输入样例#1:

1
(())

输出样例#1:

1
6

说明

The following breed assignments work:

(()) HHHH (()) GGGG (()) HGGH (()) GHHG (()) HGHG (()) GHGH

感谢@CREEPER_ 提供翻译

题解

一道dp套路简单题。

NOIp出这种题该多好。

设$f_{i,j,k}$表示前$i$个括号,$j$个染$H$多出来的左括号,$k$个染$G$多出来的左括号。

然后枚举$i,j,k$,考虑当前括号的两种决策:染H还是G,然后再根据括号是左还是右转移。即

$f{i,j,k} = f{i-1,j+f(s[i]),k} + f_{i-1,j,k+f(s[i])}$

其中$f(‘(‘) = 1 , f(‘)’) = -1$

最后应该是两种均无左括号剩余的方案数,为

$f_{n,0,0}$

滚动数组优化dp空间复杂度。

前排警告,循环还是写括号比较好,不要把memset丢进循环最内层结果调半天才懊悔,考场上丢了这1小时就又凉了。

时间复杂度$O(n^3)$ , 空间复杂度$O(n^2)$,得分76,还行。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#include <cstdio>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <cstring>
#define maxn 1005
int f[2][maxn][maxn] , n , a[maxn];
char s[maxn];
const int mod =2012;
int main()
{
scanf("%s",s+1);
n = std::strlen(s+1);
for(int i = 1 ; i <= n ; ++i)
{
if(s[i] == '(') a[i] = -1;
else a[i] = 1;
}
f[0][0][0] = 1;
for(int i = 1 ; i <= n ; ++i)
{
std::memset(f[i&1] , 0 ,sizeof(f[i&1]));
for(int j = 0 ; j <= i ; ++j)
{
for(int k = 0 ; k <= i ; ++k)
{
f[i&1][j][k] = f[i-1&1][j+a[i]>=0?j+a[i]:1003][k] + f[i-1&1][j][k+a[i]>=0?k+a[i]:1003] , f[i&1][j][k] %= mod;
}
}
}
printf("%d\n",f[n&1][0][0]);
}

但是对于第$i$阶段$j+k$必须等于前$i-1$个字符中左括号减去右括号的数量。这样我们可以轻易地将时间复杂度从$O(n^3)$优化到$O(n^2)$,只需要加个预处理数组然后dp去掉一维枚举就好了。

但是我懒,发现差一点就过了,然后顺手把内层$for ~k ~1 ~to~ i$变成$i-j$就过了。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#include <cstdio>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <cstring>
#define maxn 1005
int f[2][maxn][maxn] , n , a[maxn];
char s[maxn];
const int mod =2012;
int main()
{
scanf("%s",s+1);
n = std::strlen(s+1);
for(int i = 1 ; i <= n ; ++i)
{
if(s[i] == '(') a[i] = -1;
else a[i] = 1;
}
f[0][0][0] = 1;
for(int i = 1 ; i <= n ; ++i)
{
std::memset(f[i&1] , 0 ,sizeof(f[i&1]));
for(int j = 0 ; j <= i ; ++j)
{
for(int k = 0 ; k <= i - j ; ++k)
{
f[i&1][j][k] = f[i-1&1][j+a[i]>=0?j+a[i]:1003][k] + f[i-1&1][j][k+a[i]>=0?k+a[i]:1003] , f[i&1][j][k] %= mod;
}
}
}
printf("%d\n",f[n&1][0][0]);
}

由于这种题不存在特殊数据导致随机数据跑得快而超时的情况,所以我们只需要极限数据测一下够快就行,考场节省时间


11:30了,开始颓。

要是联赛第一天出这么两道题该多好。

突然又想做题


[USACO07FEB]牛的词汇The Cow Lexicon

题目描述

Few know that the cows have their own dictionary with W (1 ≤ W ≤ 600) words, each containing no more 25 of the characters ‘a’..’z’. Their cowmunication system, based on mooing, is not very accurate; sometimes they hear words that do not make any sense. For instance, Bessie once received a message that said “browndcodw”. As it turns out, the intended message was “browncow” and the two letter “d”s were noise from other parts of the barnyard.

The cows want you to help them decipher a received message (also containing only characters in the range ‘a’..’z’) of length L (2 ≤ L ≤ 300) characters that is a bit garbled. In particular, they know that the message has some extra letters, and they want you to determine the smallest number of letters that must be removed to make the message a sequence of words from the dictionary.

没有几个人知道,奶牛有她们自己的字典,里面的有W (1 ≤ W ≤ 600)个词,每个词的长度不超过25,且由小写字母组成.她们在交流时,由于各种原因,用词总是不那么准确.比如,贝茜听到有人对她说”browndcodw”,确切的意思是”browncow”,多出了两个”d”,这两个”d”大概是身边的噪音.

奶牛们发觉辨认那些奇怪的信息很费劲,所以她们就想让你帮忙辨认一条收到的消息,即一个只包含小写字母且长度为L (2 ≤ L ≤ 300)的字符串.有些时候,这个字符串里会有多余的字母,你的任务就是找出最少去掉几个字母就可以使这个字符串变成准确的”牛语”(即奶牛字典中某些词的一个排列).

输入输出格式

输入格式:

Line 1: Two space-separated integers, respectively: W and L

Line 2: L characters (followed by a newline, of course): the received message

Lines 3..W+2: The cows’ dictionary, one word per line

• 第1行:两个用空格隔开的整数,W和L.

• 第2行:一个长度为L的字符串,表示收到的信息.

• 第3行至第W+2行:奶牛的字典,每行一个词.

输出格式:

Line 1: a single integer that is the smallest number of characters that need to be removed to make the message a sequence of dictionary words.

一个整数,表示最少去掉几个字母就可以使之变成准确的”牛语”.

输入输出样例

输入样例#1:

1
2
3
4
5
6
7
8
6 10
browndcodw
cow
milk
white
black
brown
farmer

输出样例#1:

1
2

题解

上次我们已经分析过爆搜的可能性并且发现在最坏情况下它最多过一个$n=5$的数据。

来想稍微用点脑子的做法。

感觉有可能进行一个区间暴力DP。设$f_{i,j}$表示$[i,j]$最少去掉几个字符就可以符合题意。

显然如果一个区间所有子区间都符合题意,那它也是符合题意的

如果我们对一个区间扩张一个字符,比如$[l,r+1]$,发现只需要枚举一个分界点$k$,让$[l,k]$是一个可行的区间,并且让$[k+1,r+1]$是一个单词(之所以不定义这段是一个可行区间是因为方便转移,比如我们可以直接查询这段是否是一个单词完成转移,这样做同样可以依赖前面的子问题得到最优解,因为显然有$f{i,j}+f{j+1,k} >= f_{i,k}$,这意味着让转移的区间尽可能大,也就是后面的一段是单词是个必要条件,因此可以优化)。