2 Monkeys on a computer [closed]











up vote
59
down vote

favorite
6












(a) Two monkeys are typing capital letters (A-Z) randomly. The first stops typing when the word COCONUT appears as seven successive letters. The second stops typing when TUNOCOC appears; TUNOCOC is simply COCONUT spelt backwards. Which monkey is expected to type more?



(b) A monkey is typing capital letters (A-Z) randomly. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if COCONUT appears (as seven successive letters) before TUNOCOC. The second squirrel wins if TUNOCOC appears before COCONUT. Which squirrel is more likely to win?



Is there any way to solve it without using markov chains?










share|improve this question















closed as off-topic by ABcDexter, Astralbee, Chowzen, rhsquared, boboquack Dec 2 at 9:35


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question is off-topic as it appears to be a mathematics problem, as opposed to a mathematical puzzle. For more info, see "Are math-textbook-style problems on topic?" on meta." – ABcDexter, Astralbee, Chowzen, rhsquared, boboquack

If this question can be reworded to fit the rules in the help center, please edit the question.









  • 9




    @mukyuu How do people win if dogs are running? It's called betting...
    – LocustHorde
    Nov 23 at 9:48






  • 5




    Welcome to puzzling :) This question is better suited to MATH.SE
    – ABcDexter
    Nov 23 at 10:32








  • 5




    @EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
    – jafe
    Nov 23 at 11:43






  • 2




    Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
    – Weckar E.
    Nov 23 at 17:22






  • 2




    Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
    – Falco
    Nov 26 at 13:37















up vote
59
down vote

favorite
6












(a) Two monkeys are typing capital letters (A-Z) randomly. The first stops typing when the word COCONUT appears as seven successive letters. The second stops typing when TUNOCOC appears; TUNOCOC is simply COCONUT spelt backwards. Which monkey is expected to type more?



(b) A monkey is typing capital letters (A-Z) randomly. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if COCONUT appears (as seven successive letters) before TUNOCOC. The second squirrel wins if TUNOCOC appears before COCONUT. Which squirrel is more likely to win?



Is there any way to solve it without using markov chains?










share|improve this question















closed as off-topic by ABcDexter, Astralbee, Chowzen, rhsquared, boboquack Dec 2 at 9:35


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question is off-topic as it appears to be a mathematics problem, as opposed to a mathematical puzzle. For more info, see "Are math-textbook-style problems on topic?" on meta." – ABcDexter, Astralbee, Chowzen, rhsquared, boboquack

If this question can be reworded to fit the rules in the help center, please edit the question.









  • 9




    @mukyuu How do people win if dogs are running? It's called betting...
    – LocustHorde
    Nov 23 at 9:48






  • 5




    Welcome to puzzling :) This question is better suited to MATH.SE
    – ABcDexter
    Nov 23 at 10:32








  • 5




    @EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
    – jafe
    Nov 23 at 11:43






  • 2




    Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
    – Weckar E.
    Nov 23 at 17:22






  • 2




    Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
    – Falco
    Nov 26 at 13:37













up vote
59
down vote

favorite
6









up vote
59
down vote

favorite
6






6





(a) Two monkeys are typing capital letters (A-Z) randomly. The first stops typing when the word COCONUT appears as seven successive letters. The second stops typing when TUNOCOC appears; TUNOCOC is simply COCONUT spelt backwards. Which monkey is expected to type more?



(b) A monkey is typing capital letters (A-Z) randomly. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if COCONUT appears (as seven successive letters) before TUNOCOC. The second squirrel wins if TUNOCOC appears before COCONUT. Which squirrel is more likely to win?



Is there any way to solve it without using markov chains?










share|improve this question















(a) Two monkeys are typing capital letters (A-Z) randomly. The first stops typing when the word COCONUT appears as seven successive letters. The second stops typing when TUNOCOC appears; TUNOCOC is simply COCONUT spelt backwards. Which monkey is expected to type more?



(b) A monkey is typing capital letters (A-Z) randomly. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if COCONUT appears (as seven successive letters) before TUNOCOC. The second squirrel wins if TUNOCOC appears before COCONUT. Which squirrel is more likely to win?



Is there any way to solve it without using markov chains?







mathematics probability






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 23 at 17:12









Dedwards

329315




329315










asked Nov 23 at 7:39









anmol porwal

39826




39826




closed as off-topic by ABcDexter, Astralbee, Chowzen, rhsquared, boboquack Dec 2 at 9:35


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question is off-topic as it appears to be a mathematics problem, as opposed to a mathematical puzzle. For more info, see "Are math-textbook-style problems on topic?" on meta." – ABcDexter, Astralbee, Chowzen, rhsquared, boboquack

If this question can be reworded to fit the rules in the help center, please edit the question.




closed as off-topic by ABcDexter, Astralbee, Chowzen, rhsquared, boboquack Dec 2 at 9:35


This question appears to be off-topic. The users who voted to close gave this specific reason:


  • "This question is off-topic as it appears to be a mathematics problem, as opposed to a mathematical puzzle. For more info, see "Are math-textbook-style problems on topic?" on meta." – ABcDexter, Astralbee, Chowzen, rhsquared, boboquack

If this question can be reworded to fit the rules in the help center, please edit the question.








  • 9




    @mukyuu How do people win if dogs are running? It's called betting...
    – LocustHorde
    Nov 23 at 9:48






  • 5




    Welcome to puzzling :) This question is better suited to MATH.SE
    – ABcDexter
    Nov 23 at 10:32








  • 5




    @EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
    – jafe
    Nov 23 at 11:43






  • 2




    Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
    – Weckar E.
    Nov 23 at 17:22






  • 2




    Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
    – Falco
    Nov 26 at 13:37














  • 9




    @mukyuu How do people win if dogs are running? It's called betting...
    – LocustHorde
    Nov 23 at 9:48






  • 5




    Welcome to puzzling :) This question is better suited to MATH.SE
    – ABcDexter
    Nov 23 at 10:32








  • 5




    @EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
    – jafe
    Nov 23 at 11:43






  • 2




    Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
    – Weckar E.
    Nov 23 at 17:22






  • 2




    Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
    – Falco
    Nov 26 at 13:37








9




9




@mukyuu How do people win if dogs are running? It's called betting...
– LocustHorde
Nov 23 at 9:48




@mukyuu How do people win if dogs are running? It's called betting...
– LocustHorde
Nov 23 at 9:48




5




5




Welcome to puzzling :) This question is better suited to MATH.SE
– ABcDexter
Nov 23 at 10:32






Welcome to puzzling :) This question is better suited to MATH.SE
– ABcDexter
Nov 23 at 10:32






5




5




@EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
– jafe
Nov 23 at 11:43




@EricDuminil I think they'd be equivalent if (b) had two monkeys typing and each squirrel was watching a different monkey. It's a different scenario when both are watching the same monkey.
– jafe
Nov 23 at 11:43




2




2




Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
– Weckar E.
Nov 23 at 17:22




Please clarify: in a, do the monkeys generate separate input strings or do they co-operate on one?
– Weckar E.
Nov 23 at 17:22




2




2




Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
– Falco
Nov 26 at 13:37




Clarification? Are the two monkeys typing on two computers, each creating their own string of letters? Or are the typing parallel on the same keyboard, creating a unified line of letters?
– Falco
Nov 26 at 13:37










15 Answers
15






active

oldest

votes

















up vote
45
down vote



accepted










(a) I claim that the expected typing length are the same for both monkeys. I guess something in my argument will be incorrect, as jafe's answer has 9 approvals, but finding that incorrectness would be helpful to me.



I prove that the probabilty that a monkey ends his typing after exactly $n$ letters is the same for both monkeys, which then implies that the expected typing lengths are the same.



In order for the typing to end after letter number $n$, the last seven letters must be exactly the monkey's preferred word "COCONUT" / "TUNOCOC" and that word cannot ever have appeared before in the text. That word appearing before can be of two kinds:




  1. It could share some letters with the last 7, or

  2. be completely contained in the first $n-7$ letters.


It's easy to see that for the first monkey, case 1 can not happen: If the word "COCONUT" shared some of its letters with the last 7 letters, then the last letter "T" would have to appear a second time in "COCONUT" (besides last place), which it doesn't.



The argument is slightly more complicated for the second monkey, as the last letter ("C") of "TUNOCOC" does appear a second time in it. But "TUNOC" is not the ending part of "TUNOCOC", so also for the second monkey case 1. cannot happen.



Let's recap: In order for each monkey to end typing after exactly $n$ letters, the text it produced must fullfill two criteria:




  • A) The last seven letters must be exactly the monkey's preferred word
    "COCONUT"/"TUNOCOC", and

  • B) The previous $n-7$ letters cannot contain
    the monkey's preferred word.


This is an equivalence: Any sequence of $n$ letters that fulfills A and B (for any of the two monkeys) is a sequence that has that monkey stop after exactly $n$ letters, and vice versa.



Since each letter typed is independent of others, the events that A or B aply to a sequence of $n$ letters (for a given monkey) are also independent, so the respective probabilities can be multiplied. If we use indices $1$ and $2$ to apply for the first and second monkey, we obviuously get



$$ p_1(A)=p_2(A)=left(frac1{26}right)^7$$



Calculating $p_{1/2}(B)$ directly is much harder, but they are the same: Both consider sequences of $n-7$ independent, uniformly distributed letters, and if and only if such a sequence does (does not) contain the word "COCONUT", then the reversed sequence (which is again a sequence of $n-7$ independent, uniformly distributed letters) does (does not) contain the word "TUNOCOC".



So we get



$$p_1(B) = p_2 (B)$$ and this proves my initial claim: The probability that the typing ends after exactly $n$ letters is $p_1(A)p_1(B)$ for the first monkey and $p_2(A)p_2(B)$ for the second, and they are the same.



+++++++



I cannot say why jafe's argument is incorrect, it is very compelling, and as I said initially, it may very well be that it is correct and something I said is incorrect. I just don't see what that may be, in either case.



ADDITION after more thinking and reading the comments:



I think the flaw in jafe's argument is in double counting after "COCOCO". If the next letter is "N", this is counted as 'good' for both the ending "COCO" and continuation "NUT" and the intitial "COCO" and continution "CONUT". However, both can only lead to the same end result "COCOCONUT", the additional "CONUT" option is not really additional.






share|improve this answer



















  • 3




    The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
    – Evargalo
    Nov 23 at 12:30








  • 3




    How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
    – Ingix
    Nov 23 at 12:42








  • 1




    Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
    – Ingix
    Nov 23 at 12:50






  • 1




    I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
    – Evargalo
    Nov 23 at 13:49






  • 1




    I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
    – Eric Duminil
    Nov 23 at 14:11


















up vote
30
down vote













(a)

Edit: This is incorrect, see comments




The TUNOCOC monkey is expected to type more.


Regardless of the previous letters, the TUNOCOC monkey has 1/26 chance of getting the next letter right. Clearly the COCONUT monkey has at least this, but turns out it does even better: Once it has typed COCO, it has two ways to get the correct result (NUT or CONUT).







share|improve this answer



















  • 4




    Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
    – Ingix
    Nov 23 at 12:12






  • 3




    I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
    – Pieter B
    Nov 23 at 13:00






  • 1




    PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
    – elias
    Nov 23 at 13:52






  • 2




    This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
    – Eric Duminil
    Nov 23 at 14:12






  • 2




    I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
    – jafe
    Nov 23 at 14:12




















up vote
23
down vote













Monkey problem



To settle down which monkey is faster on average, I'll use Markov chains and Mathematica. Define a state $i$, for $i = 0..6$, as that the monkey 1 has currently written $i$ correct subsequent characters of COCONUT, but has never written all of COCONUT. Define a state 7 as the monkey 1 having written COCONUT at least once. Since state 7 cannot be escaped, we say that it is absorptive (other states are transient). The transition matrix for monkey 1 is:





P1 = {
{25, 1, 0, 0, 0, 0, 0, 0},
{24, 1, 1, 0, 0, 0, 0, 0},
{25, 0, 0, 1, 0, 0, 0, 0},
{24, 1, 0, 0, 1, 0, 0, 0},
{24, 0, 0, 1, 0, 1, 0, 0},
{24, 1, 0, 0, 0, 0, 1, 0},
{24, 1, 0, 0, 0, 0, 0, 1},
{0, 0, 0, 0, 0, 0, 0, 26}
} / 26;


Here $P1_{ij}$ is the probability that the monkey 1 transitions from state $i$ to state $j$ on pushing a key. The Markov chain can be visualized as a graph:



p1 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P1];
Graph[Table[StringTake["COCONUT", x], {x, 0, 7}], p1,
VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


Markov chain for monkey 1



Next up, we extract the sub-matrix which does not contain the absorptive state:



Q1 = P1[[1 ;; 7, 1 ;; 7]];


As described here, the expected time for the monkey 1 to write COCONUT (expected time to absorption) starting from state $i$ is given by



N1 = Inverse[IdentityMatrix[7] - Q1]
t1 = N1.{1, 1, 1, 1, 1, 1, 1};


The exact solution is given by:



-      8031810176
C 8031810150
CO 8031809500
COC 8031792574
COCO 8031352524
COCON 8019928800
COCONU 7722894400


The transition matrix for monkey 2 is given by:



P2 = {
{25, 1, 0, 0, 0, 0, 0, 0},
{24, 1, 1, 0, 0, 0, 0, 0},
{24, 1, 0, 1, 0, 0, 0, 0},
{24, 1, 0, 0, 1, 0, 0, 0},
{24, 1, 0, 0, 0, 1, 0, 0},
{24, 1, 0, 0, 0, 0, 1, 0},
{24, 1, 0, 0, 0, 0, 0, 1},
{0, 0, 0, 0, 0, 0, 0, 26}
} / 26;


The Markov chain for monkey 2 visualized as a graph:



p2 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P2];
Graph[Table[StringTake["TUNOCOC", x], {x, 0, 7}], p2,
VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


Markov chain for monkey 2



By following the same procedure as for monkey 1, we solve $t2$ as:



-      8031810176
T 8031810150
TU 8031809500
TUN 8031792600
TUNO 8031353200
TUNOC 8019928800
TUNOCO 7722894400


Because the monkeys start from the situation when nothing has been written yet, we see that the expected time for them to write their word is the same $C = 8031810176$



This result can also be obtained directly in Mathematica:



d1 = FirstPassageTimeDistribution[p1, 8];
Mean[d1]


In fact, we can compute the characteristic function for the absorption-time-distribution:



CharacteristicFunction[d1, x]


which evaluates to



e^(7ix) / (C - C e^(ix) + e^(7ix))


The distributions for both monkeys have this same characteristic function. Wikipedia claims that cumulative distribution functions and characteristic functions are in one-to-one correspondence. This implies that the monkeys' finishing-time-distributions are the same.



Squirrel problem



The transition matrix for a monkey aiming for both words COCONUT and TUNOCOC is:



P3 = {
{24, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
{23, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
{24, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
{23, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
{23, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0},
{23, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
{24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
{23, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0},
{23, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0},
{23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
{24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0},
{23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0},
{24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26, 0},
{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26}} / 26


Here the states have been numbered in order - C CO COC COCO COCON COCONU T TU TUN TUNO TUNOC TUNOCO COCONUT TUNOCOC. Note that the absorbing states come last. The same reference as above shows that we can compute the probability $B3_{ij}$ of ending from a transient state $i$ to an absorption state $j + 13$ as follows:



Q3 = P3[[1 ;; 13, 1 ;; 13]];
R3 = P3[[1 ;; 13, 14 ;; 15]];
N3 = Inverse[IdentityMatrix[13] - Q3]
B3 = N3.R3


The matrix $B3 * 617830874$ is



        COCONUT   TUNOCOC
- 308915099 308915775
C 308915100 308915774
CO 308915125 308915749
COC 308915776 308915098
COCO 308932701 308898173
COCON 309372075 308458799
COCONU 320796475 297034399
T 308915098 308915776
TU 308915073 308915801
TUN 308914423 308916451
TUNO 308897523 308933351
TUNOC 308458124 309372750
TUNOCO 297033749 320797125


Since the monkey starts from nothing, the probability for writing COCONUT and TUNOCOC is 308915099 and 308915775, respectively, divided by 617830874. These correspond to 0.4999994529 and 0.5000005471, computed with Mathematica accurate to 10 decimal places. Therefore the monkey is more probable to write TUNOCOC than COCONUT.



Code



Here is the Python-code I used to generate the transition matrices.





def computeTransitionMatrix(words, keys):
def properPrefixes(word):
return (word[:i] for i in range(len(word)))

def suffixesInDecreasingLength(word):
return (word[i:] for i in range(len(word) + 1))

prefixToState = {}
stateToPrefix =

def addState(prefix):
if prefix in prefixToState:
return
prefixToState[prefix] = len(stateToPrefix)
stateToPrefix.append(prefix)

# Create a state for each proper prefix of the word.
for word in words:
for prefix in properPrefixes(word):
addState(prefix)

# Create a state for each word last.
for word in words:
addState(word)

print(stateToPrefix)

# Number of states.
nStates = len(stateToPrefix)

# Compute the (scaled) transition probabilities.
transitions =
for i in range(nStates):
row = [0] * nStates
transitions.append(row)

prefix = stateToPrefix[i]
if prefix in words:
# The word is an absorptive state.
row[i] = len(keys)
continue
for key in keys:
nextPrefix = prefix + key
# Find the longest suffix which
# is a prefix of the word.
for suffix in suffixesInDecreasingLength(nextPrefix):
j = prefixToState.get(suffix)
if j != None:
row[j] += 1
break

return transitions

# The words the monkey is supposed to write.
words = ['COCONUT', 'TUNOCOC']

# Keys available on monkey's keyboard.
keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

transitions = computeTransitionMatrix(words, keys)
print(repr(transitions).replace('[', '{').replace(']','}').replace('}, ', '},n'))





share|improve this answer



















  • 3




    Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
    – Ingix
    Nov 24 at 19:18






  • 2




    It means the expected time is simply 26**7. How anticlimactic! :)
    – Eric Duminil
    Nov 25 at 9:09










  • Thanks. This is a great answer for learning some cool tricks in Mathematica.
    – Erel Segal-Halevi
    Nov 26 at 8:26










  • This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
    – abl
    Nov 27 at 9:30


















up vote
16
down vote













(a) has been adequately answered by




Ingix.




In (b), though:




The second squirrel is more likely to win.




Consider:




A simpler example, where instead of typing "COCONUT" or "TUNOCOC", the monkeys are trying to type "AB" or "CA". Also, the keyboard only has "A", "B", and "C" keys. Consider what happens after the first three keystrokes, for which there are 27 possibilities.




In this case:




In 5 out of the 27 possiblities, "AB" appears and "CA" does not. Similarly, in 5 of the possiblities, "CA" appears and "AB" does not. But there's also one possibility, "CAB", where both appear, and squirrel 2 wins. In this instance, the "A" is useful for both, but it was useful for the second squirrel earlier. The "setup" for the first squirrel to match will often result in the second squirrel already having matched.




Now consider:




A similar thing is true of "COCONUT" versus "TUNOCOC". The first squirrel wants "COC" as a setup, but if "COC" does appear, it's often at the tail end (pun intended) of the second squirrel's victory.







share|improve this answer























  • It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
    – Eric Duminil
    Nov 23 at 18:45












  • @Duminil correct. Even simulating a single play-through would be fairly expensive.
    – Sneftel
    Nov 23 at 20:11










  • Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
    – Falco
    Nov 26 at 13:46










  • @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
    – Sneftel
    Nov 26 at 14:51












  • I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
    – Falco
    Nov 26 at 14:51


















up vote
15
down vote













@Ingix provides a good proof, one certainly worthy of the tick, but one that could perhaps use some reinforcement. The answer is that the N number of letters would have the same expected value. A few points that might help @jafe realise the flaw in his proof.




  1. The order of independent events occurs has no directional bias. For example, drawing a King Of Hearts followed by an Ace Of Hearts, from a 52 card deck is no more probable than an Ace followed by a King.

  2. Whilst the COCOCONUT argument might seem to hold water, it's first important to realise that the probability of that 9 letter sequence is the same as the probability of DOCOCONUT, or for that matter TUTUNOCOC having the initial CO does not gain anything to the probability as there's still a 1/26 chance of getting the next letter correct.

  3. The apparent gain of the fallback that COCOCONUT gives you is offset by the amount of extra letters that you have to type. If you get the sequence COCOCONUX, you have wasted 9 letters, whereas TUNOCOX only wastes 7. (And COCOCONUX is far more improbable than TUNOCOX).


TLDR: the order of independent variables has no favoured direction.



(P.S. This is my first stack answer, so feel free to give any constructive feedback)






share|improve this answer



















  • 2




    Makes sense! Welcome to Puzzling.
    – jafe
    Nov 23 at 14:25






  • 1




    I'm still confused about this, but your 3) point seems very good to me
    – George Menoutis
    Nov 23 at 14:26










  • Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
    – Ong Yu Hann
    Nov 23 at 14:52












  • Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
    – WittierDinosaur
    Nov 23 at 14:58












  • While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
    – Misha Lavrov
    Nov 27 at 1:14


















up vote
6
down vote













There is a clever trick to finding the kind of expected times that appear in part (a). It turns out that both for COCONUT and TUNOCOC, the expected time is just $26^7$ (so the monkeys have equal expected times to win), but for this to be convincing, I will demonstrate the method on a case where the expected value is unexpected: the word ABRACADABRA.



Suppose that a monkey is typing randomly, and a family of beavers decides to bet on the outcome. The rules for betting are as follows:




  1. Right before a new letter is typed, a new beaver shows up and is given $1$ dollar. The beaver bets that dollar on that letter being A, at $26:1$ odds.

  2. If the beaver wins (and now has $26$ dollars), the beaver bets all of them on the next letter being B.

  3. If the beaver wins (and now has $26^2$ dollars), the beaver bets all of them on the next letter being R.

  4. In general, whenever the beaver wins the first $k$ bets (and has $26^k$ dollars), the beaver bets all of them on the next, $(k+1$)-th letter still matching the $(k+1)$-th letter of ABRACADABRA.


Any time a beaver loses a bet, that beaver has no more money and goes home. (But at any time, multiple beavers may be betting, because a new beaver shows up right before each letter.)



Each bet is fair, so after $n$ letters are typed, the expected number of money the beavers have altogether is $n$: the $n$ dollars that the beavers get when they show up.



When ABRACADABRA first appears, we can count the total amount of money in the system:




  • One lucky beaver has $26^{11}$ dollars from betting on ABRACADABRA all the way through.

  • One beaver started on the ABRA at the end of ABRACADABRA, and has $26^4$ dollars.

  • One beaver just started betting before the last A, and has $26$ dollars.

  • All other beavers lost their bets and have no money.


So we are guaranteed to have $26^{11} + 26^4 + 26$ dollars in the system. But the expected amount of money in the system is equal to the number of letters typed. So in expectation, it takes $26^{11} + 26^4 + 26$ letters to type ABRACADABRA.



(Formally, we are invoking Wald's equation to see that the rule "the expected money in the system is equal to the expected number of letters typed" still holds when the number of letters typed depends on the letters that have been typed - when we stop once ABRACADABRA is reached)





Now, going back to COCONUT and TUNOCOC: neither word ends with an initial segment of the word. (The CONUT in COCONUT doesn't change anything.) If the beavers were betting on COCONUT, then at the end when COCONUT is typed, one beaver would have $26^7$ dollars and no other beavers would have anything. The same is true for TUNOCOC. So the expected time is just $26^7$ in both cases.



(A weird consequence is that a word like AAAAA has a longer expected waiting time than a word like ABCDE. To explain this intuitively, consider that if ABCDE hasn't been typed yet, the last five letters can can be any of AABCD, BABCD, CABCD, ..., ZABCD, giving us 26 ways to win on the next letter. But if AAAAA hasn't been typed yet, the last five letters can only be any of BAAAA, CAAAA, DAAAA, ..., ZAAAA, giving us only 25 ways to win.)






share|improve this answer























  • This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
    – kaba
    Nov 26 at 23:03










  • We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
    – Misha Lavrov
    Nov 26 at 23:13












  • Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
    – kaba
    Nov 27 at 1:04












  • This is a cool answer +1
    – justhalf
    Nov 27 at 14:15


















up vote
5
down vote













Problem (a) has been adequately covered in other answers.



In this answer I'll show that in problem (b), the second squirrel has better chances.




  • First (for introduction purposes only) we'll see a simplified version of the problem, that is simple enough to be run on a simulation

  • Then we'll see how we can exactly calculate the probability that the second squirrel will win, in this simplified example. The method that we'll use does not depend on any simplification, and can be applied to the original problem as well.

  • We'll see that the result that we obtain with this method matches with the results of the simulation.

  • Having then some extra confidence in the method (not that it's really needed, since it's fairly clear), we'll apply it to the original problem and obtain a result.


Simplified problem:




A monkey is typing capital letters from an alphabet that contains the letters A, B and X. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if AAB appears (as three successive letters) before BAA. The second squirrel wins if BAA appears before AAB. Which squirrel is more likely to win?




If chosen this problem because it has a similarity with the original, which I believe is crucial: "AAB" can overlap with "BAA" only in the form "AABAA", while in the other way around it can be either "BAAAB" or "BAAB". Just like "COCONUT" followed by "TUNOCOC" can be overlapped only in the form "COCONUTUNOCOC", while, the other way around, they can be overlapped like "TUNOCOCOCONUT" or like "TUNOCOCONUT".



Note however that we'll be solving the actual problem in the end. The only reason I'm including a simpler version is to explain the method on a simpler case, and because for the simpler problem we can run a simulation to verify our predictions.



Explanation



We must realize that at any point in the game, any of the two strings can be partly under construction, or not. More importantly, the only thing that affects the probabilities at any point in the game is how much of each string is already written. If, for example, the two last letters of the string so far are "XA", that means that the string AAB is partly written up to "A", and the string BAA is not partly written at all. This and only this is what determines which of the two squirrels is better off at this point, regardless of what has been written so far before "XA".



Therefore the state of the game at any given point is defined exclusively by the longest suffix of the sequence written so far that is also a prefix of at least one of the strings that the squirrels are betting at.



Thus this game has exactly seven different possible states, which are:





  • $epsilon$ (empty string, meaning no matching suffix)

  • $verb|A|$

  • $verb|AA|$

  • $verb|AAB|$

  • $verb|B|$

  • $verb|BA|$

  • $verb|BAA|$


On each of these different states, the squirrels may have different chances of winning. Let's note $P_epsilon$ as the chances that the second squirrel has when the state is $epsilon$, $P_{A}$ as the chances that he has when the state is $verb|A|$, etc. What we're interested in is $P_epsilon$, because that's the initial state of the game.



We can easily formulate each of this probabilities in terms of the others. From state $epsilon$ there's a 1/3 chance of writing an "A", thus moving to state A. There's also a 1/3 chance of transitioning to B, and finally a 1/3 chance of staying at e (if an "X" is written). That means that the chances for the second squirrel on state e are:



$$
P_{epsilon} = frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_{epsilon}
$$



We can similarly express the chances for all the other states. Note that $P_{AAB} = 0$, because the second squirrel loses immediately on that state, and $P_{BAA} = 1$, because it wins immediately. The full list is:



$$
begin{align*}
P_epsilon &= frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
P_{A} &= frac{1}{3} P_{AA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
P_{AA} &= frac{1}{3} P_{AAB} + frac{1}{3} P_{AA} + frac{1}{3} P_epsilon \
P_{AAB} &= 0 \
P_{B} &= frac{1}{3} P_{BA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
P_{BA} &= frac{1}{3} P_{BAA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
P_{BAA} &= 1
end{align*}
$$



This is simply a system of linear equations, and it's simple enough to be solved by hand. The solution is $P_epsilon = frac{8}{13}$ or approximately 61.54%. These are the chances for the second squirrel when the game starts. The chances for the first squirrel are $frac{5}{13}$, about 38.46%.



Running the simplified problem



Now, because this game has only three letters, the average game will be pretty short, so much in fact that we can play it enough times to have some statistics. The following program (in Java) simulates 10 million games and prints the percentage of wins for each squirrel. I've run it several times and the results are always in the range of 38.4 - 38.5% of wins for the first squirrel, and 61.5 - 61.6% of wins for the second squirrel, which matches very well with our previous calculations.



public class Monkeys {

private enum Squirrel {
FIRST_SQUIRREL,
SECOND_SQUIRREL
}

private static final Random random = new Random();
private static final char letters = "ABX".toCharArray();

public static void main(String args) {

int winsF = 0;
int winsS = 0;
int iterations = 10000000;

for (int i = 0; i < iterations; i++) {

Squirrel winner = playGame();
if (winner == Squirrel.FIRST_SQUIRREL) {
winsF++;
} else if (winner == Squirrel.SECOND_SQUIRREL) {
winsS++;
}
}

System.out.println("First squirrel wins: " + ((double) winsF) / (winsF + winsS));
System.out.println("Second squirrel wins: " + ((double) winsS) / (winsF + winsS));

}

private static Squirrel playGame() {
StringBuilder sb = new StringBuilder();
while (true) {
if (sb.length() >= 3) {
if (sb.substring(sb.length() - 3, sb.length()).equals("AAB")) {
return Squirrel.FIRST_SQUIRREL;
}
if (sb.substring(sb.length() - 3, sb.length()).equals("BAA")) {
return Squirrel.SECOND_SQUIRREL;
}
}
sb.append(letters[random.nextInt(letters.length)]);
}
}
}


Actual problem



Great! So the only thing we need to do now is apply the same method for the original problem. Following the same reasoning, in the original problem there are 15 distinct states:




  • $epsilon$

  • $verb|C|$

  • $verb|CO|$

  • $verb|COC|$

  • $verb|COCO|$

  • $verb|COCON|$

  • $verb|COCONU|$

  • $verb|COCONUT|$

  • $verb|T|$

  • $verb|TU|$

  • $verb|TUN|$

  • $verb|TUNO|$

  • $verb|TUNOC|$

  • $verb|TUNOCO|$

  • $verb|TUNOCOC|$


The chances for the second squirrel in each of the states are (thanks @kaba for pointing a previous error):



$$
begin{align*}
P_epsilon &= frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
P_{C} &= frac{1}{26} P_{CO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{CO} &= frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
P_{COC} &= frac{1}{26} P_{COCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{COCO} &= frac{1}{26} P_{COCON} + frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{COCON} &= frac{1}{26} P_{COCONU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{COCONU} &= frac{1}{26} P_{COCONUT} + frac{1}{26} P_{C} + frac{24}{26} P_epsilon \
P_{COCONUT} &= 0 \
P_{T} &= frac{1}{26} P_{TU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{TU} &= frac{1}{26} P_{TUN} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{TUN} &= frac{1}{26} P_{TUNO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{TUNO} &= frac{1}{26} P_{TUNOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
P_{TUNOC} &= frac{1}{26} P_{TUNOCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
P_{TUNOCO} &= frac{1}{26} P_{TUNOCOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
P_{TUNOCOC} &= 1
end{align*}
$$



Finally let's solve this system of equations. I've used SageMath for that purpose. The script is (after renaming the variables):



var('a, b, c, d, e, f, g, h, i, j, k, l, m, n, u')

eqn = [
a == (1/26) * b + (1/26) * i + (24/26) * a,
b == (1/26) * c + (1/26) * b + (1/26) * i + (23/26) * a,
c == (1/26) * d + (1/26) * i + (24/26) * a,
d == (1/26) * e + (1/26) * b + (1/26) * i + (23/26) * a,
e == (1/26) * f + (1/26) * d + (1/26) * i + (23/26) * a,
f == (1/26) * g + (1/26) * b + (1/26) * i + (23/26) * a,
g == (1/26) * h + (1/26) * b + (24/26) * a,
h == 0,
i == (1/26) * j + (1/26) * b + (1/26) * i + (23/26) * a,
j == (1/26) * k + (1/26) * b + (1/26) * i + (23/26) * a,
k == (1/26) * l + (1/26) * b + (1/26) * i + (23/26) * a,
l == (1/26) * m + (1/26) * i + (24/26) * a,
m == (1/26) * n + (1/26) * b + (1/26) * i + (23/26) * a,
n == (1/26) * u + (1/26) * i + (24/26) * a,
u == 1
]

solution = solve(eqn, a, b, c, d, e, f, g, h, i, j, k, l, m, n, u)

print(solution)


And the output (formatting mine, for readability):



[
[
a == (308915775/617830874),
b == (154457887/308915437),
c == (308915749/617830874),
d == (154457549/308915437),
e == (308898173/617830874),
f == (308458799/617830874),
g == (297034399/617830874),
h == 0,
i == (154457888/308915437),
j == (308915801/617830874),
k == (308916451/617830874),
l == (308933351/617830874),
m == (154686375/308915437),
n == (320797125/617830874),
u == 1
]
]


We see that the probability that the second squirrel will win is exactly $frac{308915775}{617830874}$ or approximately 50.0000547%






share|improve this answer



















  • 1




    The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
    – kaba
    Nov 27 at 13:04










  • Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
    – kaba
    Nov 27 at 13:30












  • @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
    – abl
    Nov 27 at 13:38












  • Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
    – kaba
    Nov 27 at 13:42


















up vote
4
down vote













I'll try to make the argument that the answer to both parts is:




The two are just as likely.




We start from the simple fact that the probability that a window of 7 letters reads COCONUT, or TUNOCOC, or anything else, is equal. (It's 26^(-7), as @Ingix said.). Crucially, it's completely independent of any letters that appear before or after the window, simply because these letters are not in the window...



Now, let's start with (b). We look at letters 1-7. If they say COCONUT, the first squirrel wins. If they say TUNOCOC (at the same probability!), the second one wins. If they say neither, we move on to letters 2-8 and do the same. At this point, letter 1 is completely irrelevant. You see how we move on and on through the letters, until there's a winner, and the chances of either squirrels to win never differ from each other.



Further clarification:




  • Of course, if letters 2-7 happened to be COCONU, the first squirrel is in a very good position. Similarly (and in the same probability), if they were TUNOCO, the second squirrel is in a good position. But this is symmetrical, so no one has an overall advantage.

  • Of course, if letters 1-7 were TUNOCOC, the chance to get COCONUT at letters 5-11 increases, but this doesn't matter since, in this instance, the game has already finished...

  • In other words, if we were to ask "Which of the two words are we more likely to see at letters 5-11?" (see @abl's comment below), the answer would be TUNOCOC. But the difference is only due to a situation in which the game has already finished. And this is just not the question in hand.


Part (a) is a very similar. We look at letters 1-7 of both monkeys. If the first monkey typed COCONUT, it wins. If the second one types TUNOCOC, it wins. If neither won, we move on to look at letters 2-8 of both, and so on.



No Markov chains :)






share|improve this answer



















  • 1




    Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
    – abl
    Nov 25 at 0:11












  • @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
    – Angkor
    Nov 25 at 13:37








  • 4




    @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
    – abl
    Nov 25 at 16:33






  • 2




    @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
    – abl
    Nov 25 at 20:43






  • 1




    @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
    – abl
    Nov 25 at 20:49


















up vote
3
down vote













I have tried to create a Markov chain modelling problem a). I haven't even tried to solve it, but the model (i.e. the states and the transitions) should be correct (with one caveat which I'll explain at the end).



First, a simplification: I have reduced the problem to getting either the string A-A-B (as in CO-CO-NUT) or B-A-A (as in TUN-OC-OC), using an alphabet that only includes 3 letters: A, B, and C.



Here's what I've got. This is for AAB:




Markov chain for AAB




As you can see, I have separated the state "Only one A" from the state "AA". It is noteworthy that there is no way to go from "AA" to "A".



And this is for BAA:




enter image description here




I'm not sure what this leads to. I was hoping there would be some visible hint about the solution (like: both chains are very similar, with just one difference that tilts the balance in favour of one of them), but this doesn't seem to be the case.



To be honest, I'm not even sure my simplification from COCONUT to AAB is correct, because in my case A and B are equally likely (both of them are just a letter which is typed with probability P = 1/3), whereas CO and NUT are not (CO is only 2 letters long, NUT is 3). This means that solving my chains could, or could not, help solving the original problem.



Hopefully someone can pick up from here.






share|improve this answer




























    up vote
    2
    down vote













    I tried to make a C# program to do that for me:



    class Program
    {
    static void Main(string args)
    {
    Log("Le scimmie si preparano alla sfida");

    Monkey monkey1 = new Monkey("COCONUT");
    Monkey monkey2 = new Monkey("TUNOCOC");
    int counter = 0;

    while (true)
    {
    counter++;
    Log("Le scimmie digitano una lettera per la {0}° volta.", counter);

    monkey1.PressLetter();
    monkey2.PressLetter();

    if(monkey1.IsTargetReached() && monkey2.IsTargetReached())
    {
    Log("Entrambe le scimmie hanno vinto!");
    break;
    }
    else if (monkey1.IsTargetReached())
    {
    Log("La scimmia "{0}" ha vinto!", monkey1.Target);
    break;
    }
    else if (monkey2.IsTargetReached())
    {
    Log("La scimmia "{0}" ha vinto!", monkey2.Target);
    break;
    }
    }
    Log("Le scimmie si prendono il meritato riposo dopo {0} turni!", counter);
    }

    private static void Log(string message, params object args)
    {
    Console.WriteLine(DateTime.Now + " - " + string.Format(message, args));
    }
    }

    public class Monkey
    {
    private Random Random { get; set; }
    public string Target { get; set; }
    public string Text { get; set; }

    public Monkey(string target)
    {
    Target = target;
    Text = "";
    Random = new Random();
    }

    public void PressLetter()
    {
    int num = Random.Next(0, 26);
    char let = (char)('a' + num);
    string lastLetter = let.ToString().ToUpper();
    Text += lastLetter;
    }

    public bool IsTargetReached()
    {
    return Text.IndexOf(Target) > -1;
    }
    }



    After millions of Monkeys' tries, I ended up thinking that they have
    the same possibilities.




    Anyway my computer still didn't successfully solve the problem.






    share|improve this answer

















    • 2




      That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
      – Sneftel
      Nov 23 at 15:57






    • 3




      You might want to reduce the alphabet you are using...
      – Evargalo
      Nov 23 at 17:41






    • 1




      +1; this works if you reduce the alphabet.
      – Tim C
      Nov 25 at 3:01










    • May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
      – Drag and Drop
      Nov 27 at 13:44










    • @DragandDrop: Loren Ipsum? Is she an actress?
      – Laurent LA RIZZA
      Nov 27 at 13:53


















    up vote
    1
    down vote













    There's good arguments for both sides. Here's code that prove it definitely:



    import numpy as np

    def main():
    m1 = np.array([
    [25, 1, 0, 0, 0, 0, 0, 0],
    [24, 1, 1, 0, 0, 0, 0, 0],
    [25, 0, 0, 1, 0, 0, 0, 0],
    [24, 1, 0, 0, 1, 0, 0, 0],
    [24, 0, 0, 1, 0, 1, 0, 0],
    [24, 1, 0, 0, 0, 0, 1, 0],
    [24, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 26]], dtype='object')

    m2 = np.array([
    [25, 1, 0, 0, 0, 0, 0, 0],
    [24, 1, 1, 0, 0, 0, 0, 0],
    [24, 1, 0, 1, 0, 0, 0, 0],
    [24, 1, 0, 0, 1, 0, 0, 0],
    [24, 1, 0, 0, 0, 1, 0, 0],
    [24, 1, 0, 0, 0, 0, 1, 0],
    [24, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 26]], dtype = 'object')

    p1 = np.array([1,0,0,0,0,0,0,0])
    p2 = np.array([1,0,0,0,0,0,0,0])

    for i in range(1000):
    p1 = p1.dot(m1)
    p2 = p2.dot(m2)

    #this shows that how often the two monkes have 1,2,3 letters correct vary
    #but it doesn't affact past the 4th letter
    #print(i,p2 - p1)

    if p1[7] == p2[7]:
    print("after {} steps, both monkeys have equal chances".format(i))
    elif p2[7] > p1[7]:
    print("crap, this whole arguments just got destroyed")
    else:
    print("after {} steps, coconut monkey has {} more ways to have won".format(i, p1[7] - p2[7]))

    main()


    I ran it up to:




    after 99999 steps, both monkeys have equal chances




    No matter the number of steps, both monkeys have equal chances. Even though one monkey is more likely to have exactly the last 1,2, or 3 letters correct.



    (mind-boggling)






    share|improve this answer




























      up vote
      1
      down vote













      I have created the simulation in C# based on Marco's idea.
      The alphabet is reduced and the performance is improved:



      public class Monkey
      {
      private static readonly char alphabet = "ACNOTUXY".ToCharArray();
      private static readonly Random random = new Random();

      public Monkey(string Target)
      {
      target = Target.ToCharArray();
      length = target.Length;
      input = new char[length];
      inputPosition = 0;
      }

      private readonly char target;
      private readonly int length;
      private readonly char input;
      private int inputPosition;

      public bool TargetReached { get; private set; }

      public void PressKey()
      {
      var key = alphabet[random.Next(0, 8)];
      input[inputPosition] = key;
      inputPosition = ++inputPosition % length;
      TargetReached = IsTargetReached();
      }

      private bool IsTargetReached()
      {
      for (var i = 0; i < length; i++)
      {
      if (input[i] != target[i]) return false;
      }
      return true;
      }
      }

      public class Simulator
      {
      private int ties;
      private int monkey1Wins;
      private int monkey2Wins;

      public void Run(int numberOfCompetitons)
      {
      for (var i = 0; i < numberOfCompetitons; i++)
      {
      RunCompetition();
      }
      }

      private void RunCompetition()
      {
      var monkey1 = new Monkey("COCONUT");
      var monkey2 = new Monkey("TUNOCOC");

      while (true)
      {
      monkey1.PressKey();
      monkey2.PressKey();

      if (monkey1.TargetReached && monkey2.TargetReached)
      {
      ties++;
      Console.WriteLine("It's a TIE!");
      break;
      }
      if (monkey1.TargetReached)
      {
      monkey1Wins++;
      Console.WriteLine("Monkey 1 WON!");
      break;
      }
      if (monkey2.TargetReached)
      {
      monkey2Wins++;
      Console.WriteLine("Monkey 2 WON!");
      break;
      }
      }
      }

      public void ShowSummary()
      {
      Console.WriteLine();
      Console.WriteLine($"{ties} TIES");
      Console.WriteLine($
      "Monkey 1: {monkey1Wins} WINS");
      Console.WriteLine($"Monkey 2: {monkey2Wins} WINS");
      }
      }

      internal class Program
      {
      private static void Main()
      {
      var simulator = new Simulator();
      simulator.Run(1000);
      simulator.ShowSummary();
      Console.ReadKey();
      }
      }


      Sample output:



      ...
      Monkey 1 WON!
      Monkey 1 WON!
      Monkey 2 WON!
      Monkey 1 WON!
      Monkey 2 WON!
      Monkey 2 WON!
      Monkey 1 WON!
      Monkey 1 WON!
      Monkey 2 WON!



      0 TIES

      Monkey 1: 498 WINS

      Monkey 2: 502 WINS


      So it seems pretty tied to me







      share|improve this answer






























        up vote
        0
        down vote













        Monkey problem



        The crux the problem is the following: do repeated groups influence the expected characters needed to complete a sequence? The answer is




        No, repeated groups bear no influence on the result




        Let's reduce the problem to something similar - AAB vs ABC with only these letters being on the keyboard.



        The probability of obtaining AAB in a sequence is:




        starting at 1: P(A) * P(A) * P(B) = (1/3 * 1/3 * 1/3) +

        starting at 2: P(!A) * P(A) * P(A) * P(B) = 2/3 * (1/3 * 1/3 * 1/3) +

        starting at 3: P(!A) * P(!A) * P(A) * P(A) * P(B) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




        The probability of obtaining ABC in a sequence is:




        starting at 1: P(A) * P(B) * P(C) = (1/3 * 1/3 * 1/3) +

        starting at 2: P(!A) * P(A) * P(B) * P(C) = 2/3 * (1/3 * 1/3 * 1/3) +

        starting at 3: P(!A) * P(!A) * P(A) * P(B) * P(C) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




        This makes it easy to see that




        No matter how you shuffle the desired sequence around, the only thing that influences the probability of it occurring is the length of the sequence.




        Alternative proof:
        All of the possible sequences of 4 letters are:




        AAAA ABAA ACAA BAAA BBAA BCAA CAAA CBAA CCAA

        AAAB ABAB ACAB BAAB BBAB BCAB CAAB CBAB CCAB

        AAAC ABAC ACAC BAAC BBAC BCAC CAAC CBAC CCAC

        AABA ABBA ACBA BABA BBBA BCBA CABA CBBA CCBA

        AABB ABBB ACBB BABB BBBB BCBB CABB CBBB CCBB

        AABC ABBC ACBC BABC BBBC BCBC CABC CBBC CCBC

        AACA ABCA ACCA BACA BBCA BCCA CACA CBCA CCCA

        AACB ABCB ACCB BACB BBCB BCCB CACB CBCB CCCB

        AACC ABCC ACCC BACC BBCC BCCC CACC CBCC CCCC




        And here is the same table with the uninteresting sequences cut out:




        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        AAAB XXXX XXXX BAAB XXXX XXXX CAAB XXXX XXXX

        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        AABA XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        AABB XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        AABC XXXX XXXX BABC XXXX XXXX CABC XXXX XXXX

        XXXX ABCA XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        XXXX ABCB XXXX XXXX XXXX XXXX XXXX XXXX XXXX

        XXXX ABCC XXXX XXXX XXXX XXXX XXXX XXXX XXXX




        There are




        3 sequences where AAB starts at position 1: AABA, AABB, AABC

        3 sequences where AAB starts at position 2: AAAB, BAAB, CAAB

        3 sequences where ABC starts at position 1: ABCA, ABCB, ABCC

        3 sequences where ABC starts at position 2: AABC, BABC, CABC




        Thus the answer to the monkey problem is that




        both monkeys have the same chance of finishing first




        Squirrel problem



        Let's consider the string typed by the monkey after a long (infinite) time and make some simplifications that don't affect the outcome to understand the problem better.




        Replace any letter not in COCONUT with X

        Replace all groups of COC with A

        Replace all remaining C's with X

        Replace all remaining O's with B, N's with C, U's with D and T's with E




        The problem now is:




        in the new string will we more likely first find ABCDE or EDCBA?




        And the solution:




        If A, B, C, D and E had the same chances of occurring, the answer would be "both are equally likely", but C, D and E have normal chances of occurring, A has a very low chance of occurring, B has a slightly lower than normal chance.

        Considering a simplified case - 2 letters - P and Q with P occurring more often than Q, PQ has more chances of appearing first than QP.
        This means EDCBA is more likely to appear first than ABCDE.




        Looking at it from a different angle:




        When the unlikely event that A appears in the string, it is equally likely that it is followed by a B as it is that it is preceded by one.

        In the even more unlikely event that an AB or BA appears in the string, it is equally likely that it is followed by a C as it is that it is preceded by one.

        ...

        Which makes it equally likely that ABCDE and EDCBA appear in the string, so when an A appears in either of these sequences, it is equally likely that it starts the ABCDE sequence as it is that it ends the EDCBA sequence.




        Thus, the answer to the squirrel problem is that




        the second squirrel's sequence (TUNOCOC) appears first







        share|improve this answer




























          up vote
          0
          down vote













          Similar to Marco Salerno's answer, I wrote a computer program to solve the monkey problem.



          static void Coconuts(int iterations)
          {
          Tuple<double, double> outcomes = new Tuple<double, double>[iterations];
          char alphabet = new char { 'C', 'O', 'N', 'U', 'T' };
          string target1 = "COCONUT";
          string target2 = "TUNOCOC";
          for(int i = 0; i < iterations; i++)
          {
          outcomes[i] = new Tuple<double, double>(RandomLettersToTarget(alphabet, target1), RandomLettersToTarget(alphabet, target2));
          Console.WriteLine($"{outcomes[i].Item1:0.0}t{outcomes[i].Item2:0.0}");
          }

          double average1 = outcomes.Select(pair => pair.Item1).Average();
          double average2 = outcomes.Select(pair => pair.Item2).Average();

          Console.WriteLine($"Average durations are {average1:00000.0}, {average2:00000.0}");
          }

          static int RandomLettersToTarget(char alphabet, string target, int stepsToInsanity = 100000000)
          {
          string soFar = "";
          int steps = 0;
          while(steps < stepsToInsanity)
          {
          soFar += alphabet[rng.Next(0, alphabet.Length)];
          if(soFar.Length >= target.Length)
          {
          soFar = soFar.Substring(soFar.Length - target.Length, target.Length);
          }
          if(target.Equals(soFar))
          {
          return steps;
          }
          steps++;
          }
          return stepsToInsanity;
          }


          With the reduced alphabet (containing only 5 letters) this program does complete, and the results over thousands of iterations




          show no clear winner. Thus, I conclude that each monkey has equal probability of winning the first game.







          share|improve this answer























          • The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
            – Laurent LA RIZZA
            Nov 27 at 13:35


















          up vote
          0
          down vote













          (a)




          The fact that two monkeys are typing is irrelevant to the study of the sequence. They're both typing on the same computer (see title), so there is only one sequence of letters. There is no bias, both monkeys have the exact same chance of having their word appear first. Winning, in the clear text, means "having my word come out first".




          We state obviously that, if less than $7$ letters are typed, both monkeys have equally $0$ chance of winning.



          Let's do some recurrence reasoning, trying to prove that if both monkeys have the same chance of winning after $n$ letters are typed, they have the same chance of winning after $n+1$ letters are typed.



          Let $n$ be greater than or equal to $6$. Given there is no winner after the $n$th letter is typed, and both monkeys have had the same chance of winning up until now, (which is trivially established for $n=6$) there are three cases:




          • the letters $n-5$ to $n$ say COCONU and the letter typed is T, monkey 1 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

          • the letters $n-5$ to $n$ say TUNOCO and the letter typed is C, monkey 2 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

          • all other cases, no winner. $26^7 - 2$ out of $26^7$


          And the monkeys have the same chance of winning upon the $n+1$th letter typed. This completes the proof by recurrence.




          So, reading the question correctly, the one expected to type more will be the one that stops last, assuming they type at the same frequency. There are two cases. Either the first word out is COCONUT, in which case, once monkey 1 stops typing, monkey 2 has T as a prefix for his word to come out. Or the first word out is TUNOCOC, in which case, once monkey 2 stops typing, monkey 1 has COC as a prefix for his word to come out. Therefore, monkey 2 is expected to have to type more than monkey 1 in this rare case, and then same as monkey 1 once he spoils his opportunity to finish the word.
          So, yes, monkey 2 is expected to type slightly more, due to these opportunities being unequal.




          (b)




          If "before" means "before in time", we are in the situation described in the clear text. Both squirrels have the same odds, if they stop watching after the first word appears. If they wait for both words to appear, well, this does not change anything, the word appearing first is the word appearing first.




          Whereas




          If "before" means "before in the lexical order", squirrel 1 wins, and the fact that a monkey is typing on a computer is irrelevant. This shows that squirrel 1 has better chances of winning anyway.




          Look, Ma! No Markov chains!






          share|improve this answer






























            15 Answers
            15






            active

            oldest

            votes








            15 Answers
            15






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            45
            down vote



            accepted










            (a) I claim that the expected typing length are the same for both monkeys. I guess something in my argument will be incorrect, as jafe's answer has 9 approvals, but finding that incorrectness would be helpful to me.



            I prove that the probabilty that a monkey ends his typing after exactly $n$ letters is the same for both monkeys, which then implies that the expected typing lengths are the same.



            In order for the typing to end after letter number $n$, the last seven letters must be exactly the monkey's preferred word "COCONUT" / "TUNOCOC" and that word cannot ever have appeared before in the text. That word appearing before can be of two kinds:




            1. It could share some letters with the last 7, or

            2. be completely contained in the first $n-7$ letters.


            It's easy to see that for the first monkey, case 1 can not happen: If the word "COCONUT" shared some of its letters with the last 7 letters, then the last letter "T" would have to appear a second time in "COCONUT" (besides last place), which it doesn't.



            The argument is slightly more complicated for the second monkey, as the last letter ("C") of "TUNOCOC" does appear a second time in it. But "TUNOC" is not the ending part of "TUNOCOC", so also for the second monkey case 1. cannot happen.



            Let's recap: In order for each monkey to end typing after exactly $n$ letters, the text it produced must fullfill two criteria:




            • A) The last seven letters must be exactly the monkey's preferred word
              "COCONUT"/"TUNOCOC", and

            • B) The previous $n-7$ letters cannot contain
              the monkey's preferred word.


            This is an equivalence: Any sequence of $n$ letters that fulfills A and B (for any of the two monkeys) is a sequence that has that monkey stop after exactly $n$ letters, and vice versa.



            Since each letter typed is independent of others, the events that A or B aply to a sequence of $n$ letters (for a given monkey) are also independent, so the respective probabilities can be multiplied. If we use indices $1$ and $2$ to apply for the first and second monkey, we obviuously get



            $$ p_1(A)=p_2(A)=left(frac1{26}right)^7$$



            Calculating $p_{1/2}(B)$ directly is much harder, but they are the same: Both consider sequences of $n-7$ independent, uniformly distributed letters, and if and only if such a sequence does (does not) contain the word "COCONUT", then the reversed sequence (which is again a sequence of $n-7$ independent, uniformly distributed letters) does (does not) contain the word "TUNOCOC".



            So we get



            $$p_1(B) = p_2 (B)$$ and this proves my initial claim: The probability that the typing ends after exactly $n$ letters is $p_1(A)p_1(B)$ for the first monkey and $p_2(A)p_2(B)$ for the second, and they are the same.



            +++++++



            I cannot say why jafe's argument is incorrect, it is very compelling, and as I said initially, it may very well be that it is correct and something I said is incorrect. I just don't see what that may be, in either case.



            ADDITION after more thinking and reading the comments:



            I think the flaw in jafe's argument is in double counting after "COCOCO". If the next letter is "N", this is counted as 'good' for both the ending "COCO" and continuation "NUT" and the intitial "COCO" and continution "CONUT". However, both can only lead to the same end result "COCOCONUT", the additional "CONUT" option is not really additional.






            share|improve this answer



















            • 3




              The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
              – Evargalo
              Nov 23 at 12:30








            • 3




              How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
              – Ingix
              Nov 23 at 12:42








            • 1




              Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
              – Ingix
              Nov 23 at 12:50






            • 1




              I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
              – Evargalo
              Nov 23 at 13:49






            • 1




              I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
              – Eric Duminil
              Nov 23 at 14:11















            up vote
            45
            down vote



            accepted










            (a) I claim that the expected typing length are the same for both monkeys. I guess something in my argument will be incorrect, as jafe's answer has 9 approvals, but finding that incorrectness would be helpful to me.



            I prove that the probabilty that a monkey ends his typing after exactly $n$ letters is the same for both monkeys, which then implies that the expected typing lengths are the same.



            In order for the typing to end after letter number $n$, the last seven letters must be exactly the monkey's preferred word "COCONUT" / "TUNOCOC" and that word cannot ever have appeared before in the text. That word appearing before can be of two kinds:




            1. It could share some letters with the last 7, or

            2. be completely contained in the first $n-7$ letters.


            It's easy to see that for the first monkey, case 1 can not happen: If the word "COCONUT" shared some of its letters with the last 7 letters, then the last letter "T" would have to appear a second time in "COCONUT" (besides last place), which it doesn't.



            The argument is slightly more complicated for the second monkey, as the last letter ("C") of "TUNOCOC" does appear a second time in it. But "TUNOC" is not the ending part of "TUNOCOC", so also for the second monkey case 1. cannot happen.



            Let's recap: In order for each monkey to end typing after exactly $n$ letters, the text it produced must fullfill two criteria:




            • A) The last seven letters must be exactly the monkey's preferred word
              "COCONUT"/"TUNOCOC", and

            • B) The previous $n-7$ letters cannot contain
              the monkey's preferred word.


            This is an equivalence: Any sequence of $n$ letters that fulfills A and B (for any of the two monkeys) is a sequence that has that monkey stop after exactly $n$ letters, and vice versa.



            Since each letter typed is independent of others, the events that A or B aply to a sequence of $n$ letters (for a given monkey) are also independent, so the respective probabilities can be multiplied. If we use indices $1$ and $2$ to apply for the first and second monkey, we obviuously get



            $$ p_1(A)=p_2(A)=left(frac1{26}right)^7$$



            Calculating $p_{1/2}(B)$ directly is much harder, but they are the same: Both consider sequences of $n-7$ independent, uniformly distributed letters, and if and only if such a sequence does (does not) contain the word "COCONUT", then the reversed sequence (which is again a sequence of $n-7$ independent, uniformly distributed letters) does (does not) contain the word "TUNOCOC".



            So we get



            $$p_1(B) = p_2 (B)$$ and this proves my initial claim: The probability that the typing ends after exactly $n$ letters is $p_1(A)p_1(B)$ for the first monkey and $p_2(A)p_2(B)$ for the second, and they are the same.



            +++++++



            I cannot say why jafe's argument is incorrect, it is very compelling, and as I said initially, it may very well be that it is correct and something I said is incorrect. I just don't see what that may be, in either case.



            ADDITION after more thinking and reading the comments:



            I think the flaw in jafe's argument is in double counting after "COCOCO". If the next letter is "N", this is counted as 'good' for both the ending "COCO" and continuation "NUT" and the intitial "COCO" and continution "CONUT". However, both can only lead to the same end result "COCOCONUT", the additional "CONUT" option is not really additional.






            share|improve this answer



















            • 3




              The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
              – Evargalo
              Nov 23 at 12:30








            • 3




              How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
              – Ingix
              Nov 23 at 12:42








            • 1




              Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
              – Ingix
              Nov 23 at 12:50






            • 1




              I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
              – Evargalo
              Nov 23 at 13:49






            • 1




              I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
              – Eric Duminil
              Nov 23 at 14:11













            up vote
            45
            down vote



            accepted







            up vote
            45
            down vote



            accepted






            (a) I claim that the expected typing length are the same for both monkeys. I guess something in my argument will be incorrect, as jafe's answer has 9 approvals, but finding that incorrectness would be helpful to me.



            I prove that the probabilty that a monkey ends his typing after exactly $n$ letters is the same for both monkeys, which then implies that the expected typing lengths are the same.



            In order for the typing to end after letter number $n$, the last seven letters must be exactly the monkey's preferred word "COCONUT" / "TUNOCOC" and that word cannot ever have appeared before in the text. That word appearing before can be of two kinds:




            1. It could share some letters with the last 7, or

            2. be completely contained in the first $n-7$ letters.


            It's easy to see that for the first monkey, case 1 can not happen: If the word "COCONUT" shared some of its letters with the last 7 letters, then the last letter "T" would have to appear a second time in "COCONUT" (besides last place), which it doesn't.



            The argument is slightly more complicated for the second monkey, as the last letter ("C") of "TUNOCOC" does appear a second time in it. But "TUNOC" is not the ending part of "TUNOCOC", so also for the second monkey case 1. cannot happen.



            Let's recap: In order for each monkey to end typing after exactly $n$ letters, the text it produced must fullfill two criteria:




            • A) The last seven letters must be exactly the monkey's preferred word
              "COCONUT"/"TUNOCOC", and

            • B) The previous $n-7$ letters cannot contain
              the monkey's preferred word.


            This is an equivalence: Any sequence of $n$ letters that fulfills A and B (for any of the two monkeys) is a sequence that has that monkey stop after exactly $n$ letters, and vice versa.



            Since each letter typed is independent of others, the events that A or B aply to a sequence of $n$ letters (for a given monkey) are also independent, so the respective probabilities can be multiplied. If we use indices $1$ and $2$ to apply for the first and second monkey, we obviuously get



            $$ p_1(A)=p_2(A)=left(frac1{26}right)^7$$



            Calculating $p_{1/2}(B)$ directly is much harder, but they are the same: Both consider sequences of $n-7$ independent, uniformly distributed letters, and if and only if such a sequence does (does not) contain the word "COCONUT", then the reversed sequence (which is again a sequence of $n-7$ independent, uniformly distributed letters) does (does not) contain the word "TUNOCOC".



            So we get



            $$p_1(B) = p_2 (B)$$ and this proves my initial claim: The probability that the typing ends after exactly $n$ letters is $p_1(A)p_1(B)$ for the first monkey and $p_2(A)p_2(B)$ for the second, and they are the same.



            +++++++



            I cannot say why jafe's argument is incorrect, it is very compelling, and as I said initially, it may very well be that it is correct and something I said is incorrect. I just don't see what that may be, in either case.



            ADDITION after more thinking and reading the comments:



            I think the flaw in jafe's argument is in double counting after "COCOCO". If the next letter is "N", this is counted as 'good' for both the ending "COCO" and continuation "NUT" and the intitial "COCO" and continution "CONUT". However, both can only lead to the same end result "COCOCONUT", the additional "CONUT" option is not really additional.






            share|improve this answer














            (a) I claim that the expected typing length are the same for both monkeys. I guess something in my argument will be incorrect, as jafe's answer has 9 approvals, but finding that incorrectness would be helpful to me.



            I prove that the probabilty that a monkey ends his typing after exactly $n$ letters is the same for both monkeys, which then implies that the expected typing lengths are the same.



            In order for the typing to end after letter number $n$, the last seven letters must be exactly the monkey's preferred word "COCONUT" / "TUNOCOC" and that word cannot ever have appeared before in the text. That word appearing before can be of two kinds:




            1. It could share some letters with the last 7, or

            2. be completely contained in the first $n-7$ letters.


            It's easy to see that for the first monkey, case 1 can not happen: If the word "COCONUT" shared some of its letters with the last 7 letters, then the last letter "T" would have to appear a second time in "COCONUT" (besides last place), which it doesn't.



            The argument is slightly more complicated for the second monkey, as the last letter ("C") of "TUNOCOC" does appear a second time in it. But "TUNOC" is not the ending part of "TUNOCOC", so also for the second monkey case 1. cannot happen.



            Let's recap: In order for each monkey to end typing after exactly $n$ letters, the text it produced must fullfill two criteria:




            • A) The last seven letters must be exactly the monkey's preferred word
              "COCONUT"/"TUNOCOC", and

            • B) The previous $n-7$ letters cannot contain
              the monkey's preferred word.


            This is an equivalence: Any sequence of $n$ letters that fulfills A and B (for any of the two monkeys) is a sequence that has that monkey stop after exactly $n$ letters, and vice versa.



            Since each letter typed is independent of others, the events that A or B aply to a sequence of $n$ letters (for a given monkey) are also independent, so the respective probabilities can be multiplied. If we use indices $1$ and $2$ to apply for the first and second monkey, we obviuously get



            $$ p_1(A)=p_2(A)=left(frac1{26}right)^7$$



            Calculating $p_{1/2}(B)$ directly is much harder, but they are the same: Both consider sequences of $n-7$ independent, uniformly distributed letters, and if and only if such a sequence does (does not) contain the word "COCONUT", then the reversed sequence (which is again a sequence of $n-7$ independent, uniformly distributed letters) does (does not) contain the word "TUNOCOC".



            So we get



            $$p_1(B) = p_2 (B)$$ and this proves my initial claim: The probability that the typing ends after exactly $n$ letters is $p_1(A)p_1(B)$ for the first monkey and $p_2(A)p_2(B)$ for the second, and they are the same.



            +++++++



            I cannot say why jafe's argument is incorrect, it is very compelling, and as I said initially, it may very well be that it is correct and something I said is incorrect. I just don't see what that may be, in either case.



            ADDITION after more thinking and reading the comments:



            I think the flaw in jafe's argument is in double counting after "COCOCO". If the next letter is "N", this is counted as 'good' for both the ending "COCO" and continuation "NUT" and the intitial "COCO" and continution "CONUT". However, both can only lead to the same end result "COCOCONUT", the additional "CONUT" option is not really additional.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 23 at 14:39

























            answered Nov 23 at 12:10









            Ingix

            56634




            56634








            • 3




              The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
              – Evargalo
              Nov 23 at 12:30








            • 3




              How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
              – Ingix
              Nov 23 at 12:42








            • 1




              Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
              – Ingix
              Nov 23 at 12:50






            • 1




              I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
              – Evargalo
              Nov 23 at 13:49






            • 1




              I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
              – Eric Duminil
              Nov 23 at 14:11














            • 3




              The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
              – Evargalo
              Nov 23 at 12:30








            • 3




              How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
              – Ingix
              Nov 23 at 12:42








            • 1




              Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
              – Ingix
              Nov 23 at 12:50






            • 1




              I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
              – Evargalo
              Nov 23 at 13:49






            • 1




              I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
              – Eric Duminil
              Nov 23 at 14:11








            3




            3




            The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
            – Evargalo
            Nov 23 at 12:30






            The caveat is that you show that the proba for each monkey to end after $n$ letters is the same assuming that they both type at least $n$ letters. But the monkey A has a higher probability of having finished beforehand.
            – Evargalo
            Nov 23 at 12:30






            3




            3




            How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
            – Ingix
            Nov 23 at 12:42






            How can that be? If the first monkey has a higher probabilty to finish beforehand than monkey 2, it has to have a higher probabilty to have finished after exactly $k$ letters for some $k < n$, and then my argument applies again. In addition, I do not assume that they have typed at least $n$ letters, I have given exact condtions for that to be true and shown that they have equal probability.
            – Ingix
            Nov 23 at 12:42






            1




            1




            Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
            – Ingix
            Nov 23 at 12:50




            Again, if you think that the first monkey has a higher probabilty to finish beforehand, which do you think is the first letter index at which the first monkey has a higher probability to have finished?
            – Ingix
            Nov 23 at 12:50




            1




            1




            I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
            – Evargalo
            Nov 23 at 13:49




            I am now convinced that this is the correct answer. Very confusing problem, but the flaw is in the other reasoning (see my comment down Viktor Mellgren's answer)
            – Evargalo
            Nov 23 at 13:49




            1




            1




            I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
            – Eric Duminil
            Nov 23 at 14:11




            I wrote a Python script with a smaller alphabet ('ABCNOTU'). Both monkeys seem to need around 800000 letters before writing the desired word. I repeated the experiment 300 times and couldn't find any bias toward monkey 1 or 2.
            – Eric Duminil
            Nov 23 at 14:11










            up vote
            30
            down vote













            (a)

            Edit: This is incorrect, see comments




            The TUNOCOC monkey is expected to type more.


            Regardless of the previous letters, the TUNOCOC monkey has 1/26 chance of getting the next letter right. Clearly the COCONUT monkey has at least this, but turns out it does even better: Once it has typed COCO, it has two ways to get the correct result (NUT or CONUT).







            share|improve this answer



















            • 4




              Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
              – Ingix
              Nov 23 at 12:12






            • 3




              I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
              – Pieter B
              Nov 23 at 13:00






            • 1




              PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
              – elias
              Nov 23 at 13:52






            • 2




              This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
              – Eric Duminil
              Nov 23 at 14:12






            • 2




              I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
              – jafe
              Nov 23 at 14:12

















            up vote
            30
            down vote













            (a)

            Edit: This is incorrect, see comments




            The TUNOCOC monkey is expected to type more.


            Regardless of the previous letters, the TUNOCOC monkey has 1/26 chance of getting the next letter right. Clearly the COCONUT monkey has at least this, but turns out it does even better: Once it has typed COCO, it has two ways to get the correct result (NUT or CONUT).







            share|improve this answer



















            • 4




              Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
              – Ingix
              Nov 23 at 12:12






            • 3




              I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
              – Pieter B
              Nov 23 at 13:00






            • 1




              PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
              – elias
              Nov 23 at 13:52






            • 2




              This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
              – Eric Duminil
              Nov 23 at 14:12






            • 2




              I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
              – jafe
              Nov 23 at 14:12















            up vote
            30
            down vote










            up vote
            30
            down vote









            (a)

            Edit: This is incorrect, see comments




            The TUNOCOC monkey is expected to type more.


            Regardless of the previous letters, the TUNOCOC monkey has 1/26 chance of getting the next letter right. Clearly the COCONUT monkey has at least this, but turns out it does even better: Once it has typed COCO, it has two ways to get the correct result (NUT or CONUT).







            share|improve this answer














            (a)

            Edit: This is incorrect, see comments




            The TUNOCOC monkey is expected to type more.


            Regardless of the previous letters, the TUNOCOC monkey has 1/26 chance of getting the next letter right. Clearly the COCONUT monkey has at least this, but turns out it does even better: Once it has typed COCO, it has two ways to get the correct result (NUT or CONUT).








            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 25 at 16:23

























            answered Nov 23 at 7:56









            jafe

            15.2k37150




            15.2k37150








            • 4




              Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
              – Ingix
              Nov 23 at 12:12






            • 3




              I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
              – Pieter B
              Nov 23 at 13:00






            • 1




              PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
              – elias
              Nov 23 at 13:52






            • 2




              This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
              – Eric Duminil
              Nov 23 at 14:12






            • 2




              I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
              – jafe
              Nov 23 at 14:12
















            • 4




              Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
              – Ingix
              Nov 23 at 12:12






            • 3




              I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
              – Pieter B
              Nov 23 at 13:00






            • 1




              PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
              – elias
              Nov 23 at 13:52






            • 2




              This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
              – Eric Duminil
              Nov 23 at 14:12






            • 2




              I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
              – jafe
              Nov 23 at 14:12










            4




            4




            Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
            – Ingix
            Nov 23 at 12:12




            Sounds very compelling, still I think it is wrong, see my solution below. Would be very interested in you (and everybody else, of course) to check my approach.
            – Ingix
            Nov 23 at 12:12




            3




            3




            I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
            – Pieter B
            Nov 23 at 13:00




            I reduced the problem to the monkeys typing FFA or AFF which I think is equivalent. Then I made a computer simulation where I had them imput random characters and select a winner. The simulation does NOT support your answer. They both were winner the same amount of times.
            – Pieter B
            Nov 23 at 13:00




            1




            1




            PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
            – elias
            Nov 23 at 13:52




            PieterB & Evargalo, what is the characterset of the typewriter in the FFA example? I think it needs a third 'wrong' character.
            – elias
            Nov 23 at 13:52




            2




            2




            This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
            – Eric Duminil
            Nov 23 at 14:12




            This answer indeed appears to be wrong but I don't know why. ¯_(ツ)_/¯
            – Eric Duminil
            Nov 23 at 14:12




            2




            2




            I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
            – jafe
            Nov 23 at 14:12






            I listed out all combinations of C,O,U,N,T for up to 11 letters. Indeed both COCONUT and TUNOCOC show up the same number of times for each letter count. D'oh!
            – jafe
            Nov 23 at 14:12












            up vote
            23
            down vote













            Monkey problem



            To settle down which monkey is faster on average, I'll use Markov chains and Mathematica. Define a state $i$, for $i = 0..6$, as that the monkey 1 has currently written $i$ correct subsequent characters of COCONUT, but has never written all of COCONUT. Define a state 7 as the monkey 1 having written COCONUT at least once. Since state 7 cannot be escaped, we say that it is absorptive (other states are transient). The transition matrix for monkey 1 is:





            P1 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {25, 0, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 0, 0, 1, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            Here $P1_{ij}$ is the probability that the monkey 1 transitions from state $i$ to state $j$ on pushing a key. The Markov chain can be visualized as a graph:



            p1 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P1];
            Graph[Table[StringTake["COCONUT", x], {x, 0, 7}], p1,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 1



            Next up, we extract the sub-matrix which does not contain the absorptive state:



            Q1 = P1[[1 ;; 7, 1 ;; 7]];


            As described here, the expected time for the monkey 1 to write COCONUT (expected time to absorption) starting from state $i$ is given by



            N1 = Inverse[IdentityMatrix[7] - Q1]
            t1 = N1.{1, 1, 1, 1, 1, 1, 1};


            The exact solution is given by:



            -      8031810176
            C 8031810150
            CO 8031809500
            COC 8031792574
            COCO 8031352524
            COCON 8019928800
            COCONU 7722894400


            The transition matrix for monkey 2 is given by:



            P2 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {24, 1, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 1, 0, 0, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            The Markov chain for monkey 2 visualized as a graph:



            p2 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P2];
            Graph[Table[StringTake["TUNOCOC", x], {x, 0, 7}], p2,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 2



            By following the same procedure as for monkey 1, we solve $t2$ as:



            -      8031810176
            T 8031810150
            TU 8031809500
            TUN 8031792600
            TUNO 8031353200
            TUNOC 8019928800
            TUNOCO 7722894400


            Because the monkeys start from the situation when nothing has been written yet, we see that the expected time for them to write their word is the same $C = 8031810176$



            This result can also be obtained directly in Mathematica:



            d1 = FirstPassageTimeDistribution[p1, 8];
            Mean[d1]


            In fact, we can compute the characteristic function for the absorption-time-distribution:



            CharacteristicFunction[d1, x]


            which evaluates to



            e^(7ix) / (C - C e^(ix) + e^(7ix))


            The distributions for both monkeys have this same characteristic function. Wikipedia claims that cumulative distribution functions and characteristic functions are in one-to-one correspondence. This implies that the monkeys' finishing-time-distributions are the same.



            Squirrel problem



            The transition matrix for a monkey aiming for both words COCONUT and TUNOCOC is:



            P3 = {
            {24, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26}} / 26


            Here the states have been numbered in order - C CO COC COCO COCON COCONU T TU TUN TUNO TUNOC TUNOCO COCONUT TUNOCOC. Note that the absorbing states come last. The same reference as above shows that we can compute the probability $B3_{ij}$ of ending from a transient state $i$ to an absorption state $j + 13$ as follows:



            Q3 = P3[[1 ;; 13, 1 ;; 13]];
            R3 = P3[[1 ;; 13, 14 ;; 15]];
            N3 = Inverse[IdentityMatrix[13] - Q3]
            B3 = N3.R3


            The matrix $B3 * 617830874$ is



                    COCONUT   TUNOCOC
            - 308915099 308915775
            C 308915100 308915774
            CO 308915125 308915749
            COC 308915776 308915098
            COCO 308932701 308898173
            COCON 309372075 308458799
            COCONU 320796475 297034399
            T 308915098 308915776
            TU 308915073 308915801
            TUN 308914423 308916451
            TUNO 308897523 308933351
            TUNOC 308458124 309372750
            TUNOCO 297033749 320797125


            Since the monkey starts from nothing, the probability for writing COCONUT and TUNOCOC is 308915099 and 308915775, respectively, divided by 617830874. These correspond to 0.4999994529 and 0.5000005471, computed with Mathematica accurate to 10 decimal places. Therefore the monkey is more probable to write TUNOCOC than COCONUT.



            Code



            Here is the Python-code I used to generate the transition matrices.





            def computeTransitionMatrix(words, keys):
            def properPrefixes(word):
            return (word[:i] for i in range(len(word)))

            def suffixesInDecreasingLength(word):
            return (word[i:] for i in range(len(word) + 1))

            prefixToState = {}
            stateToPrefix =

            def addState(prefix):
            if prefix in prefixToState:
            return
            prefixToState[prefix] = len(stateToPrefix)
            stateToPrefix.append(prefix)

            # Create a state for each proper prefix of the word.
            for word in words:
            for prefix in properPrefixes(word):
            addState(prefix)

            # Create a state for each word last.
            for word in words:
            addState(word)

            print(stateToPrefix)

            # Number of states.
            nStates = len(stateToPrefix)

            # Compute the (scaled) transition probabilities.
            transitions =
            for i in range(nStates):
            row = [0] * nStates
            transitions.append(row)

            prefix = stateToPrefix[i]
            if prefix in words:
            # The word is an absorptive state.
            row[i] = len(keys)
            continue
            for key in keys:
            nextPrefix = prefix + key
            # Find the longest suffix which
            # is a prefix of the word.
            for suffix in suffixesInDecreasingLength(nextPrefix):
            j = prefixToState.get(suffix)
            if j != None:
            row[j] += 1
            break

            return transitions

            # The words the monkey is supposed to write.
            words = ['COCONUT', 'TUNOCOC']

            # Keys available on monkey's keyboard.
            keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

            transitions = computeTransitionMatrix(words, keys)
            print(repr(transitions).replace('[', '{').replace(']','}').replace('}, ', '},n'))





            share|improve this answer



















            • 3




              Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
              – Ingix
              Nov 24 at 19:18






            • 2




              It means the expected time is simply 26**7. How anticlimactic! :)
              – Eric Duminil
              Nov 25 at 9:09










            • Thanks. This is a great answer for learning some cool tricks in Mathematica.
              – Erel Segal-Halevi
              Nov 26 at 8:26










            • This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
              – abl
              Nov 27 at 9:30















            up vote
            23
            down vote













            Monkey problem



            To settle down which monkey is faster on average, I'll use Markov chains and Mathematica. Define a state $i$, for $i = 0..6$, as that the monkey 1 has currently written $i$ correct subsequent characters of COCONUT, but has never written all of COCONUT. Define a state 7 as the monkey 1 having written COCONUT at least once. Since state 7 cannot be escaped, we say that it is absorptive (other states are transient). The transition matrix for monkey 1 is:





            P1 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {25, 0, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 0, 0, 1, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            Here $P1_{ij}$ is the probability that the monkey 1 transitions from state $i$ to state $j$ on pushing a key. The Markov chain can be visualized as a graph:



            p1 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P1];
            Graph[Table[StringTake["COCONUT", x], {x, 0, 7}], p1,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 1



            Next up, we extract the sub-matrix which does not contain the absorptive state:



            Q1 = P1[[1 ;; 7, 1 ;; 7]];


            As described here, the expected time for the monkey 1 to write COCONUT (expected time to absorption) starting from state $i$ is given by



            N1 = Inverse[IdentityMatrix[7] - Q1]
            t1 = N1.{1, 1, 1, 1, 1, 1, 1};


            The exact solution is given by:



            -      8031810176
            C 8031810150
            CO 8031809500
            COC 8031792574
            COCO 8031352524
            COCON 8019928800
            COCONU 7722894400


            The transition matrix for monkey 2 is given by:



            P2 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {24, 1, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 1, 0, 0, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            The Markov chain for monkey 2 visualized as a graph:



            p2 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P2];
            Graph[Table[StringTake["TUNOCOC", x], {x, 0, 7}], p2,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 2



            By following the same procedure as for monkey 1, we solve $t2$ as:



            -      8031810176
            T 8031810150
            TU 8031809500
            TUN 8031792600
            TUNO 8031353200
            TUNOC 8019928800
            TUNOCO 7722894400


            Because the monkeys start from the situation when nothing has been written yet, we see that the expected time for them to write their word is the same $C = 8031810176$



            This result can also be obtained directly in Mathematica:



            d1 = FirstPassageTimeDistribution[p1, 8];
            Mean[d1]


            In fact, we can compute the characteristic function for the absorption-time-distribution:



            CharacteristicFunction[d1, x]


            which evaluates to



            e^(7ix) / (C - C e^(ix) + e^(7ix))


            The distributions for both monkeys have this same characteristic function. Wikipedia claims that cumulative distribution functions and characteristic functions are in one-to-one correspondence. This implies that the monkeys' finishing-time-distributions are the same.



            Squirrel problem



            The transition matrix for a monkey aiming for both words COCONUT and TUNOCOC is:



            P3 = {
            {24, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26}} / 26


            Here the states have been numbered in order - C CO COC COCO COCON COCONU T TU TUN TUNO TUNOC TUNOCO COCONUT TUNOCOC. Note that the absorbing states come last. The same reference as above shows that we can compute the probability $B3_{ij}$ of ending from a transient state $i$ to an absorption state $j + 13$ as follows:



            Q3 = P3[[1 ;; 13, 1 ;; 13]];
            R3 = P3[[1 ;; 13, 14 ;; 15]];
            N3 = Inverse[IdentityMatrix[13] - Q3]
            B3 = N3.R3


            The matrix $B3 * 617830874$ is



                    COCONUT   TUNOCOC
            - 308915099 308915775
            C 308915100 308915774
            CO 308915125 308915749
            COC 308915776 308915098
            COCO 308932701 308898173
            COCON 309372075 308458799
            COCONU 320796475 297034399
            T 308915098 308915776
            TU 308915073 308915801
            TUN 308914423 308916451
            TUNO 308897523 308933351
            TUNOC 308458124 309372750
            TUNOCO 297033749 320797125


            Since the monkey starts from nothing, the probability for writing COCONUT and TUNOCOC is 308915099 and 308915775, respectively, divided by 617830874. These correspond to 0.4999994529 and 0.5000005471, computed with Mathematica accurate to 10 decimal places. Therefore the monkey is more probable to write TUNOCOC than COCONUT.



            Code



            Here is the Python-code I used to generate the transition matrices.





            def computeTransitionMatrix(words, keys):
            def properPrefixes(word):
            return (word[:i] for i in range(len(word)))

            def suffixesInDecreasingLength(word):
            return (word[i:] for i in range(len(word) + 1))

            prefixToState = {}
            stateToPrefix =

            def addState(prefix):
            if prefix in prefixToState:
            return
            prefixToState[prefix] = len(stateToPrefix)
            stateToPrefix.append(prefix)

            # Create a state for each proper prefix of the word.
            for word in words:
            for prefix in properPrefixes(word):
            addState(prefix)

            # Create a state for each word last.
            for word in words:
            addState(word)

            print(stateToPrefix)

            # Number of states.
            nStates = len(stateToPrefix)

            # Compute the (scaled) transition probabilities.
            transitions =
            for i in range(nStates):
            row = [0] * nStates
            transitions.append(row)

            prefix = stateToPrefix[i]
            if prefix in words:
            # The word is an absorptive state.
            row[i] = len(keys)
            continue
            for key in keys:
            nextPrefix = prefix + key
            # Find the longest suffix which
            # is a prefix of the word.
            for suffix in suffixesInDecreasingLength(nextPrefix):
            j = prefixToState.get(suffix)
            if j != None:
            row[j] += 1
            break

            return transitions

            # The words the monkey is supposed to write.
            words = ['COCONUT', 'TUNOCOC']

            # Keys available on monkey's keyboard.
            keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

            transitions = computeTransitionMatrix(words, keys)
            print(repr(transitions).replace('[', '{').replace(']','}').replace('}, ', '},n'))





            share|improve this answer



















            • 3




              Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
              – Ingix
              Nov 24 at 19:18






            • 2




              It means the expected time is simply 26**7. How anticlimactic! :)
              – Eric Duminil
              Nov 25 at 9:09










            • Thanks. This is a great answer for learning some cool tricks in Mathematica.
              – Erel Segal-Halevi
              Nov 26 at 8:26










            • This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
              – abl
              Nov 27 at 9:30













            up vote
            23
            down vote










            up vote
            23
            down vote









            Monkey problem



            To settle down which monkey is faster on average, I'll use Markov chains and Mathematica. Define a state $i$, for $i = 0..6$, as that the monkey 1 has currently written $i$ correct subsequent characters of COCONUT, but has never written all of COCONUT. Define a state 7 as the monkey 1 having written COCONUT at least once. Since state 7 cannot be escaped, we say that it is absorptive (other states are transient). The transition matrix for monkey 1 is:





            P1 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {25, 0, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 0, 0, 1, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            Here $P1_{ij}$ is the probability that the monkey 1 transitions from state $i$ to state $j$ on pushing a key. The Markov chain can be visualized as a graph:



            p1 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P1];
            Graph[Table[StringTake["COCONUT", x], {x, 0, 7}], p1,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 1



            Next up, we extract the sub-matrix which does not contain the absorptive state:



            Q1 = P1[[1 ;; 7, 1 ;; 7]];


            As described here, the expected time for the monkey 1 to write COCONUT (expected time to absorption) starting from state $i$ is given by



            N1 = Inverse[IdentityMatrix[7] - Q1]
            t1 = N1.{1, 1, 1, 1, 1, 1, 1};


            The exact solution is given by:



            -      8031810176
            C 8031810150
            CO 8031809500
            COC 8031792574
            COCO 8031352524
            COCON 8019928800
            COCONU 7722894400


            The transition matrix for monkey 2 is given by:



            P2 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {24, 1, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 1, 0, 0, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            The Markov chain for monkey 2 visualized as a graph:



            p2 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P2];
            Graph[Table[StringTake["TUNOCOC", x], {x, 0, 7}], p2,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 2



            By following the same procedure as for monkey 1, we solve $t2$ as:



            -      8031810176
            T 8031810150
            TU 8031809500
            TUN 8031792600
            TUNO 8031353200
            TUNOC 8019928800
            TUNOCO 7722894400


            Because the monkeys start from the situation when nothing has been written yet, we see that the expected time for them to write their word is the same $C = 8031810176$



            This result can also be obtained directly in Mathematica:



            d1 = FirstPassageTimeDistribution[p1, 8];
            Mean[d1]


            In fact, we can compute the characteristic function for the absorption-time-distribution:



            CharacteristicFunction[d1, x]


            which evaluates to



            e^(7ix) / (C - C e^(ix) + e^(7ix))


            The distributions for both monkeys have this same characteristic function. Wikipedia claims that cumulative distribution functions and characteristic functions are in one-to-one correspondence. This implies that the monkeys' finishing-time-distributions are the same.



            Squirrel problem



            The transition matrix for a monkey aiming for both words COCONUT and TUNOCOC is:



            P3 = {
            {24, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26}} / 26


            Here the states have been numbered in order - C CO COC COCO COCON COCONU T TU TUN TUNO TUNOC TUNOCO COCONUT TUNOCOC. Note that the absorbing states come last. The same reference as above shows that we can compute the probability $B3_{ij}$ of ending from a transient state $i$ to an absorption state $j + 13$ as follows:



            Q3 = P3[[1 ;; 13, 1 ;; 13]];
            R3 = P3[[1 ;; 13, 14 ;; 15]];
            N3 = Inverse[IdentityMatrix[13] - Q3]
            B3 = N3.R3


            The matrix $B3 * 617830874$ is



                    COCONUT   TUNOCOC
            - 308915099 308915775
            C 308915100 308915774
            CO 308915125 308915749
            COC 308915776 308915098
            COCO 308932701 308898173
            COCON 309372075 308458799
            COCONU 320796475 297034399
            T 308915098 308915776
            TU 308915073 308915801
            TUN 308914423 308916451
            TUNO 308897523 308933351
            TUNOC 308458124 309372750
            TUNOCO 297033749 320797125


            Since the monkey starts from nothing, the probability for writing COCONUT and TUNOCOC is 308915099 and 308915775, respectively, divided by 617830874. These correspond to 0.4999994529 and 0.5000005471, computed with Mathematica accurate to 10 decimal places. Therefore the monkey is more probable to write TUNOCOC than COCONUT.



            Code



            Here is the Python-code I used to generate the transition matrices.





            def computeTransitionMatrix(words, keys):
            def properPrefixes(word):
            return (word[:i] for i in range(len(word)))

            def suffixesInDecreasingLength(word):
            return (word[i:] for i in range(len(word) + 1))

            prefixToState = {}
            stateToPrefix =

            def addState(prefix):
            if prefix in prefixToState:
            return
            prefixToState[prefix] = len(stateToPrefix)
            stateToPrefix.append(prefix)

            # Create a state for each proper prefix of the word.
            for word in words:
            for prefix in properPrefixes(word):
            addState(prefix)

            # Create a state for each word last.
            for word in words:
            addState(word)

            print(stateToPrefix)

            # Number of states.
            nStates = len(stateToPrefix)

            # Compute the (scaled) transition probabilities.
            transitions =
            for i in range(nStates):
            row = [0] * nStates
            transitions.append(row)

            prefix = stateToPrefix[i]
            if prefix in words:
            # The word is an absorptive state.
            row[i] = len(keys)
            continue
            for key in keys:
            nextPrefix = prefix + key
            # Find the longest suffix which
            # is a prefix of the word.
            for suffix in suffixesInDecreasingLength(nextPrefix):
            j = prefixToState.get(suffix)
            if j != None:
            row[j] += 1
            break

            return transitions

            # The words the monkey is supposed to write.
            words = ['COCONUT', 'TUNOCOC']

            # Keys available on monkey's keyboard.
            keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

            transitions = computeTransitionMatrix(words, keys)
            print(repr(transitions).replace('[', '{').replace(']','}').replace('}, ', '},n'))





            share|improve this answer














            Monkey problem



            To settle down which monkey is faster on average, I'll use Markov chains and Mathematica. Define a state $i$, for $i = 0..6$, as that the monkey 1 has currently written $i$ correct subsequent characters of COCONUT, but has never written all of COCONUT. Define a state 7 as the monkey 1 having written COCONUT at least once. Since state 7 cannot be escaped, we say that it is absorptive (other states are transient). The transition matrix for monkey 1 is:





            P1 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {25, 0, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 0, 0, 1, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            Here $P1_{ij}$ is the probability that the monkey 1 transitions from state $i$ to state $j$ on pushing a key. The Markov chain can be visualized as a graph:



            p1 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P1];
            Graph[Table[StringTake["COCONUT", x], {x, 0, 7}], p1,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 1



            Next up, we extract the sub-matrix which does not contain the absorptive state:



            Q1 = P1[[1 ;; 7, 1 ;; 7]];


            As described here, the expected time for the monkey 1 to write COCONUT (expected time to absorption) starting from state $i$ is given by



            N1 = Inverse[IdentityMatrix[7] - Q1]
            t1 = N1.{1, 1, 1, 1, 1, 1, 1};


            The exact solution is given by:



            -      8031810176
            C 8031810150
            CO 8031809500
            COC 8031792574
            COCO 8031352524
            COCON 8019928800
            COCONU 7722894400


            The transition matrix for monkey 2 is given by:



            P2 = {
            {25, 1, 0, 0, 0, 0, 0, 0},
            {24, 1, 1, 0, 0, 0, 0, 0},
            {24, 1, 0, 1, 0, 0, 0, 0},
            {24, 1, 0, 0, 1, 0, 0, 0},
            {24, 1, 0, 0, 0, 1, 0, 0},
            {24, 1, 0, 0, 0, 0, 1, 0},
            {24, 1, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 26}
            } / 26;


            The Markov chain for monkey 2 visualized as a graph:



            p2 = DiscreteMarkovProcess[{1, 0, 0, 0, 0, 0, 0, 0}, P2];
            Graph[Table[StringTake["TUNOCOC", x], {x, 0, 7}], p2,
            VertexSize -> 0.6, GraphLayout -> {VertexLayout -> "CircularEmbedding" }]


            Markov chain for monkey 2



            By following the same procedure as for monkey 1, we solve $t2$ as:



            -      8031810176
            T 8031810150
            TU 8031809500
            TUN 8031792600
            TUNO 8031353200
            TUNOC 8019928800
            TUNOCO 7722894400


            Because the monkeys start from the situation when nothing has been written yet, we see that the expected time for them to write their word is the same $C = 8031810176$



            This result can also be obtained directly in Mathematica:



            d1 = FirstPassageTimeDistribution[p1, 8];
            Mean[d1]


            In fact, we can compute the characteristic function for the absorption-time-distribution:



            CharacteristicFunction[d1, x]


            which evaluates to



            e^(7ix) / (C - C e^(ix) + e^(7ix))


            The distributions for both monkeys have this same characteristic function. Wikipedia claims that cumulative distribution functions and characteristic functions are in one-to-one correspondence. This implies that the monkeys' finishing-time-distributions are the same.



            Squirrel problem



            The transition matrix for a monkey aiming for both words COCONUT and TUNOCOC is:



            P3 = {
            {24, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
            {24, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0},
            {23, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0},
            {24, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26, 0},
            {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 26}} / 26


            Here the states have been numbered in order - C CO COC COCO COCON COCONU T TU TUN TUNO TUNOC TUNOCO COCONUT TUNOCOC. Note that the absorbing states come last. The same reference as above shows that we can compute the probability $B3_{ij}$ of ending from a transient state $i$ to an absorption state $j + 13$ as follows:



            Q3 = P3[[1 ;; 13, 1 ;; 13]];
            R3 = P3[[1 ;; 13, 14 ;; 15]];
            N3 = Inverse[IdentityMatrix[13] - Q3]
            B3 = N3.R3


            The matrix $B3 * 617830874$ is



                    COCONUT   TUNOCOC
            - 308915099 308915775
            C 308915100 308915774
            CO 308915125 308915749
            COC 308915776 308915098
            COCO 308932701 308898173
            COCON 309372075 308458799
            COCONU 320796475 297034399
            T 308915098 308915776
            TU 308915073 308915801
            TUN 308914423 308916451
            TUNO 308897523 308933351
            TUNOC 308458124 309372750
            TUNOCO 297033749 320797125


            Since the monkey starts from nothing, the probability for writing COCONUT and TUNOCOC is 308915099 and 308915775, respectively, divided by 617830874. These correspond to 0.4999994529 and 0.5000005471, computed with Mathematica accurate to 10 decimal places. Therefore the monkey is more probable to write TUNOCOC than COCONUT.



            Code



            Here is the Python-code I used to generate the transition matrices.





            def computeTransitionMatrix(words, keys):
            def properPrefixes(word):
            return (word[:i] for i in range(len(word)))

            def suffixesInDecreasingLength(word):
            return (word[i:] for i in range(len(word) + 1))

            prefixToState = {}
            stateToPrefix =

            def addState(prefix):
            if prefix in prefixToState:
            return
            prefixToState[prefix] = len(stateToPrefix)
            stateToPrefix.append(prefix)

            # Create a state for each proper prefix of the word.
            for word in words:
            for prefix in properPrefixes(word):
            addState(prefix)

            # Create a state for each word last.
            for word in words:
            addState(word)

            print(stateToPrefix)

            # Number of states.
            nStates = len(stateToPrefix)

            # Compute the (scaled) transition probabilities.
            transitions =
            for i in range(nStates):
            row = [0] * nStates
            transitions.append(row)

            prefix = stateToPrefix[i]
            if prefix in words:
            # The word is an absorptive state.
            row[i] = len(keys)
            continue
            for key in keys:
            nextPrefix = prefix + key
            # Find the longest suffix which
            # is a prefix of the word.
            for suffix in suffixesInDecreasingLength(nextPrefix):
            j = prefixToState.get(suffix)
            if j != None:
            row[j] += 1
            break

            return transitions

            # The words the monkey is supposed to write.
            words = ['COCONUT', 'TUNOCOC']

            # Keys available on monkey's keyboard.
            keys = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

            transitions = computeTransitionMatrix(words, keys)
            print(repr(transitions).replace('[', '{').replace(']','}').replace('}, ', '},n'))






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 27 at 4:39

























            answered Nov 24 at 16:12









            kaba

            3314




            3314








            • 3




              Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
              – Ingix
              Nov 24 at 19:18






            • 2




              It means the expected time is simply 26**7. How anticlimactic! :)
              – Eric Duminil
              Nov 25 at 9:09










            • Thanks. This is a great answer for learning some cool tricks in Mathematica.
              – Erel Segal-Halevi
              Nov 26 at 8:26










            • This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
              – abl
              Nov 27 at 9:30














            • 3




              Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
              – Ingix
              Nov 24 at 19:18






            • 2




              It means the expected time is simply 26**7. How anticlimactic! :)
              – Eric Duminil
              Nov 25 at 9:09










            • Thanks. This is a great answer for learning some cool tricks in Mathematica.
              – Erel Segal-Halevi
              Nov 26 at 8:26










            • This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
              – abl
              Nov 27 at 9:30








            3




            3




            Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
            – Ingix
            Nov 24 at 19:18




            Thanks for doing the work of establishing the Markov matrices and doing the calculations! It should be noted that the discussed effects about the repeated sub-words "CO" and "OC" can be seen: the expected value vectors start and and end the same, but have different values for COC/TUN and COCO/TUNO. The differences just vanish again in the end. The reason for this is IMO that COCONUT can not overlap with itself in a text (like e.g. NUTNU could). That means that for all such words of the same length (say "TOBACCO"), the expected time should be exactly the same.
            – Ingix
            Nov 24 at 19:18




            2




            2




            It means the expected time is simply 26**7. How anticlimactic! :)
            – Eric Duminil
            Nov 25 at 9:09




            It means the expected time is simply 26**7. How anticlimactic! :)
            – Eric Duminil
            Nov 25 at 9:09












            Thanks. This is a great answer for learning some cool tricks in Mathematica.
            – Erel Segal-Halevi
            Nov 26 at 8:26




            Thanks. This is a great answer for learning some cool tricks in Mathematica.
            – Erel Segal-Halevi
            Nov 26 at 8:26












            This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
            – abl
            Nov 27 at 9:30




            This is a great answer! In my answer I calculated the probabilities for the two squirrels in the second problem too, but my result is slightly more in favour to the second squirrel (rounds to around 50.037 %). I'd expect the methods used to be equivalent, so maybe there's some typo or silly error in one of the two answers? I'll check when I have more time
            – abl
            Nov 27 at 9:30










            up vote
            16
            down vote













            (a) has been adequately answered by




            Ingix.




            In (b), though:




            The second squirrel is more likely to win.




            Consider:




            A simpler example, where instead of typing "COCONUT" or "TUNOCOC", the monkeys are trying to type "AB" or "CA". Also, the keyboard only has "A", "B", and "C" keys. Consider what happens after the first three keystrokes, for which there are 27 possibilities.




            In this case:




            In 5 out of the 27 possiblities, "AB" appears and "CA" does not. Similarly, in 5 of the possiblities, "CA" appears and "AB" does not. But there's also one possibility, "CAB", where both appear, and squirrel 2 wins. In this instance, the "A" is useful for both, but it was useful for the second squirrel earlier. The "setup" for the first squirrel to match will often result in the second squirrel already having matched.




            Now consider:




            A similar thing is true of "COCONUT" versus "TUNOCOC". The first squirrel wants "COC" as a setup, but if "COC" does appear, it's often at the tail end (pun intended) of the second squirrel's victory.







            share|improve this answer























            • It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
              – Eric Duminil
              Nov 23 at 18:45












            • @Duminil correct. Even simulating a single play-through would be fairly expensive.
              – Sneftel
              Nov 23 at 20:11










            • Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
              – Falco
              Nov 26 at 13:46










            • @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
              – Sneftel
              Nov 26 at 14:51












            • I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
              – Falco
              Nov 26 at 14:51















            up vote
            16
            down vote













            (a) has been adequately answered by




            Ingix.




            In (b), though:




            The second squirrel is more likely to win.




            Consider:




            A simpler example, where instead of typing "COCONUT" or "TUNOCOC", the monkeys are trying to type "AB" or "CA". Also, the keyboard only has "A", "B", and "C" keys. Consider what happens after the first three keystrokes, for which there are 27 possibilities.




            In this case:




            In 5 out of the 27 possiblities, "AB" appears and "CA" does not. Similarly, in 5 of the possiblities, "CA" appears and "AB" does not. But there's also one possibility, "CAB", where both appear, and squirrel 2 wins. In this instance, the "A" is useful for both, but it was useful for the second squirrel earlier. The "setup" for the first squirrel to match will often result in the second squirrel already having matched.




            Now consider:




            A similar thing is true of "COCONUT" versus "TUNOCOC". The first squirrel wants "COC" as a setup, but if "COC" does appear, it's often at the tail end (pun intended) of the second squirrel's victory.







            share|improve this answer























            • It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
              – Eric Duminil
              Nov 23 at 18:45












            • @Duminil correct. Even simulating a single play-through would be fairly expensive.
              – Sneftel
              Nov 23 at 20:11










            • Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
              – Falco
              Nov 26 at 13:46










            • @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
              – Sneftel
              Nov 26 at 14:51












            • I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
              – Falco
              Nov 26 at 14:51













            up vote
            16
            down vote










            up vote
            16
            down vote









            (a) has been adequately answered by




            Ingix.




            In (b), though:




            The second squirrel is more likely to win.




            Consider:




            A simpler example, where instead of typing "COCONUT" or "TUNOCOC", the monkeys are trying to type "AB" or "CA". Also, the keyboard only has "A", "B", and "C" keys. Consider what happens after the first three keystrokes, for which there are 27 possibilities.




            In this case:




            In 5 out of the 27 possiblities, "AB" appears and "CA" does not. Similarly, in 5 of the possiblities, "CA" appears and "AB" does not. But there's also one possibility, "CAB", where both appear, and squirrel 2 wins. In this instance, the "A" is useful for both, but it was useful for the second squirrel earlier. The "setup" for the first squirrel to match will often result in the second squirrel already having matched.




            Now consider:




            A similar thing is true of "COCONUT" versus "TUNOCOC". The first squirrel wants "COC" as a setup, but if "COC" does appear, it's often at the tail end (pun intended) of the second squirrel's victory.







            share|improve this answer














            (a) has been adequately answered by




            Ingix.




            In (b), though:




            The second squirrel is more likely to win.




            Consider:




            A simpler example, where instead of typing "COCONUT" or "TUNOCOC", the monkeys are trying to type "AB" or "CA". Also, the keyboard only has "A", "B", and "C" keys. Consider what happens after the first three keystrokes, for which there are 27 possibilities.




            In this case:




            In 5 out of the 27 possiblities, "AB" appears and "CA" does not. Similarly, in 5 of the possiblities, "CA" appears and "AB" does not. But there's also one possibility, "CAB", where both appear, and squirrel 2 wins. In this instance, the "A" is useful for both, but it was useful for the second squirrel earlier. The "setup" for the first squirrel to match will often result in the second squirrel already having matched.




            Now consider:




            A similar thing is true of "COCONUT" versus "TUNOCOC". The first squirrel wants "COC" as a setup, but if "COC" does appear, it's often at the tail end (pun intended) of the second squirrel's victory.








            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 23 at 14:48

























            answered Nov 23 at 14:29









            Sneftel

            1,6041019




            1,6041019












            • It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
              – Eric Duminil
              Nov 23 at 18:45












            • @Duminil correct. Even simulating a single play-through would be fairly expensive.
              – Sneftel
              Nov 23 at 20:11










            • Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
              – Falco
              Nov 26 at 13:46










            • @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
              – Sneftel
              Nov 26 at 14:51












            • I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
              – Falco
              Nov 26 at 14:51


















            • It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
              – Eric Duminil
              Nov 23 at 18:45












            • @Duminil correct. Even simulating a single play-through would be fairly expensive.
              – Sneftel
              Nov 23 at 20:11










            • Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
              – Falco
              Nov 26 at 13:46










            • @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
              – Sneftel
              Nov 26 at 14:51












            • I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
              – Falco
              Nov 26 at 14:51
















            It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
            – Eric Duminil
            Nov 23 at 18:45






            It sounds plausible. But it would also mean that if there's any bias, it will be probably too small to be detected by random simulations, right?
            – Eric Duminil
            Nov 23 at 18:45














            @Duminil correct. Even simulating a single play-through would be fairly expensive.
            – Sneftel
            Nov 23 at 20:11




            @Duminil correct. Even simulating a single play-through would be fairly expensive.
            – Sneftel
            Nov 23 at 20:11












            Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
            – Falco
            Nov 26 at 13:46




            Can't we check the CAB-Thesis, very cheaply and extrapolate from there?
            – Falco
            Nov 26 at 13:46












            @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
            – Sneftel
            Nov 26 at 14:51






            @Falco Sure, but there's no need for that. The situation is simple enough to solve analytically based on the initial conditions and the transition matrix. In that situation, the second squirrel wins precisely 60% of the time. Of course, I don't offer a formal proof that it is valid to extrapolate from AB to COCONUT.
            – Sneftel
            Nov 26 at 14:51














            I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
            – Falco
            Nov 26 at 14:51




            I ran a simulation with "C,O,N,U,T,X" as possible letters (since the number of dummy-letters will only make the bias smaller. And I got a fairly consistent bias of over 50% wins for the second squirrel
            – Falco
            Nov 26 at 14:51










            up vote
            15
            down vote













            @Ingix provides a good proof, one certainly worthy of the tick, but one that could perhaps use some reinforcement. The answer is that the N number of letters would have the same expected value. A few points that might help @jafe realise the flaw in his proof.




            1. The order of independent events occurs has no directional bias. For example, drawing a King Of Hearts followed by an Ace Of Hearts, from a 52 card deck is no more probable than an Ace followed by a King.

            2. Whilst the COCOCONUT argument might seem to hold water, it's first important to realise that the probability of that 9 letter sequence is the same as the probability of DOCOCONUT, or for that matter TUTUNOCOC having the initial CO does not gain anything to the probability as there's still a 1/26 chance of getting the next letter correct.

            3. The apparent gain of the fallback that COCOCONUT gives you is offset by the amount of extra letters that you have to type. If you get the sequence COCOCONUX, you have wasted 9 letters, whereas TUNOCOX only wastes 7. (And COCOCONUX is far more improbable than TUNOCOX).


            TLDR: the order of independent variables has no favoured direction.



            (P.S. This is my first stack answer, so feel free to give any constructive feedback)






            share|improve this answer



















            • 2




              Makes sense! Welcome to Puzzling.
              – jafe
              Nov 23 at 14:25






            • 1




              I'm still confused about this, but your 3) point seems very good to me
              – George Menoutis
              Nov 23 at 14:26










            • Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
              – Ong Yu Hann
              Nov 23 at 14:52












            • Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
              – WittierDinosaur
              Nov 23 at 14:58












            • While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
              – Misha Lavrov
              Nov 27 at 1:14















            up vote
            15
            down vote













            @Ingix provides a good proof, one certainly worthy of the tick, but one that could perhaps use some reinforcement. The answer is that the N number of letters would have the same expected value. A few points that might help @jafe realise the flaw in his proof.




            1. The order of independent events occurs has no directional bias. For example, drawing a King Of Hearts followed by an Ace Of Hearts, from a 52 card deck is no more probable than an Ace followed by a King.

            2. Whilst the COCOCONUT argument might seem to hold water, it's first important to realise that the probability of that 9 letter sequence is the same as the probability of DOCOCONUT, or for that matter TUTUNOCOC having the initial CO does not gain anything to the probability as there's still a 1/26 chance of getting the next letter correct.

            3. The apparent gain of the fallback that COCOCONUT gives you is offset by the amount of extra letters that you have to type. If you get the sequence COCOCONUX, you have wasted 9 letters, whereas TUNOCOX only wastes 7. (And COCOCONUX is far more improbable than TUNOCOX).


            TLDR: the order of independent variables has no favoured direction.



            (P.S. This is my first stack answer, so feel free to give any constructive feedback)






            share|improve this answer



















            • 2




              Makes sense! Welcome to Puzzling.
              – jafe
              Nov 23 at 14:25






            • 1




              I'm still confused about this, but your 3) point seems very good to me
              – George Menoutis
              Nov 23 at 14:26










            • Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
              – Ong Yu Hann
              Nov 23 at 14:52












            • Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
              – WittierDinosaur
              Nov 23 at 14:58












            • While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
              – Misha Lavrov
              Nov 27 at 1:14













            up vote
            15
            down vote










            up vote
            15
            down vote









            @Ingix provides a good proof, one certainly worthy of the tick, but one that could perhaps use some reinforcement. The answer is that the N number of letters would have the same expected value. A few points that might help @jafe realise the flaw in his proof.




            1. The order of independent events occurs has no directional bias. For example, drawing a King Of Hearts followed by an Ace Of Hearts, from a 52 card deck is no more probable than an Ace followed by a King.

            2. Whilst the COCOCONUT argument might seem to hold water, it's first important to realise that the probability of that 9 letter sequence is the same as the probability of DOCOCONUT, or for that matter TUTUNOCOC having the initial CO does not gain anything to the probability as there's still a 1/26 chance of getting the next letter correct.

            3. The apparent gain of the fallback that COCOCONUT gives you is offset by the amount of extra letters that you have to type. If you get the sequence COCOCONUX, you have wasted 9 letters, whereas TUNOCOX only wastes 7. (And COCOCONUX is far more improbable than TUNOCOX).


            TLDR: the order of independent variables has no favoured direction.



            (P.S. This is my first stack answer, so feel free to give any constructive feedback)






            share|improve this answer














            @Ingix provides a good proof, one certainly worthy of the tick, but one that could perhaps use some reinforcement. The answer is that the N number of letters would have the same expected value. A few points that might help @jafe realise the flaw in his proof.




            1. The order of independent events occurs has no directional bias. For example, drawing a King Of Hearts followed by an Ace Of Hearts, from a 52 card deck is no more probable than an Ace followed by a King.

            2. Whilst the COCOCONUT argument might seem to hold water, it's first important to realise that the probability of that 9 letter sequence is the same as the probability of DOCOCONUT, or for that matter TUTUNOCOC having the initial CO does not gain anything to the probability as there's still a 1/26 chance of getting the next letter correct.

            3. The apparent gain of the fallback that COCOCONUT gives you is offset by the amount of extra letters that you have to type. If you get the sequence COCOCONUX, you have wasted 9 letters, whereas TUNOCOX only wastes 7. (And COCOCONUX is far more improbable than TUNOCOX).


            TLDR: the order of independent variables has no favoured direction.



            (P.S. This is my first stack answer, so feel free to give any constructive feedback)







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 23 at 14:42

























            answered Nov 23 at 14:22









            WittierDinosaur

            1594




            1594








            • 2




              Makes sense! Welcome to Puzzling.
              – jafe
              Nov 23 at 14:25






            • 1




              I'm still confused about this, but your 3) point seems very good to me
              – George Menoutis
              Nov 23 at 14:26










            • Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
              – Ong Yu Hann
              Nov 23 at 14:52












            • Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
              – WittierDinosaur
              Nov 23 at 14:58












            • While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
              – Misha Lavrov
              Nov 27 at 1:14














            • 2




              Makes sense! Welcome to Puzzling.
              – jafe
              Nov 23 at 14:25






            • 1




              I'm still confused about this, but your 3) point seems very good to me
              – George Menoutis
              Nov 23 at 14:26










            • Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
              – Ong Yu Hann
              Nov 23 at 14:52












            • Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
              – WittierDinosaur
              Nov 23 at 14:58












            • While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
              – Misha Lavrov
              Nov 27 at 1:14








            2




            2




            Makes sense! Welcome to Puzzling.
            – jafe
            Nov 23 at 14:25




            Makes sense! Welcome to Puzzling.
            – jafe
            Nov 23 at 14:25




            1




            1




            I'm still confused about this, but your 3) point seems very good to me
            – George Menoutis
            Nov 23 at 14:26




            I'm still confused about this, but your 3) point seems very good to me
            – George Menoutis
            Nov 23 at 14:26












            Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
            – Ong Yu Hann
            Nov 23 at 14:52






            Welcome to Puzzling! Yup, your first two points are exactly right. As for the wastage of letters, I don't think it's really a good argument. I'm not sure it works that way. IMO the reason why COCOCONUT is a false benefit, is because when considering the probability of the events happening, you are assuming the letters before the last 7 are essentially random. This means each letter has 26 possibilities. As such, the combination of CO- as a prefix to COCONUT is already considered within the probability count. To factor it in again is double counting.
            – Ong Yu Hann
            Nov 23 at 14:52














            Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
            – WittierDinosaur
            Nov 23 at 14:58






            Thanks for your feedback @Ong Yu Hann. The difference between the 3rd point and the others is that the first 2 are easily provable mathematical facts. The 3rd is simply to help people logically understand why COCOCOCO doesn't provide a probabilistic benefit.
            – WittierDinosaur
            Nov 23 at 14:58














            While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
            – Misha Lavrov
            Nov 27 at 1:14




            While it's true that COCONUT and TUNOCOC are equivalent, you are making a far more sweeping (and false) statement when you imply that all sequences of the same length are equivalent. For a simple counterexample that can be checked by hand (using Markov chains, or even just by trying it): if you flip coins over and over, on average it will take you 6 coinflips until you see HH, but only 4 coinflips until you see HT.
            – Misha Lavrov
            Nov 27 at 1:14










            up vote
            6
            down vote













            There is a clever trick to finding the kind of expected times that appear in part (a). It turns out that both for COCONUT and TUNOCOC, the expected time is just $26^7$ (so the monkeys have equal expected times to win), but for this to be convincing, I will demonstrate the method on a case where the expected value is unexpected: the word ABRACADABRA.



            Suppose that a monkey is typing randomly, and a family of beavers decides to bet on the outcome. The rules for betting are as follows:




            1. Right before a new letter is typed, a new beaver shows up and is given $1$ dollar. The beaver bets that dollar on that letter being A, at $26:1$ odds.

            2. If the beaver wins (and now has $26$ dollars), the beaver bets all of them on the next letter being B.

            3. If the beaver wins (and now has $26^2$ dollars), the beaver bets all of them on the next letter being R.

            4. In general, whenever the beaver wins the first $k$ bets (and has $26^k$ dollars), the beaver bets all of them on the next, $(k+1$)-th letter still matching the $(k+1)$-th letter of ABRACADABRA.


            Any time a beaver loses a bet, that beaver has no more money and goes home. (But at any time, multiple beavers may be betting, because a new beaver shows up right before each letter.)



            Each bet is fair, so after $n$ letters are typed, the expected number of money the beavers have altogether is $n$: the $n$ dollars that the beavers get when they show up.



            When ABRACADABRA first appears, we can count the total amount of money in the system:




            • One lucky beaver has $26^{11}$ dollars from betting on ABRACADABRA all the way through.

            • One beaver started on the ABRA at the end of ABRACADABRA, and has $26^4$ dollars.

            • One beaver just started betting before the last A, and has $26$ dollars.

            • All other beavers lost their bets and have no money.


            So we are guaranteed to have $26^{11} + 26^4 + 26$ dollars in the system. But the expected amount of money in the system is equal to the number of letters typed. So in expectation, it takes $26^{11} + 26^4 + 26$ letters to type ABRACADABRA.



            (Formally, we are invoking Wald's equation to see that the rule "the expected money in the system is equal to the expected number of letters typed" still holds when the number of letters typed depends on the letters that have been typed - when we stop once ABRACADABRA is reached)





            Now, going back to COCONUT and TUNOCOC: neither word ends with an initial segment of the word. (The CONUT in COCONUT doesn't change anything.) If the beavers were betting on COCONUT, then at the end when COCONUT is typed, one beaver would have $26^7$ dollars and no other beavers would have anything. The same is true for TUNOCOC. So the expected time is just $26^7$ in both cases.



            (A weird consequence is that a word like AAAAA has a longer expected waiting time than a word like ABCDE. To explain this intuitively, consider that if ABCDE hasn't been typed yet, the last five letters can can be any of AABCD, BABCD, CABCD, ..., ZABCD, giving us 26 ways to win on the next letter. But if AAAAA hasn't been typed yet, the last five letters can only be any of BAAAA, CAAAA, DAAAA, ..., ZAAAA, giving us only 25 ways to win.)






            share|improve this answer























            • This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
              – kaba
              Nov 26 at 23:03










            • We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
              – Misha Lavrov
              Nov 26 at 23:13












            • Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
              – kaba
              Nov 27 at 1:04












            • This is a cool answer +1
              – justhalf
              Nov 27 at 14:15















            up vote
            6
            down vote













            There is a clever trick to finding the kind of expected times that appear in part (a). It turns out that both for COCONUT and TUNOCOC, the expected time is just $26^7$ (so the monkeys have equal expected times to win), but for this to be convincing, I will demonstrate the method on a case where the expected value is unexpected: the word ABRACADABRA.



            Suppose that a monkey is typing randomly, and a family of beavers decides to bet on the outcome. The rules for betting are as follows:




            1. Right before a new letter is typed, a new beaver shows up and is given $1$ dollar. The beaver bets that dollar on that letter being A, at $26:1$ odds.

            2. If the beaver wins (and now has $26$ dollars), the beaver bets all of them on the next letter being B.

            3. If the beaver wins (and now has $26^2$ dollars), the beaver bets all of them on the next letter being R.

            4. In general, whenever the beaver wins the first $k$ bets (and has $26^k$ dollars), the beaver bets all of them on the next, $(k+1$)-th letter still matching the $(k+1)$-th letter of ABRACADABRA.


            Any time a beaver loses a bet, that beaver has no more money and goes home. (But at any time, multiple beavers may be betting, because a new beaver shows up right before each letter.)



            Each bet is fair, so after $n$ letters are typed, the expected number of money the beavers have altogether is $n$: the $n$ dollars that the beavers get when they show up.



            When ABRACADABRA first appears, we can count the total amount of money in the system:




            • One lucky beaver has $26^{11}$ dollars from betting on ABRACADABRA all the way through.

            • One beaver started on the ABRA at the end of ABRACADABRA, and has $26^4$ dollars.

            • One beaver just started betting before the last A, and has $26$ dollars.

            • All other beavers lost their bets and have no money.


            So we are guaranteed to have $26^{11} + 26^4 + 26$ dollars in the system. But the expected amount of money in the system is equal to the number of letters typed. So in expectation, it takes $26^{11} + 26^4 + 26$ letters to type ABRACADABRA.



            (Formally, we are invoking Wald's equation to see that the rule "the expected money in the system is equal to the expected number of letters typed" still holds when the number of letters typed depends on the letters that have been typed - when we stop once ABRACADABRA is reached)





            Now, going back to COCONUT and TUNOCOC: neither word ends with an initial segment of the word. (The CONUT in COCONUT doesn't change anything.) If the beavers were betting on COCONUT, then at the end when COCONUT is typed, one beaver would have $26^7$ dollars and no other beavers would have anything. The same is true for TUNOCOC. So the expected time is just $26^7$ in both cases.



            (A weird consequence is that a word like AAAAA has a longer expected waiting time than a word like ABCDE. To explain this intuitively, consider that if ABCDE hasn't been typed yet, the last five letters can can be any of AABCD, BABCD, CABCD, ..., ZABCD, giving us 26 ways to win on the next letter. But if AAAAA hasn't been typed yet, the last five letters can only be any of BAAAA, CAAAA, DAAAA, ..., ZAAAA, giving us only 25 ways to win.)






            share|improve this answer























            • This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
              – kaba
              Nov 26 at 23:03










            • We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
              – Misha Lavrov
              Nov 26 at 23:13












            • Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
              – kaba
              Nov 27 at 1:04












            • This is a cool answer +1
              – justhalf
              Nov 27 at 14:15













            up vote
            6
            down vote










            up vote
            6
            down vote









            There is a clever trick to finding the kind of expected times that appear in part (a). It turns out that both for COCONUT and TUNOCOC, the expected time is just $26^7$ (so the monkeys have equal expected times to win), but for this to be convincing, I will demonstrate the method on a case where the expected value is unexpected: the word ABRACADABRA.



            Suppose that a monkey is typing randomly, and a family of beavers decides to bet on the outcome. The rules for betting are as follows:




            1. Right before a new letter is typed, a new beaver shows up and is given $1$ dollar. The beaver bets that dollar on that letter being A, at $26:1$ odds.

            2. If the beaver wins (and now has $26$ dollars), the beaver bets all of them on the next letter being B.

            3. If the beaver wins (and now has $26^2$ dollars), the beaver bets all of them on the next letter being R.

            4. In general, whenever the beaver wins the first $k$ bets (and has $26^k$ dollars), the beaver bets all of them on the next, $(k+1$)-th letter still matching the $(k+1)$-th letter of ABRACADABRA.


            Any time a beaver loses a bet, that beaver has no more money and goes home. (But at any time, multiple beavers may be betting, because a new beaver shows up right before each letter.)



            Each bet is fair, so after $n$ letters are typed, the expected number of money the beavers have altogether is $n$: the $n$ dollars that the beavers get when they show up.



            When ABRACADABRA first appears, we can count the total amount of money in the system:




            • One lucky beaver has $26^{11}$ dollars from betting on ABRACADABRA all the way through.

            • One beaver started on the ABRA at the end of ABRACADABRA, and has $26^4$ dollars.

            • One beaver just started betting before the last A, and has $26$ dollars.

            • All other beavers lost their bets and have no money.


            So we are guaranteed to have $26^{11} + 26^4 + 26$ dollars in the system. But the expected amount of money in the system is equal to the number of letters typed. So in expectation, it takes $26^{11} + 26^4 + 26$ letters to type ABRACADABRA.



            (Formally, we are invoking Wald's equation to see that the rule "the expected money in the system is equal to the expected number of letters typed" still holds when the number of letters typed depends on the letters that have been typed - when we stop once ABRACADABRA is reached)





            Now, going back to COCONUT and TUNOCOC: neither word ends with an initial segment of the word. (The CONUT in COCONUT doesn't change anything.) If the beavers were betting on COCONUT, then at the end when COCONUT is typed, one beaver would have $26^7$ dollars and no other beavers would have anything. The same is true for TUNOCOC. So the expected time is just $26^7$ in both cases.



            (A weird consequence is that a word like AAAAA has a longer expected waiting time than a word like ABCDE. To explain this intuitively, consider that if ABCDE hasn't been typed yet, the last five letters can can be any of AABCD, BABCD, CABCD, ..., ZABCD, giving us 26 ways to win on the next letter. But if AAAAA hasn't been typed yet, the last five letters can only be any of BAAAA, CAAAA, DAAAA, ..., ZAAAA, giving us only 25 ways to win.)






            share|improve this answer














            There is a clever trick to finding the kind of expected times that appear in part (a). It turns out that both for COCONUT and TUNOCOC, the expected time is just $26^7$ (so the monkeys have equal expected times to win), but for this to be convincing, I will demonstrate the method on a case where the expected value is unexpected: the word ABRACADABRA.



            Suppose that a monkey is typing randomly, and a family of beavers decides to bet on the outcome. The rules for betting are as follows:




            1. Right before a new letter is typed, a new beaver shows up and is given $1$ dollar. The beaver bets that dollar on that letter being A, at $26:1$ odds.

            2. If the beaver wins (and now has $26$ dollars), the beaver bets all of them on the next letter being B.

            3. If the beaver wins (and now has $26^2$ dollars), the beaver bets all of them on the next letter being R.

            4. In general, whenever the beaver wins the first $k$ bets (and has $26^k$ dollars), the beaver bets all of them on the next, $(k+1$)-th letter still matching the $(k+1)$-th letter of ABRACADABRA.


            Any time a beaver loses a bet, that beaver has no more money and goes home. (But at any time, multiple beavers may be betting, because a new beaver shows up right before each letter.)



            Each bet is fair, so after $n$ letters are typed, the expected number of money the beavers have altogether is $n$: the $n$ dollars that the beavers get when they show up.



            When ABRACADABRA first appears, we can count the total amount of money in the system:




            • One lucky beaver has $26^{11}$ dollars from betting on ABRACADABRA all the way through.

            • One beaver started on the ABRA at the end of ABRACADABRA, and has $26^4$ dollars.

            • One beaver just started betting before the last A, and has $26$ dollars.

            • All other beavers lost their bets and have no money.


            So we are guaranteed to have $26^{11} + 26^4 + 26$ dollars in the system. But the expected amount of money in the system is equal to the number of letters typed. So in expectation, it takes $26^{11} + 26^4 + 26$ letters to type ABRACADABRA.



            (Formally, we are invoking Wald's equation to see that the rule "the expected money in the system is equal to the expected number of letters typed" still holds when the number of letters typed depends on the letters that have been typed - when we stop once ABRACADABRA is reached)





            Now, going back to COCONUT and TUNOCOC: neither word ends with an initial segment of the word. (The CONUT in COCONUT doesn't change anything.) If the beavers were betting on COCONUT, then at the end when COCONUT is typed, one beaver would have $26^7$ dollars and no other beavers would have anything. The same is true for TUNOCOC. So the expected time is just $26^7$ in both cases.



            (A weird consequence is that a word like AAAAA has a longer expected waiting time than a word like ABCDE. To explain this intuitively, consider that if ABCDE hasn't been typed yet, the last five letters can can be any of AABCD, BABCD, CABCD, ..., ZABCD, giving us 26 ways to win on the next letter. But if AAAAA hasn't been typed yet, the last five letters can only be any of BAAAA, CAAAA, DAAAA, ..., ZAAAA, giving us only 25 ways to win.)







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 26 at 15:46

























            answered Nov 26 at 15:39









            Misha Lavrov

            1595




            1595












            • This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
              – kaba
              Nov 26 at 23:03










            • We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
              – Misha Lavrov
              Nov 26 at 23:13












            • Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
              – kaba
              Nov 27 at 1:04












            • This is a cool answer +1
              – justhalf
              Nov 27 at 14:15


















            • This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
              – kaba
              Nov 26 at 23:03










            • We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
              – Misha Lavrov
              Nov 26 at 23:13












            • Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
              – kaba
              Nov 27 at 1:04












            • This is a cool answer +1
              – justhalf
              Nov 27 at 14:15
















            This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
            – kaba
            Nov 26 at 23:03




            This sounds exciting. Could you add a note on how Wald's equation is applied here? Referring to Wikipedia's page, what is the sequence (X_n), what is N, and how to see that X_N are i.i.d., and in particular why N is independent of the sequence (X_n)?
            – kaba
            Nov 26 at 23:03












            We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
            – Misha Lavrov
            Nov 26 at 23:13






            We are using the general version, so we don't need either of the requirements you mention. Here, $N$ is the number of steps taken and $X_n$ is the change in how much money there is in the system at step $n$. Condition 2 in Wikipedia's article essentially requires that $N$ be a stopping time: $N ge n$ is independent of $X_n, X_{n+1}, X_{n+2}, dots$ (so it does not require telling the future).
            – Misha Lavrov
            Nov 26 at 23:13














            Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
            – kaba
            Nov 27 at 1:04






            Thanks, I somehow missed the general version. I replicated that the mean time for ABRACADABRA is indeed $26^{11} + 26^4 + 26$ using Markov chains as in my answer. This answer is probably the best one for problem (a) this far.
            – kaba
            Nov 27 at 1:04














            This is a cool answer +1
            – justhalf
            Nov 27 at 14:15




            This is a cool answer +1
            – justhalf
            Nov 27 at 14:15










            up vote
            5
            down vote













            Problem (a) has been adequately covered in other answers.



            In this answer I'll show that in problem (b), the second squirrel has better chances.




            • First (for introduction purposes only) we'll see a simplified version of the problem, that is simple enough to be run on a simulation

            • Then we'll see how we can exactly calculate the probability that the second squirrel will win, in this simplified example. The method that we'll use does not depend on any simplification, and can be applied to the original problem as well.

            • We'll see that the result that we obtain with this method matches with the results of the simulation.

            • Having then some extra confidence in the method (not that it's really needed, since it's fairly clear), we'll apply it to the original problem and obtain a result.


            Simplified problem:




            A monkey is typing capital letters from an alphabet that contains the letters A, B and X. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if AAB appears (as three successive letters) before BAA. The second squirrel wins if BAA appears before AAB. Which squirrel is more likely to win?




            If chosen this problem because it has a similarity with the original, which I believe is crucial: "AAB" can overlap with "BAA" only in the form "AABAA", while in the other way around it can be either "BAAAB" or "BAAB". Just like "COCONUT" followed by "TUNOCOC" can be overlapped only in the form "COCONUTUNOCOC", while, the other way around, they can be overlapped like "TUNOCOCOCONUT" or like "TUNOCOCONUT".



            Note however that we'll be solving the actual problem in the end. The only reason I'm including a simpler version is to explain the method on a simpler case, and because for the simpler problem we can run a simulation to verify our predictions.



            Explanation



            We must realize that at any point in the game, any of the two strings can be partly under construction, or not. More importantly, the only thing that affects the probabilities at any point in the game is how much of each string is already written. If, for example, the two last letters of the string so far are "XA", that means that the string AAB is partly written up to "A", and the string BAA is not partly written at all. This and only this is what determines which of the two squirrels is better off at this point, regardless of what has been written so far before "XA".



            Therefore the state of the game at any given point is defined exclusively by the longest suffix of the sequence written so far that is also a prefix of at least one of the strings that the squirrels are betting at.



            Thus this game has exactly seven different possible states, which are:





            • $epsilon$ (empty string, meaning no matching suffix)

            • $verb|A|$

            • $verb|AA|$

            • $verb|AAB|$

            • $verb|B|$

            • $verb|BA|$

            • $verb|BAA|$


            On each of these different states, the squirrels may have different chances of winning. Let's note $P_epsilon$ as the chances that the second squirrel has when the state is $epsilon$, $P_{A}$ as the chances that he has when the state is $verb|A|$, etc. What we're interested in is $P_epsilon$, because that's the initial state of the game.



            We can easily formulate each of this probabilities in terms of the others. From state $epsilon$ there's a 1/3 chance of writing an "A", thus moving to state A. There's also a 1/3 chance of transitioning to B, and finally a 1/3 chance of staying at e (if an "X" is written). That means that the chances for the second squirrel on state e are:



            $$
            P_{epsilon} = frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_{epsilon}
            $$



            We can similarly express the chances for all the other states. Note that $P_{AAB} = 0$, because the second squirrel loses immediately on that state, and $P_{BAA} = 1$, because it wins immediately. The full list is:



            $$
            begin{align*}
            P_epsilon &= frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{A} &= frac{1}{3} P_{AA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{AA} &= frac{1}{3} P_{AAB} + frac{1}{3} P_{AA} + frac{1}{3} P_epsilon \
            P_{AAB} &= 0 \
            P_{B} &= frac{1}{3} P_{BA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BA} &= frac{1}{3} P_{BAA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BAA} &= 1
            end{align*}
            $$



            This is simply a system of linear equations, and it's simple enough to be solved by hand. The solution is $P_epsilon = frac{8}{13}$ or approximately 61.54%. These are the chances for the second squirrel when the game starts. The chances for the first squirrel are $frac{5}{13}$, about 38.46%.



            Running the simplified problem



            Now, because this game has only three letters, the average game will be pretty short, so much in fact that we can play it enough times to have some statistics. The following program (in Java) simulates 10 million games and prints the percentage of wins for each squirrel. I've run it several times and the results are always in the range of 38.4 - 38.5% of wins for the first squirrel, and 61.5 - 61.6% of wins for the second squirrel, which matches very well with our previous calculations.



            public class Monkeys {

            private enum Squirrel {
            FIRST_SQUIRREL,
            SECOND_SQUIRREL
            }

            private static final Random random = new Random();
            private static final char letters = "ABX".toCharArray();

            public static void main(String args) {

            int winsF = 0;
            int winsS = 0;
            int iterations = 10000000;

            for (int i = 0; i < iterations; i++) {

            Squirrel winner = playGame();
            if (winner == Squirrel.FIRST_SQUIRREL) {
            winsF++;
            } else if (winner == Squirrel.SECOND_SQUIRREL) {
            winsS++;
            }
            }

            System.out.println("First squirrel wins: " + ((double) winsF) / (winsF + winsS));
            System.out.println("Second squirrel wins: " + ((double) winsS) / (winsF + winsS));

            }

            private static Squirrel playGame() {
            StringBuilder sb = new StringBuilder();
            while (true) {
            if (sb.length() >= 3) {
            if (sb.substring(sb.length() - 3, sb.length()).equals("AAB")) {
            return Squirrel.FIRST_SQUIRREL;
            }
            if (sb.substring(sb.length() - 3, sb.length()).equals("BAA")) {
            return Squirrel.SECOND_SQUIRREL;
            }
            }
            sb.append(letters[random.nextInt(letters.length)]);
            }
            }
            }


            Actual problem



            Great! So the only thing we need to do now is apply the same method for the original problem. Following the same reasoning, in the original problem there are 15 distinct states:




            • $epsilon$

            • $verb|C|$

            • $verb|CO|$

            • $verb|COC|$

            • $verb|COCO|$

            • $verb|COCON|$

            • $verb|COCONU|$

            • $verb|COCONUT|$

            • $verb|T|$

            • $verb|TU|$

            • $verb|TUN|$

            • $verb|TUNO|$

            • $verb|TUNOC|$

            • $verb|TUNOCO|$

            • $verb|TUNOCOC|$


            The chances for the second squirrel in each of the states are (thanks @kaba for pointing a previous error):



            $$
            begin{align*}
            P_epsilon &= frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{C} &= frac{1}{26} P_{CO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{CO} &= frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{COC} &= frac{1}{26} P_{COCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCO} &= frac{1}{26} P_{COCON} + frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCON} &= frac{1}{26} P_{COCONU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCONU} &= frac{1}{26} P_{COCONUT} + frac{1}{26} P_{C} + frac{24}{26} P_epsilon \
            P_{COCONUT} &= 0 \
            P_{T} &= frac{1}{26} P_{TU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TU} &= frac{1}{26} P_{TUN} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUN} &= frac{1}{26} P_{TUNO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNO} &= frac{1}{26} P_{TUNOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOC} &= frac{1}{26} P_{TUNOCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNOCO} &= frac{1}{26} P_{TUNOCOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOCOC} &= 1
            end{align*}
            $$



            Finally let's solve this system of equations. I've used SageMath for that purpose. The script is (after renaming the variables):



            var('a, b, c, d, e, f, g, h, i, j, k, l, m, n, u')

            eqn = [
            a == (1/26) * b + (1/26) * i + (24/26) * a,
            b == (1/26) * c + (1/26) * b + (1/26) * i + (23/26) * a,
            c == (1/26) * d + (1/26) * i + (24/26) * a,
            d == (1/26) * e + (1/26) * b + (1/26) * i + (23/26) * a,
            e == (1/26) * f + (1/26) * d + (1/26) * i + (23/26) * a,
            f == (1/26) * g + (1/26) * b + (1/26) * i + (23/26) * a,
            g == (1/26) * h + (1/26) * b + (24/26) * a,
            h == 0,
            i == (1/26) * j + (1/26) * b + (1/26) * i + (23/26) * a,
            j == (1/26) * k + (1/26) * b + (1/26) * i + (23/26) * a,
            k == (1/26) * l + (1/26) * b + (1/26) * i + (23/26) * a,
            l == (1/26) * m + (1/26) * i + (24/26) * a,
            m == (1/26) * n + (1/26) * b + (1/26) * i + (23/26) * a,
            n == (1/26) * u + (1/26) * i + (24/26) * a,
            u == 1
            ]

            solution = solve(eqn, a, b, c, d, e, f, g, h, i, j, k, l, m, n, u)

            print(solution)


            And the output (formatting mine, for readability):



            [
            [
            a == (308915775/617830874),
            b == (154457887/308915437),
            c == (308915749/617830874),
            d == (154457549/308915437),
            e == (308898173/617830874),
            f == (308458799/617830874),
            g == (297034399/617830874),
            h == 0,
            i == (154457888/308915437),
            j == (308915801/617830874),
            k == (308916451/617830874),
            l == (308933351/617830874),
            m == (154686375/308915437),
            n == (320797125/617830874),
            u == 1
            ]
            ]


            We see that the probability that the second squirrel will win is exactly $frac{308915775}{617830874}$ or approximately 50.0000547%






            share|improve this answer



















            • 1




              The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
              – kaba
              Nov 27 at 13:04










            • Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
              – kaba
              Nov 27 at 13:30












            • @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
              – abl
              Nov 27 at 13:38












            • Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
              – kaba
              Nov 27 at 13:42















            up vote
            5
            down vote













            Problem (a) has been adequately covered in other answers.



            In this answer I'll show that in problem (b), the second squirrel has better chances.




            • First (for introduction purposes only) we'll see a simplified version of the problem, that is simple enough to be run on a simulation

            • Then we'll see how we can exactly calculate the probability that the second squirrel will win, in this simplified example. The method that we'll use does not depend on any simplification, and can be applied to the original problem as well.

            • We'll see that the result that we obtain with this method matches with the results of the simulation.

            • Having then some extra confidence in the method (not that it's really needed, since it's fairly clear), we'll apply it to the original problem and obtain a result.


            Simplified problem:




            A monkey is typing capital letters from an alphabet that contains the letters A, B and X. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if AAB appears (as three successive letters) before BAA. The second squirrel wins if BAA appears before AAB. Which squirrel is more likely to win?




            If chosen this problem because it has a similarity with the original, which I believe is crucial: "AAB" can overlap with "BAA" only in the form "AABAA", while in the other way around it can be either "BAAAB" or "BAAB". Just like "COCONUT" followed by "TUNOCOC" can be overlapped only in the form "COCONUTUNOCOC", while, the other way around, they can be overlapped like "TUNOCOCOCONUT" or like "TUNOCOCONUT".



            Note however that we'll be solving the actual problem in the end. The only reason I'm including a simpler version is to explain the method on a simpler case, and because for the simpler problem we can run a simulation to verify our predictions.



            Explanation



            We must realize that at any point in the game, any of the two strings can be partly under construction, or not. More importantly, the only thing that affects the probabilities at any point in the game is how much of each string is already written. If, for example, the two last letters of the string so far are "XA", that means that the string AAB is partly written up to "A", and the string BAA is not partly written at all. This and only this is what determines which of the two squirrels is better off at this point, regardless of what has been written so far before "XA".



            Therefore the state of the game at any given point is defined exclusively by the longest suffix of the sequence written so far that is also a prefix of at least one of the strings that the squirrels are betting at.



            Thus this game has exactly seven different possible states, which are:





            • $epsilon$ (empty string, meaning no matching suffix)

            • $verb|A|$

            • $verb|AA|$

            • $verb|AAB|$

            • $verb|B|$

            • $verb|BA|$

            • $verb|BAA|$


            On each of these different states, the squirrels may have different chances of winning. Let's note $P_epsilon$ as the chances that the second squirrel has when the state is $epsilon$, $P_{A}$ as the chances that he has when the state is $verb|A|$, etc. What we're interested in is $P_epsilon$, because that's the initial state of the game.



            We can easily formulate each of this probabilities in terms of the others. From state $epsilon$ there's a 1/3 chance of writing an "A", thus moving to state A. There's also a 1/3 chance of transitioning to B, and finally a 1/3 chance of staying at e (if an "X" is written). That means that the chances for the second squirrel on state e are:



            $$
            P_{epsilon} = frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_{epsilon}
            $$



            We can similarly express the chances for all the other states. Note that $P_{AAB} = 0$, because the second squirrel loses immediately on that state, and $P_{BAA} = 1$, because it wins immediately. The full list is:



            $$
            begin{align*}
            P_epsilon &= frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{A} &= frac{1}{3} P_{AA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{AA} &= frac{1}{3} P_{AAB} + frac{1}{3} P_{AA} + frac{1}{3} P_epsilon \
            P_{AAB} &= 0 \
            P_{B} &= frac{1}{3} P_{BA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BA} &= frac{1}{3} P_{BAA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BAA} &= 1
            end{align*}
            $$



            This is simply a system of linear equations, and it's simple enough to be solved by hand. The solution is $P_epsilon = frac{8}{13}$ or approximately 61.54%. These are the chances for the second squirrel when the game starts. The chances for the first squirrel are $frac{5}{13}$, about 38.46%.



            Running the simplified problem



            Now, because this game has only three letters, the average game will be pretty short, so much in fact that we can play it enough times to have some statistics. The following program (in Java) simulates 10 million games and prints the percentage of wins for each squirrel. I've run it several times and the results are always in the range of 38.4 - 38.5% of wins for the first squirrel, and 61.5 - 61.6% of wins for the second squirrel, which matches very well with our previous calculations.



            public class Monkeys {

            private enum Squirrel {
            FIRST_SQUIRREL,
            SECOND_SQUIRREL
            }

            private static final Random random = new Random();
            private static final char letters = "ABX".toCharArray();

            public static void main(String args) {

            int winsF = 0;
            int winsS = 0;
            int iterations = 10000000;

            for (int i = 0; i < iterations; i++) {

            Squirrel winner = playGame();
            if (winner == Squirrel.FIRST_SQUIRREL) {
            winsF++;
            } else if (winner == Squirrel.SECOND_SQUIRREL) {
            winsS++;
            }
            }

            System.out.println("First squirrel wins: " + ((double) winsF) / (winsF + winsS));
            System.out.println("Second squirrel wins: " + ((double) winsS) / (winsF + winsS));

            }

            private static Squirrel playGame() {
            StringBuilder sb = new StringBuilder();
            while (true) {
            if (sb.length() >= 3) {
            if (sb.substring(sb.length() - 3, sb.length()).equals("AAB")) {
            return Squirrel.FIRST_SQUIRREL;
            }
            if (sb.substring(sb.length() - 3, sb.length()).equals("BAA")) {
            return Squirrel.SECOND_SQUIRREL;
            }
            }
            sb.append(letters[random.nextInt(letters.length)]);
            }
            }
            }


            Actual problem



            Great! So the only thing we need to do now is apply the same method for the original problem. Following the same reasoning, in the original problem there are 15 distinct states:




            • $epsilon$

            • $verb|C|$

            • $verb|CO|$

            • $verb|COC|$

            • $verb|COCO|$

            • $verb|COCON|$

            • $verb|COCONU|$

            • $verb|COCONUT|$

            • $verb|T|$

            • $verb|TU|$

            • $verb|TUN|$

            • $verb|TUNO|$

            • $verb|TUNOC|$

            • $verb|TUNOCO|$

            • $verb|TUNOCOC|$


            The chances for the second squirrel in each of the states are (thanks @kaba for pointing a previous error):



            $$
            begin{align*}
            P_epsilon &= frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{C} &= frac{1}{26} P_{CO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{CO} &= frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{COC} &= frac{1}{26} P_{COCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCO} &= frac{1}{26} P_{COCON} + frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCON} &= frac{1}{26} P_{COCONU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCONU} &= frac{1}{26} P_{COCONUT} + frac{1}{26} P_{C} + frac{24}{26} P_epsilon \
            P_{COCONUT} &= 0 \
            P_{T} &= frac{1}{26} P_{TU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TU} &= frac{1}{26} P_{TUN} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUN} &= frac{1}{26} P_{TUNO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNO} &= frac{1}{26} P_{TUNOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOC} &= frac{1}{26} P_{TUNOCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNOCO} &= frac{1}{26} P_{TUNOCOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOCOC} &= 1
            end{align*}
            $$



            Finally let's solve this system of equations. I've used SageMath for that purpose. The script is (after renaming the variables):



            var('a, b, c, d, e, f, g, h, i, j, k, l, m, n, u')

            eqn = [
            a == (1/26) * b + (1/26) * i + (24/26) * a,
            b == (1/26) * c + (1/26) * b + (1/26) * i + (23/26) * a,
            c == (1/26) * d + (1/26) * i + (24/26) * a,
            d == (1/26) * e + (1/26) * b + (1/26) * i + (23/26) * a,
            e == (1/26) * f + (1/26) * d + (1/26) * i + (23/26) * a,
            f == (1/26) * g + (1/26) * b + (1/26) * i + (23/26) * a,
            g == (1/26) * h + (1/26) * b + (24/26) * a,
            h == 0,
            i == (1/26) * j + (1/26) * b + (1/26) * i + (23/26) * a,
            j == (1/26) * k + (1/26) * b + (1/26) * i + (23/26) * a,
            k == (1/26) * l + (1/26) * b + (1/26) * i + (23/26) * a,
            l == (1/26) * m + (1/26) * i + (24/26) * a,
            m == (1/26) * n + (1/26) * b + (1/26) * i + (23/26) * a,
            n == (1/26) * u + (1/26) * i + (24/26) * a,
            u == 1
            ]

            solution = solve(eqn, a, b, c, d, e, f, g, h, i, j, k, l, m, n, u)

            print(solution)


            And the output (formatting mine, for readability):



            [
            [
            a == (308915775/617830874),
            b == (154457887/308915437),
            c == (308915749/617830874),
            d == (154457549/308915437),
            e == (308898173/617830874),
            f == (308458799/617830874),
            g == (297034399/617830874),
            h == 0,
            i == (154457888/308915437),
            j == (308915801/617830874),
            k == (308916451/617830874),
            l == (308933351/617830874),
            m == (154686375/308915437),
            n == (320797125/617830874),
            u == 1
            ]
            ]


            We see that the probability that the second squirrel will win is exactly $frac{308915775}{617830874}$ or approximately 50.0000547%






            share|improve this answer



















            • 1




              The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
              – kaba
              Nov 27 at 13:04










            • Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
              – kaba
              Nov 27 at 13:30












            • @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
              – abl
              Nov 27 at 13:38












            • Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
              – kaba
              Nov 27 at 13:42













            up vote
            5
            down vote










            up vote
            5
            down vote









            Problem (a) has been adequately covered in other answers.



            In this answer I'll show that in problem (b), the second squirrel has better chances.




            • First (for introduction purposes only) we'll see a simplified version of the problem, that is simple enough to be run on a simulation

            • Then we'll see how we can exactly calculate the probability that the second squirrel will win, in this simplified example. The method that we'll use does not depend on any simplification, and can be applied to the original problem as well.

            • We'll see that the result that we obtain with this method matches with the results of the simulation.

            • Having then some extra confidence in the method (not that it's really needed, since it's fairly clear), we'll apply it to the original problem and obtain a result.


            Simplified problem:




            A monkey is typing capital letters from an alphabet that contains the letters A, B and X. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if AAB appears (as three successive letters) before BAA. The second squirrel wins if BAA appears before AAB. Which squirrel is more likely to win?




            If chosen this problem because it has a similarity with the original, which I believe is crucial: "AAB" can overlap with "BAA" only in the form "AABAA", while in the other way around it can be either "BAAAB" or "BAAB". Just like "COCONUT" followed by "TUNOCOC" can be overlapped only in the form "COCONUTUNOCOC", while, the other way around, they can be overlapped like "TUNOCOCOCONUT" or like "TUNOCOCONUT".



            Note however that we'll be solving the actual problem in the end. The only reason I'm including a simpler version is to explain the method on a simpler case, and because for the simpler problem we can run a simulation to verify our predictions.



            Explanation



            We must realize that at any point in the game, any of the two strings can be partly under construction, or not. More importantly, the only thing that affects the probabilities at any point in the game is how much of each string is already written. If, for example, the two last letters of the string so far are "XA", that means that the string AAB is partly written up to "A", and the string BAA is not partly written at all. This and only this is what determines which of the two squirrels is better off at this point, regardless of what has been written so far before "XA".



            Therefore the state of the game at any given point is defined exclusively by the longest suffix of the sequence written so far that is also a prefix of at least one of the strings that the squirrels are betting at.



            Thus this game has exactly seven different possible states, which are:





            • $epsilon$ (empty string, meaning no matching suffix)

            • $verb|A|$

            • $verb|AA|$

            • $verb|AAB|$

            • $verb|B|$

            • $verb|BA|$

            • $verb|BAA|$


            On each of these different states, the squirrels may have different chances of winning. Let's note $P_epsilon$ as the chances that the second squirrel has when the state is $epsilon$, $P_{A}$ as the chances that he has when the state is $verb|A|$, etc. What we're interested in is $P_epsilon$, because that's the initial state of the game.



            We can easily formulate each of this probabilities in terms of the others. From state $epsilon$ there's a 1/3 chance of writing an "A", thus moving to state A. There's also a 1/3 chance of transitioning to B, and finally a 1/3 chance of staying at e (if an "X" is written). That means that the chances for the second squirrel on state e are:



            $$
            P_{epsilon} = frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_{epsilon}
            $$



            We can similarly express the chances for all the other states. Note that $P_{AAB} = 0$, because the second squirrel loses immediately on that state, and $P_{BAA} = 1$, because it wins immediately. The full list is:



            $$
            begin{align*}
            P_epsilon &= frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{A} &= frac{1}{3} P_{AA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{AA} &= frac{1}{3} P_{AAB} + frac{1}{3} P_{AA} + frac{1}{3} P_epsilon \
            P_{AAB} &= 0 \
            P_{B} &= frac{1}{3} P_{BA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BA} &= frac{1}{3} P_{BAA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BAA} &= 1
            end{align*}
            $$



            This is simply a system of linear equations, and it's simple enough to be solved by hand. The solution is $P_epsilon = frac{8}{13}$ or approximately 61.54%. These are the chances for the second squirrel when the game starts. The chances for the first squirrel are $frac{5}{13}$, about 38.46%.



            Running the simplified problem



            Now, because this game has only three letters, the average game will be pretty short, so much in fact that we can play it enough times to have some statistics. The following program (in Java) simulates 10 million games and prints the percentage of wins for each squirrel. I've run it several times and the results are always in the range of 38.4 - 38.5% of wins for the first squirrel, and 61.5 - 61.6% of wins for the second squirrel, which matches very well with our previous calculations.



            public class Monkeys {

            private enum Squirrel {
            FIRST_SQUIRREL,
            SECOND_SQUIRREL
            }

            private static final Random random = new Random();
            private static final char letters = "ABX".toCharArray();

            public static void main(String args) {

            int winsF = 0;
            int winsS = 0;
            int iterations = 10000000;

            for (int i = 0; i < iterations; i++) {

            Squirrel winner = playGame();
            if (winner == Squirrel.FIRST_SQUIRREL) {
            winsF++;
            } else if (winner == Squirrel.SECOND_SQUIRREL) {
            winsS++;
            }
            }

            System.out.println("First squirrel wins: " + ((double) winsF) / (winsF + winsS));
            System.out.println("Second squirrel wins: " + ((double) winsS) / (winsF + winsS));

            }

            private static Squirrel playGame() {
            StringBuilder sb = new StringBuilder();
            while (true) {
            if (sb.length() >= 3) {
            if (sb.substring(sb.length() - 3, sb.length()).equals("AAB")) {
            return Squirrel.FIRST_SQUIRREL;
            }
            if (sb.substring(sb.length() - 3, sb.length()).equals("BAA")) {
            return Squirrel.SECOND_SQUIRREL;
            }
            }
            sb.append(letters[random.nextInt(letters.length)]);
            }
            }
            }


            Actual problem



            Great! So the only thing we need to do now is apply the same method for the original problem. Following the same reasoning, in the original problem there are 15 distinct states:




            • $epsilon$

            • $verb|C|$

            • $verb|CO|$

            • $verb|COC|$

            • $verb|COCO|$

            • $verb|COCON|$

            • $verb|COCONU|$

            • $verb|COCONUT|$

            • $verb|T|$

            • $verb|TU|$

            • $verb|TUN|$

            • $verb|TUNO|$

            • $verb|TUNOC|$

            • $verb|TUNOCO|$

            • $verb|TUNOCOC|$


            The chances for the second squirrel in each of the states are (thanks @kaba for pointing a previous error):



            $$
            begin{align*}
            P_epsilon &= frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{C} &= frac{1}{26} P_{CO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{CO} &= frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{COC} &= frac{1}{26} P_{COCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCO} &= frac{1}{26} P_{COCON} + frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCON} &= frac{1}{26} P_{COCONU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCONU} &= frac{1}{26} P_{COCONUT} + frac{1}{26} P_{C} + frac{24}{26} P_epsilon \
            P_{COCONUT} &= 0 \
            P_{T} &= frac{1}{26} P_{TU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TU} &= frac{1}{26} P_{TUN} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUN} &= frac{1}{26} P_{TUNO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNO} &= frac{1}{26} P_{TUNOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOC} &= frac{1}{26} P_{TUNOCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNOCO} &= frac{1}{26} P_{TUNOCOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOCOC} &= 1
            end{align*}
            $$



            Finally let's solve this system of equations. I've used SageMath for that purpose. The script is (after renaming the variables):



            var('a, b, c, d, e, f, g, h, i, j, k, l, m, n, u')

            eqn = [
            a == (1/26) * b + (1/26) * i + (24/26) * a,
            b == (1/26) * c + (1/26) * b + (1/26) * i + (23/26) * a,
            c == (1/26) * d + (1/26) * i + (24/26) * a,
            d == (1/26) * e + (1/26) * b + (1/26) * i + (23/26) * a,
            e == (1/26) * f + (1/26) * d + (1/26) * i + (23/26) * a,
            f == (1/26) * g + (1/26) * b + (1/26) * i + (23/26) * a,
            g == (1/26) * h + (1/26) * b + (24/26) * a,
            h == 0,
            i == (1/26) * j + (1/26) * b + (1/26) * i + (23/26) * a,
            j == (1/26) * k + (1/26) * b + (1/26) * i + (23/26) * a,
            k == (1/26) * l + (1/26) * b + (1/26) * i + (23/26) * a,
            l == (1/26) * m + (1/26) * i + (24/26) * a,
            m == (1/26) * n + (1/26) * b + (1/26) * i + (23/26) * a,
            n == (1/26) * u + (1/26) * i + (24/26) * a,
            u == 1
            ]

            solution = solve(eqn, a, b, c, d, e, f, g, h, i, j, k, l, m, n, u)

            print(solution)


            And the output (formatting mine, for readability):



            [
            [
            a == (308915775/617830874),
            b == (154457887/308915437),
            c == (308915749/617830874),
            d == (154457549/308915437),
            e == (308898173/617830874),
            f == (308458799/617830874),
            g == (297034399/617830874),
            h == 0,
            i == (154457888/308915437),
            j == (308915801/617830874),
            k == (308916451/617830874),
            l == (308933351/617830874),
            m == (154686375/308915437),
            n == (320797125/617830874),
            u == 1
            ]
            ]


            We see that the probability that the second squirrel will win is exactly $frac{308915775}{617830874}$ or approximately 50.0000547%






            share|improve this answer














            Problem (a) has been adequately covered in other answers.



            In this answer I'll show that in problem (b), the second squirrel has better chances.




            • First (for introduction purposes only) we'll see a simplified version of the problem, that is simple enough to be run on a simulation

            • Then we'll see how we can exactly calculate the probability that the second squirrel will win, in this simplified example. The method that we'll use does not depend on any simplification, and can be applied to the original problem as well.

            • We'll see that the result that we obtain with this method matches with the results of the simulation.

            • Having then some extra confidence in the method (not that it's really needed, since it's fairly clear), we'll apply it to the original problem and obtain a result.


            Simplified problem:




            A monkey is typing capital letters from an alphabet that contains the letters A, B and X. Two squirrels observe the sequence of letters thus generated. The first squirrel wins if AAB appears (as three successive letters) before BAA. The second squirrel wins if BAA appears before AAB. Which squirrel is more likely to win?




            If chosen this problem because it has a similarity with the original, which I believe is crucial: "AAB" can overlap with "BAA" only in the form "AABAA", while in the other way around it can be either "BAAAB" or "BAAB". Just like "COCONUT" followed by "TUNOCOC" can be overlapped only in the form "COCONUTUNOCOC", while, the other way around, they can be overlapped like "TUNOCOCOCONUT" or like "TUNOCOCONUT".



            Note however that we'll be solving the actual problem in the end. The only reason I'm including a simpler version is to explain the method on a simpler case, and because for the simpler problem we can run a simulation to verify our predictions.



            Explanation



            We must realize that at any point in the game, any of the two strings can be partly under construction, or not. More importantly, the only thing that affects the probabilities at any point in the game is how much of each string is already written. If, for example, the two last letters of the string so far are "XA", that means that the string AAB is partly written up to "A", and the string BAA is not partly written at all. This and only this is what determines which of the two squirrels is better off at this point, regardless of what has been written so far before "XA".



            Therefore the state of the game at any given point is defined exclusively by the longest suffix of the sequence written so far that is also a prefix of at least one of the strings that the squirrels are betting at.



            Thus this game has exactly seven different possible states, which are:





            • $epsilon$ (empty string, meaning no matching suffix)

            • $verb|A|$

            • $verb|AA|$

            • $verb|AAB|$

            • $verb|B|$

            • $verb|BA|$

            • $verb|BAA|$


            On each of these different states, the squirrels may have different chances of winning. Let's note $P_epsilon$ as the chances that the second squirrel has when the state is $epsilon$, $P_{A}$ as the chances that he has when the state is $verb|A|$, etc. What we're interested in is $P_epsilon$, because that's the initial state of the game.



            We can easily formulate each of this probabilities in terms of the others. From state $epsilon$ there's a 1/3 chance of writing an "A", thus moving to state A. There's also a 1/3 chance of transitioning to B, and finally a 1/3 chance of staying at e (if an "X" is written). That means that the chances for the second squirrel on state e are:



            $$
            P_{epsilon} = frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_{epsilon}
            $$



            We can similarly express the chances for all the other states. Note that $P_{AAB} = 0$, because the second squirrel loses immediately on that state, and $P_{BAA} = 1$, because it wins immediately. The full list is:



            $$
            begin{align*}
            P_epsilon &= frac{1}{3} P_{A} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{A} &= frac{1}{3} P_{AA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{AA} &= frac{1}{3} P_{AAB} + frac{1}{3} P_{AA} + frac{1}{3} P_epsilon \
            P_{AAB} &= 0 \
            P_{B} &= frac{1}{3} P_{BA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BA} &= frac{1}{3} P_{BAA} + frac{1}{3} P_{B} + frac{1}{3} P_epsilon \
            P_{BAA} &= 1
            end{align*}
            $$



            This is simply a system of linear equations, and it's simple enough to be solved by hand. The solution is $P_epsilon = frac{8}{13}$ or approximately 61.54%. These are the chances for the second squirrel when the game starts. The chances for the first squirrel are $frac{5}{13}$, about 38.46%.



            Running the simplified problem



            Now, because this game has only three letters, the average game will be pretty short, so much in fact that we can play it enough times to have some statistics. The following program (in Java) simulates 10 million games and prints the percentage of wins for each squirrel. I've run it several times and the results are always in the range of 38.4 - 38.5% of wins for the first squirrel, and 61.5 - 61.6% of wins for the second squirrel, which matches very well with our previous calculations.



            public class Monkeys {

            private enum Squirrel {
            FIRST_SQUIRREL,
            SECOND_SQUIRREL
            }

            private static final Random random = new Random();
            private static final char letters = "ABX".toCharArray();

            public static void main(String args) {

            int winsF = 0;
            int winsS = 0;
            int iterations = 10000000;

            for (int i = 0; i < iterations; i++) {

            Squirrel winner = playGame();
            if (winner == Squirrel.FIRST_SQUIRREL) {
            winsF++;
            } else if (winner == Squirrel.SECOND_SQUIRREL) {
            winsS++;
            }
            }

            System.out.println("First squirrel wins: " + ((double) winsF) / (winsF + winsS));
            System.out.println("Second squirrel wins: " + ((double) winsS) / (winsF + winsS));

            }

            private static Squirrel playGame() {
            StringBuilder sb = new StringBuilder();
            while (true) {
            if (sb.length() >= 3) {
            if (sb.substring(sb.length() - 3, sb.length()).equals("AAB")) {
            return Squirrel.FIRST_SQUIRREL;
            }
            if (sb.substring(sb.length() - 3, sb.length()).equals("BAA")) {
            return Squirrel.SECOND_SQUIRREL;
            }
            }
            sb.append(letters[random.nextInt(letters.length)]);
            }
            }
            }


            Actual problem



            Great! So the only thing we need to do now is apply the same method for the original problem. Following the same reasoning, in the original problem there are 15 distinct states:




            • $epsilon$

            • $verb|C|$

            • $verb|CO|$

            • $verb|COC|$

            • $verb|COCO|$

            • $verb|COCON|$

            • $verb|COCONU|$

            • $verb|COCONUT|$

            • $verb|T|$

            • $verb|TU|$

            • $verb|TUN|$

            • $verb|TUNO|$

            • $verb|TUNOC|$

            • $verb|TUNOCO|$

            • $verb|TUNOCOC|$


            The chances for the second squirrel in each of the states are (thanks @kaba for pointing a previous error):



            $$
            begin{align*}
            P_epsilon &= frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{C} &= frac{1}{26} P_{CO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{CO} &= frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{COC} &= frac{1}{26} P_{COCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCO} &= frac{1}{26} P_{COCON} + frac{1}{26} P_{COC} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCON} &= frac{1}{26} P_{COCONU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{COCONU} &= frac{1}{26} P_{COCONUT} + frac{1}{26} P_{C} + frac{24}{26} P_epsilon \
            P_{COCONUT} &= 0 \
            P_{T} &= frac{1}{26} P_{TU} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TU} &= frac{1}{26} P_{TUN} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUN} &= frac{1}{26} P_{TUNO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNO} &= frac{1}{26} P_{TUNOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOC} &= frac{1}{26} P_{TUNOCO} + frac{1}{26} P_{C} + frac{1}{26} P_{T} + frac{23}{26} P_epsilon \
            P_{TUNOCO} &= frac{1}{26} P_{TUNOCOC} + frac{1}{26} P_{T} + frac{24}{26} P_epsilon \
            P_{TUNOCOC} &= 1
            end{align*}
            $$



            Finally let's solve this system of equations. I've used SageMath for that purpose. The script is (after renaming the variables):



            var('a, b, c, d, e, f, g, h, i, j, k, l, m, n, u')

            eqn = [
            a == (1/26) * b + (1/26) * i + (24/26) * a,
            b == (1/26) * c + (1/26) * b + (1/26) * i + (23/26) * a,
            c == (1/26) * d + (1/26) * i + (24/26) * a,
            d == (1/26) * e + (1/26) * b + (1/26) * i + (23/26) * a,
            e == (1/26) * f + (1/26) * d + (1/26) * i + (23/26) * a,
            f == (1/26) * g + (1/26) * b + (1/26) * i + (23/26) * a,
            g == (1/26) * h + (1/26) * b + (24/26) * a,
            h == 0,
            i == (1/26) * j + (1/26) * b + (1/26) * i + (23/26) * a,
            j == (1/26) * k + (1/26) * b + (1/26) * i + (23/26) * a,
            k == (1/26) * l + (1/26) * b + (1/26) * i + (23/26) * a,
            l == (1/26) * m + (1/26) * i + (24/26) * a,
            m == (1/26) * n + (1/26) * b + (1/26) * i + (23/26) * a,
            n == (1/26) * u + (1/26) * i + (24/26) * a,
            u == 1
            ]

            solution = solve(eqn, a, b, c, d, e, f, g, h, i, j, k, l, m, n, u)

            print(solution)


            And the output (formatting mine, for readability):



            [
            [
            a == (308915775/617830874),
            b == (154457887/308915437),
            c == (308915749/617830874),
            d == (154457549/308915437),
            e == (308898173/617830874),
            f == (308458799/617830874),
            g == (297034399/617830874),
            h == 0,
            i == (154457888/308915437),
            j == (308915801/617830874),
            k == (308916451/617830874),
            l == (308933351/617830874),
            m == (154686375/308915437),
            n == (320797125/617830874),
            u == 1
            ]
            ]


            We see that the probability that the second squirrel will win is exactly $frac{308915775}{617830874}$ or approximately 50.0000547%







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 27 at 22:29

























            answered Nov 27 at 1:34









            abl

            2064




            2064








            • 1




              The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
              – kaba
              Nov 27 at 13:04










            • Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
              – kaba
              Nov 27 at 13:30












            • @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
              – abl
              Nov 27 at 13:38












            • Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
              – kaba
              Nov 27 at 13:42














            • 1




              The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
              – kaba
              Nov 27 at 13:04










            • Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
              – kaba
              Nov 27 at 13:30












            • @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
              – abl
              Nov 27 at 13:38












            • Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
              – kaba
              Nov 27 at 13:42








            1




            1




            The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
            – kaba
            Nov 27 at 13:04




            The coefficients you use to relate different $P_{?}$ correspond to the transition matrix $P3$ in my answer. I think you have a typo for $P_{COCO}$ where $P_C$ should be replaced with $P_{COC}$.
            – kaba
            Nov 27 at 13:04












            Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
            – kaba
            Nov 27 at 13:30






            Using notation from my answer, you are solving the equation $P3 begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix}Q3 && R3 \ 0 && I end{bmatrix} begin{bmatrix} x_T \ x_A end{bmatrix} = begin{bmatrix} x_T \ 0 \ 1 end{bmatrix}$ which is equivalent to $x_A = begin{bmatrix} 0 \ 1 end{bmatrix}$ and $x_T = (I - Q3)^{-1} R3 begin{bmatrix} 0 \ 1 end{bmatrix}$.
            – kaba
            Nov 27 at 13:30














            @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
            – abl
            Nov 27 at 13:38






            @kaba you're right, thanks for spotting that! It's not really a typo but rather a mistake that carries on to the result too. That probably explains the difference between my result and yours. I will fix it later.
            – abl
            Nov 27 at 13:38














            Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
            – kaba
            Nov 27 at 13:42




            Our solutions are equivalent, and therefore should be equal. I like how leaving it in the linear equation form, as you have, is clearer than solving explicitly using matrix inverse.
            – kaba
            Nov 27 at 13:42










            up vote
            4
            down vote













            I'll try to make the argument that the answer to both parts is:




            The two are just as likely.




            We start from the simple fact that the probability that a window of 7 letters reads COCONUT, or TUNOCOC, or anything else, is equal. (It's 26^(-7), as @Ingix said.). Crucially, it's completely independent of any letters that appear before or after the window, simply because these letters are not in the window...



            Now, let's start with (b). We look at letters 1-7. If they say COCONUT, the first squirrel wins. If they say TUNOCOC (at the same probability!), the second one wins. If they say neither, we move on to letters 2-8 and do the same. At this point, letter 1 is completely irrelevant. You see how we move on and on through the letters, until there's a winner, and the chances of either squirrels to win never differ from each other.



            Further clarification:




            • Of course, if letters 2-7 happened to be COCONU, the first squirrel is in a very good position. Similarly (and in the same probability), if they were TUNOCO, the second squirrel is in a good position. But this is symmetrical, so no one has an overall advantage.

            • Of course, if letters 1-7 were TUNOCOC, the chance to get COCONUT at letters 5-11 increases, but this doesn't matter since, in this instance, the game has already finished...

            • In other words, if we were to ask "Which of the two words are we more likely to see at letters 5-11?" (see @abl's comment below), the answer would be TUNOCOC. But the difference is only due to a situation in which the game has already finished. And this is just not the question in hand.


            Part (a) is a very similar. We look at letters 1-7 of both monkeys. If the first monkey typed COCONUT, it wins. If the second one types TUNOCOC, it wins. If neither won, we move on to look at letters 2-8 of both, and so on.



            No Markov chains :)






            share|improve this answer



















            • 1




              Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
              – abl
              Nov 25 at 0:11












            • @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
              – Angkor
              Nov 25 at 13:37








            • 4




              @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
              – abl
              Nov 25 at 16:33






            • 2




              @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
              – abl
              Nov 25 at 20:43






            • 1




              @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
              – abl
              Nov 25 at 20:49















            up vote
            4
            down vote













            I'll try to make the argument that the answer to both parts is:




            The two are just as likely.




            We start from the simple fact that the probability that a window of 7 letters reads COCONUT, or TUNOCOC, or anything else, is equal. (It's 26^(-7), as @Ingix said.). Crucially, it's completely independent of any letters that appear before or after the window, simply because these letters are not in the window...



            Now, let's start with (b). We look at letters 1-7. If they say COCONUT, the first squirrel wins. If they say TUNOCOC (at the same probability!), the second one wins. If they say neither, we move on to letters 2-8 and do the same. At this point, letter 1 is completely irrelevant. You see how we move on and on through the letters, until there's a winner, and the chances of either squirrels to win never differ from each other.



            Further clarification:




            • Of course, if letters 2-7 happened to be COCONU, the first squirrel is in a very good position. Similarly (and in the same probability), if they were TUNOCO, the second squirrel is in a good position. But this is symmetrical, so no one has an overall advantage.

            • Of course, if letters 1-7 were TUNOCOC, the chance to get COCONUT at letters 5-11 increases, but this doesn't matter since, in this instance, the game has already finished...

            • In other words, if we were to ask "Which of the two words are we more likely to see at letters 5-11?" (see @abl's comment below), the answer would be TUNOCOC. But the difference is only due to a situation in which the game has already finished. And this is just not the question in hand.


            Part (a) is a very similar. We look at letters 1-7 of both monkeys. If the first monkey typed COCONUT, it wins. If the second one types TUNOCOC, it wins. If neither won, we move on to look at letters 2-8 of both, and so on.



            No Markov chains :)






            share|improve this answer



















            • 1




              Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
              – abl
              Nov 25 at 0:11












            • @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
              – Angkor
              Nov 25 at 13:37








            • 4




              @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
              – abl
              Nov 25 at 16:33






            • 2




              @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
              – abl
              Nov 25 at 20:43






            • 1




              @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
              – abl
              Nov 25 at 20:49













            up vote
            4
            down vote










            up vote
            4
            down vote









            I'll try to make the argument that the answer to both parts is:




            The two are just as likely.




            We start from the simple fact that the probability that a window of 7 letters reads COCONUT, or TUNOCOC, or anything else, is equal. (It's 26^(-7), as @Ingix said.). Crucially, it's completely independent of any letters that appear before or after the window, simply because these letters are not in the window...



            Now, let's start with (b). We look at letters 1-7. If they say COCONUT, the first squirrel wins. If they say TUNOCOC (at the same probability!), the second one wins. If they say neither, we move on to letters 2-8 and do the same. At this point, letter 1 is completely irrelevant. You see how we move on and on through the letters, until there's a winner, and the chances of either squirrels to win never differ from each other.



            Further clarification:




            • Of course, if letters 2-7 happened to be COCONU, the first squirrel is in a very good position. Similarly (and in the same probability), if they were TUNOCO, the second squirrel is in a good position. But this is symmetrical, so no one has an overall advantage.

            • Of course, if letters 1-7 were TUNOCOC, the chance to get COCONUT at letters 5-11 increases, but this doesn't matter since, in this instance, the game has already finished...

            • In other words, if we were to ask "Which of the two words are we more likely to see at letters 5-11?" (see @abl's comment below), the answer would be TUNOCOC. But the difference is only due to a situation in which the game has already finished. And this is just not the question in hand.


            Part (a) is a very similar. We look at letters 1-7 of both monkeys. If the first monkey typed COCONUT, it wins. If the second one types TUNOCOC, it wins. If neither won, we move on to look at letters 2-8 of both, and so on.



            No Markov chains :)






            share|improve this answer














            I'll try to make the argument that the answer to both parts is:




            The two are just as likely.




            We start from the simple fact that the probability that a window of 7 letters reads COCONUT, or TUNOCOC, or anything else, is equal. (It's 26^(-7), as @Ingix said.). Crucially, it's completely independent of any letters that appear before or after the window, simply because these letters are not in the window...



            Now, let's start with (b). We look at letters 1-7. If they say COCONUT, the first squirrel wins. If they say TUNOCOC (at the same probability!), the second one wins. If they say neither, we move on to letters 2-8 and do the same. At this point, letter 1 is completely irrelevant. You see how we move on and on through the letters, until there's a winner, and the chances of either squirrels to win never differ from each other.



            Further clarification:




            • Of course, if letters 2-7 happened to be COCONU, the first squirrel is in a very good position. Similarly (and in the same probability), if they were TUNOCO, the second squirrel is in a good position. But this is symmetrical, so no one has an overall advantage.

            • Of course, if letters 1-7 were TUNOCOC, the chance to get COCONUT at letters 5-11 increases, but this doesn't matter since, in this instance, the game has already finished...

            • In other words, if we were to ask "Which of the two words are we more likely to see at letters 5-11?" (see @abl's comment below), the answer would be TUNOCOC. But the difference is only due to a situation in which the game has already finished. And this is just not the question in hand.


            Part (a) is a very similar. We look at letters 1-7 of both monkeys. If the first monkey typed COCONUT, it wins. If the second one types TUNOCOC, it wins. If neither won, we move on to look at letters 2-8 of both, and so on.



            No Markov chains :)







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 25 at 20:29

























            answered Nov 24 at 22:27









            Angkor

            1,149316




            1,149316








            • 1




              Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
              – abl
              Nov 25 at 0:11












            • @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
              – Angkor
              Nov 25 at 13:37








            • 4




              @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
              – abl
              Nov 25 at 16:33






            • 2




              @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
              – abl
              Nov 25 at 20:43






            • 1




              @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
              – abl
              Nov 25 at 20:49














            • 1




              Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
              – abl
              Nov 25 at 0:11












            • @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
              – Angkor
              Nov 25 at 13:37








            • 4




              @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
              – abl
              Nov 25 at 16:33






            • 2




              @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
              – abl
              Nov 25 at 20:43






            • 1




              @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
              – abl
              Nov 25 at 20:49








            1




            1




            Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
            – abl
            Nov 25 at 0:11






            Except that in (b) the probabilities are not independent across windows. The probability of letters 5-11 being COCONUT, conditional to the fact that you reached this window, are lower than the equivalent for TUNOCOC, simply because you have to account for the case "TUNOCOCONUT" (in which case you wouldn't reach window 5-11). Maybe you can believe the discussion now :)
            – abl
            Nov 25 at 0:11














            @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
            – Angkor
            Nov 25 at 13:37






            @abl Sorry, but this is just not true... Are you saying that, typing randomly, one is more likely to type COCONUT than TUNOCOC? These are just two 7-letter sequences, and they're just as likely. Arguments like "consider the case X" are not really valid, because there are so many other cases to consider too. The trick is to be able to consider all cases together with a general argument.
            – Angkor
            Nov 25 at 13:37






            4




            4




            @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
            – abl
            Nov 25 at 16:33




            @Angkor Maybe I didn't explain myself well. Consider all possible sequences that start with "****TUNOCOC...". It's obvious that there are just as many of those as sequences of the form "****COCONUT...", i.e. both situations are equally probable, right? Note that all sequences of the form "****TUNOCOC..." are a win for squirrel 2. And all sequences of the form "****COCONUT..." are a win for squirrel 1, except for those that start with "TUNOCOCONUT..." which are a win for squirrel 2. There's an asymmetry here, so at the very least we can say that those are not "just two 7-letter sequences".
            – abl
            Nov 25 at 16:33




            2




            2




            @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
            – abl
            Nov 25 at 20:43




            @Angkor Point taken, I apologize if I was out of tone. Imagine that the alphabet has only two letters, A and B. The squirrels bet for AAB and BAA respectively. The first three letters are typed. Each squirrel has 1/8 chances of winning at the third letter. If none of them wins, then the sequence is one of AAA, ABA, ABB, BAB, BBA, BBB, with equal probability. At the fourth letter, the possibilities (with equal probability) are AAAA, AAAB, ABAA, ABAB, ABBA, ABBB, BABA, BABB, BBAA, BBAB, BBBA, BBBB. Squirrel 1 wins with AAAB. Squirrel 2 wins with ABAA or BBAA, twice the chance.
            – abl
            Nov 25 at 20:43




            1




            1




            @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
            – abl
            Nov 25 at 20:49




            @Angkor, also, this simplified case is simple enough to be simulated, which I did (see my answer) and the simulation over several million games shows the second squirrel has better chances.
            – abl
            Nov 25 at 20:49










            up vote
            3
            down vote













            I have tried to create a Markov chain modelling problem a). I haven't even tried to solve it, but the model (i.e. the states and the transitions) should be correct (with one caveat which I'll explain at the end).



            First, a simplification: I have reduced the problem to getting either the string A-A-B (as in CO-CO-NUT) or B-A-A (as in TUN-OC-OC), using an alphabet that only includes 3 letters: A, B, and C.



            Here's what I've got. This is for AAB:




            Markov chain for AAB




            As you can see, I have separated the state "Only one A" from the state "AA". It is noteworthy that there is no way to go from "AA" to "A".



            And this is for BAA:




            enter image description here




            I'm not sure what this leads to. I was hoping there would be some visible hint about the solution (like: both chains are very similar, with just one difference that tilts the balance in favour of one of them), but this doesn't seem to be the case.



            To be honest, I'm not even sure my simplification from COCONUT to AAB is correct, because in my case A and B are equally likely (both of them are just a letter which is typed with probability P = 1/3), whereas CO and NUT are not (CO is only 2 letters long, NUT is 3). This means that solving my chains could, or could not, help solving the original problem.



            Hopefully someone can pick up from here.






            share|improve this answer

























              up vote
              3
              down vote













              I have tried to create a Markov chain modelling problem a). I haven't even tried to solve it, but the model (i.e. the states and the transitions) should be correct (with one caveat which I'll explain at the end).



              First, a simplification: I have reduced the problem to getting either the string A-A-B (as in CO-CO-NUT) or B-A-A (as in TUN-OC-OC), using an alphabet that only includes 3 letters: A, B, and C.



              Here's what I've got. This is for AAB:




              Markov chain for AAB




              As you can see, I have separated the state "Only one A" from the state "AA". It is noteworthy that there is no way to go from "AA" to "A".



              And this is for BAA:




              enter image description here




              I'm not sure what this leads to. I was hoping there would be some visible hint about the solution (like: both chains are very similar, with just one difference that tilts the balance in favour of one of them), but this doesn't seem to be the case.



              To be honest, I'm not even sure my simplification from COCONUT to AAB is correct, because in my case A and B are equally likely (both of them are just a letter which is typed with probability P = 1/3), whereas CO and NUT are not (CO is only 2 letters long, NUT is 3). This means that solving my chains could, or could not, help solving the original problem.



              Hopefully someone can pick up from here.






              share|improve this answer























                up vote
                3
                down vote










                up vote
                3
                down vote









                I have tried to create a Markov chain modelling problem a). I haven't even tried to solve it, but the model (i.e. the states and the transitions) should be correct (with one caveat which I'll explain at the end).



                First, a simplification: I have reduced the problem to getting either the string A-A-B (as in CO-CO-NUT) or B-A-A (as in TUN-OC-OC), using an alphabet that only includes 3 letters: A, B, and C.



                Here's what I've got. This is for AAB:




                Markov chain for AAB




                As you can see, I have separated the state "Only one A" from the state "AA". It is noteworthy that there is no way to go from "AA" to "A".



                And this is for BAA:




                enter image description here




                I'm not sure what this leads to. I was hoping there would be some visible hint about the solution (like: both chains are very similar, with just one difference that tilts the balance in favour of one of them), but this doesn't seem to be the case.



                To be honest, I'm not even sure my simplification from COCONUT to AAB is correct, because in my case A and B are equally likely (both of them are just a letter which is typed with probability P = 1/3), whereas CO and NUT are not (CO is only 2 letters long, NUT is 3). This means that solving my chains could, or could not, help solving the original problem.



                Hopefully someone can pick up from here.






                share|improve this answer












                I have tried to create a Markov chain modelling problem a). I haven't even tried to solve it, but the model (i.e. the states and the transitions) should be correct (with one caveat which I'll explain at the end).



                First, a simplification: I have reduced the problem to getting either the string A-A-B (as in CO-CO-NUT) or B-A-A (as in TUN-OC-OC), using an alphabet that only includes 3 letters: A, B, and C.



                Here's what I've got. This is for AAB:




                Markov chain for AAB




                As you can see, I have separated the state "Only one A" from the state "AA". It is noteworthy that there is no way to go from "AA" to "A".



                And this is for BAA:




                enter image description here




                I'm not sure what this leads to. I was hoping there would be some visible hint about the solution (like: both chains are very similar, with just one difference that tilts the balance in favour of one of them), but this doesn't seem to be the case.



                To be honest, I'm not even sure my simplification from COCONUT to AAB is correct, because in my case A and B are equally likely (both of them are just a letter which is typed with probability P = 1/3), whereas CO and NUT are not (CO is only 2 letters long, NUT is 3). This means that solving my chains could, or could not, help solving the original problem.



                Hopefully someone can pick up from here.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Nov 24 at 2:37









                Fabio Turati

                1756




                1756






















                    up vote
                    2
                    down vote













                    I tried to make a C# program to do that for me:



                    class Program
                    {
                    static void Main(string args)
                    {
                    Log("Le scimmie si preparano alla sfida");

                    Monkey monkey1 = new Monkey("COCONUT");
                    Monkey monkey2 = new Monkey("TUNOCOC");
                    int counter = 0;

                    while (true)
                    {
                    counter++;
                    Log("Le scimmie digitano una lettera per la {0}° volta.", counter);

                    monkey1.PressLetter();
                    monkey2.PressLetter();

                    if(monkey1.IsTargetReached() && monkey2.IsTargetReached())
                    {
                    Log("Entrambe le scimmie hanno vinto!");
                    break;
                    }
                    else if (monkey1.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey1.Target);
                    break;
                    }
                    else if (monkey2.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey2.Target);
                    break;
                    }
                    }
                    Log("Le scimmie si prendono il meritato riposo dopo {0} turni!", counter);
                    }

                    private static void Log(string message, params object args)
                    {
                    Console.WriteLine(DateTime.Now + " - " + string.Format(message, args));
                    }
                    }

                    public class Monkey
                    {
                    private Random Random { get; set; }
                    public string Target { get; set; }
                    public string Text { get; set; }

                    public Monkey(string target)
                    {
                    Target = target;
                    Text = "";
                    Random = new Random();
                    }

                    public void PressLetter()
                    {
                    int num = Random.Next(0, 26);
                    char let = (char)('a' + num);
                    string lastLetter = let.ToString().ToUpper();
                    Text += lastLetter;
                    }

                    public bool IsTargetReached()
                    {
                    return Text.IndexOf(Target) > -1;
                    }
                    }



                    After millions of Monkeys' tries, I ended up thinking that they have
                    the same possibilities.




                    Anyway my computer still didn't successfully solve the problem.






                    share|improve this answer

















                    • 2




                      That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                      – Sneftel
                      Nov 23 at 15:57






                    • 3




                      You might want to reduce the alphabet you are using...
                      – Evargalo
                      Nov 23 at 17:41






                    • 1




                      +1; this works if you reduce the alphabet.
                      – Tim C
                      Nov 25 at 3:01










                    • May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                      – Drag and Drop
                      Nov 27 at 13:44










                    • @DragandDrop: Loren Ipsum? Is she an actress?
                      – Laurent LA RIZZA
                      Nov 27 at 13:53















                    up vote
                    2
                    down vote













                    I tried to make a C# program to do that for me:



                    class Program
                    {
                    static void Main(string args)
                    {
                    Log("Le scimmie si preparano alla sfida");

                    Monkey monkey1 = new Monkey("COCONUT");
                    Monkey monkey2 = new Monkey("TUNOCOC");
                    int counter = 0;

                    while (true)
                    {
                    counter++;
                    Log("Le scimmie digitano una lettera per la {0}° volta.", counter);

                    monkey1.PressLetter();
                    monkey2.PressLetter();

                    if(monkey1.IsTargetReached() && monkey2.IsTargetReached())
                    {
                    Log("Entrambe le scimmie hanno vinto!");
                    break;
                    }
                    else if (monkey1.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey1.Target);
                    break;
                    }
                    else if (monkey2.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey2.Target);
                    break;
                    }
                    }
                    Log("Le scimmie si prendono il meritato riposo dopo {0} turni!", counter);
                    }

                    private static void Log(string message, params object args)
                    {
                    Console.WriteLine(DateTime.Now + " - " + string.Format(message, args));
                    }
                    }

                    public class Monkey
                    {
                    private Random Random { get; set; }
                    public string Target { get; set; }
                    public string Text { get; set; }

                    public Monkey(string target)
                    {
                    Target = target;
                    Text = "";
                    Random = new Random();
                    }

                    public void PressLetter()
                    {
                    int num = Random.Next(0, 26);
                    char let = (char)('a' + num);
                    string lastLetter = let.ToString().ToUpper();
                    Text += lastLetter;
                    }

                    public bool IsTargetReached()
                    {
                    return Text.IndexOf(Target) > -1;
                    }
                    }



                    After millions of Monkeys' tries, I ended up thinking that they have
                    the same possibilities.




                    Anyway my computer still didn't successfully solve the problem.






                    share|improve this answer

















                    • 2




                      That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                      – Sneftel
                      Nov 23 at 15:57






                    • 3




                      You might want to reduce the alphabet you are using...
                      – Evargalo
                      Nov 23 at 17:41






                    • 1




                      +1; this works if you reduce the alphabet.
                      – Tim C
                      Nov 25 at 3:01










                    • May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                      – Drag and Drop
                      Nov 27 at 13:44










                    • @DragandDrop: Loren Ipsum? Is she an actress?
                      – Laurent LA RIZZA
                      Nov 27 at 13:53













                    up vote
                    2
                    down vote










                    up vote
                    2
                    down vote









                    I tried to make a C# program to do that for me:



                    class Program
                    {
                    static void Main(string args)
                    {
                    Log("Le scimmie si preparano alla sfida");

                    Monkey monkey1 = new Monkey("COCONUT");
                    Monkey monkey2 = new Monkey("TUNOCOC");
                    int counter = 0;

                    while (true)
                    {
                    counter++;
                    Log("Le scimmie digitano una lettera per la {0}° volta.", counter);

                    monkey1.PressLetter();
                    monkey2.PressLetter();

                    if(monkey1.IsTargetReached() && monkey2.IsTargetReached())
                    {
                    Log("Entrambe le scimmie hanno vinto!");
                    break;
                    }
                    else if (monkey1.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey1.Target);
                    break;
                    }
                    else if (monkey2.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey2.Target);
                    break;
                    }
                    }
                    Log("Le scimmie si prendono il meritato riposo dopo {0} turni!", counter);
                    }

                    private static void Log(string message, params object args)
                    {
                    Console.WriteLine(DateTime.Now + " - " + string.Format(message, args));
                    }
                    }

                    public class Monkey
                    {
                    private Random Random { get; set; }
                    public string Target { get; set; }
                    public string Text { get; set; }

                    public Monkey(string target)
                    {
                    Target = target;
                    Text = "";
                    Random = new Random();
                    }

                    public void PressLetter()
                    {
                    int num = Random.Next(0, 26);
                    char let = (char)('a' + num);
                    string lastLetter = let.ToString().ToUpper();
                    Text += lastLetter;
                    }

                    public bool IsTargetReached()
                    {
                    return Text.IndexOf(Target) > -1;
                    }
                    }



                    After millions of Monkeys' tries, I ended up thinking that they have
                    the same possibilities.




                    Anyway my computer still didn't successfully solve the problem.






                    share|improve this answer












                    I tried to make a C# program to do that for me:



                    class Program
                    {
                    static void Main(string args)
                    {
                    Log("Le scimmie si preparano alla sfida");

                    Monkey monkey1 = new Monkey("COCONUT");
                    Monkey monkey2 = new Monkey("TUNOCOC");
                    int counter = 0;

                    while (true)
                    {
                    counter++;
                    Log("Le scimmie digitano una lettera per la {0}° volta.", counter);

                    monkey1.PressLetter();
                    monkey2.PressLetter();

                    if(monkey1.IsTargetReached() && monkey2.IsTargetReached())
                    {
                    Log("Entrambe le scimmie hanno vinto!");
                    break;
                    }
                    else if (monkey1.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey1.Target);
                    break;
                    }
                    else if (monkey2.IsTargetReached())
                    {
                    Log("La scimmia "{0}" ha vinto!", monkey2.Target);
                    break;
                    }
                    }
                    Log("Le scimmie si prendono il meritato riposo dopo {0} turni!", counter);
                    }

                    private static void Log(string message, params object args)
                    {
                    Console.WriteLine(DateTime.Now + " - " + string.Format(message, args));
                    }
                    }

                    public class Monkey
                    {
                    private Random Random { get; set; }
                    public string Target { get; set; }
                    public string Text { get; set; }

                    public Monkey(string target)
                    {
                    Target = target;
                    Text = "";
                    Random = new Random();
                    }

                    public void PressLetter()
                    {
                    int num = Random.Next(0, 26);
                    char let = (char)('a' + num);
                    string lastLetter = let.ToString().ToUpper();
                    Text += lastLetter;
                    }

                    public bool IsTargetReached()
                    {
                    return Text.IndexOf(Target) > -1;
                    }
                    }



                    After millions of Monkeys' tries, I ended up thinking that they have
                    the same possibilities.




                    Anyway my computer still didn't successfully solve the problem.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Nov 23 at 15:28









                    Marco Salerno

                    1292




                    1292








                    • 2




                      That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                      – Sneftel
                      Nov 23 at 15:57






                    • 3




                      You might want to reduce the alphabet you are using...
                      – Evargalo
                      Nov 23 at 17:41






                    • 1




                      +1; this works if you reduce the alphabet.
                      – Tim C
                      Nov 25 at 3:01










                    • May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                      – Drag and Drop
                      Nov 27 at 13:44










                    • @DragandDrop: Loren Ipsum? Is she an actress?
                      – Laurent LA RIZZA
                      Nov 27 at 13:53














                    • 2




                      That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                      – Sneftel
                      Nov 23 at 15:57






                    • 3




                      You might want to reduce the alphabet you are using...
                      – Evargalo
                      Nov 23 at 17:41






                    • 1




                      +1; this works if you reduce the alphabet.
                      – Tim C
                      Nov 25 at 3:01










                    • May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                      – Drag and Drop
                      Nov 27 at 13:44










                    • @DragandDrop: Loren Ipsum? Is she an actress?
                      – Laurent LA RIZZA
                      Nov 27 at 13:53








                    2




                    2




                    That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                    – Sneftel
                    Nov 23 at 15:57




                    That's not surprising. The expected number of keystrokes before either monkey wins is well into the billions.
                    – Sneftel
                    Nov 23 at 15:57




                    3




                    3




                    You might want to reduce the alphabet you are using...
                    – Evargalo
                    Nov 23 at 17:41




                    You might want to reduce the alphabet you are using...
                    – Evargalo
                    Nov 23 at 17:41




                    1




                    1




                    +1; this works if you reduce the alphabet.
                    – Tim C
                    Nov 25 at 3:01




                    +1; this works if you reduce the alphabet.
                    – Tim C
                    Nov 25 at 3:01












                    May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                    – Drag and Drop
                    Nov 27 at 13:44




                    May I advice English comment? As a non native speaker I read every latin language as loren ipsum.
                    – Drag and Drop
                    Nov 27 at 13:44












                    @DragandDrop: Loren Ipsum? Is she an actress?
                    – Laurent LA RIZZA
                    Nov 27 at 13:53




                    @DragandDrop: Loren Ipsum? Is she an actress?
                    – Laurent LA RIZZA
                    Nov 27 at 13:53










                    up vote
                    1
                    down vote













                    There's good arguments for both sides. Here's code that prove it definitely:



                    import numpy as np

                    def main():
                    m1 = np.array([
                    [25, 1, 0, 0, 0, 0, 0, 0],
                    [24, 1, 1, 0, 0, 0, 0, 0],
                    [25, 0, 0, 1, 0, 0, 0, 0],
                    [24, 1, 0, 0, 1, 0, 0, 0],
                    [24, 0, 0, 1, 0, 1, 0, 0],
                    [24, 1, 0, 0, 0, 0, 1, 0],
                    [24, 1, 0, 0, 0, 0, 0, 1],
                    [0, 0, 0, 0, 0, 0, 0, 26]], dtype='object')

                    m2 = np.array([
                    [25, 1, 0, 0, 0, 0, 0, 0],
                    [24, 1, 1, 0, 0, 0, 0, 0],
                    [24, 1, 0, 1, 0, 0, 0, 0],
                    [24, 1, 0, 0, 1, 0, 0, 0],
                    [24, 1, 0, 0, 0, 1, 0, 0],
                    [24, 1, 0, 0, 0, 0, 1, 0],
                    [24, 1, 0, 0, 0, 0, 0, 1],
                    [0, 0, 0, 0, 0, 0, 0, 26]], dtype = 'object')

                    p1 = np.array([1,0,0,0,0,0,0,0])
                    p2 = np.array([1,0,0,0,0,0,0,0])

                    for i in range(1000):
                    p1 = p1.dot(m1)
                    p2 = p2.dot(m2)

                    #this shows that how often the two monkes have 1,2,3 letters correct vary
                    #but it doesn't affact past the 4th letter
                    #print(i,p2 - p1)

                    if p1[7] == p2[7]:
                    print("after {} steps, both monkeys have equal chances".format(i))
                    elif p2[7] > p1[7]:
                    print("crap, this whole arguments just got destroyed")
                    else:
                    print("after {} steps, coconut monkey has {} more ways to have won".format(i, p1[7] - p2[7]))

                    main()


                    I ran it up to:




                    after 99999 steps, both monkeys have equal chances




                    No matter the number of steps, both monkeys have equal chances. Even though one monkey is more likely to have exactly the last 1,2, or 3 letters correct.



                    (mind-boggling)






                    share|improve this answer

























                      up vote
                      1
                      down vote













                      There's good arguments for both sides. Here's code that prove it definitely:



                      import numpy as np

                      def main():
                      m1 = np.array([
                      [25, 1, 0, 0, 0, 0, 0, 0],
                      [24, 1, 1, 0, 0, 0, 0, 0],
                      [25, 0, 0, 1, 0, 0, 0, 0],
                      [24, 1, 0, 0, 1, 0, 0, 0],
                      [24, 0, 0, 1, 0, 1, 0, 0],
                      [24, 1, 0, 0, 0, 0, 1, 0],
                      [24, 1, 0, 0, 0, 0, 0, 1],
                      [0, 0, 0, 0, 0, 0, 0, 26]], dtype='object')

                      m2 = np.array([
                      [25, 1, 0, 0, 0, 0, 0, 0],
                      [24, 1, 1, 0, 0, 0, 0, 0],
                      [24, 1, 0, 1, 0, 0, 0, 0],
                      [24, 1, 0, 0, 1, 0, 0, 0],
                      [24, 1, 0, 0, 0, 1, 0, 0],
                      [24, 1, 0, 0, 0, 0, 1, 0],
                      [24, 1, 0, 0, 0, 0, 0, 1],
                      [0, 0, 0, 0, 0, 0, 0, 26]], dtype = 'object')

                      p1 = np.array([1,0,0,0,0,0,0,0])
                      p2 = np.array([1,0,0,0,0,0,0,0])

                      for i in range(1000):
                      p1 = p1.dot(m1)
                      p2 = p2.dot(m2)

                      #this shows that how often the two monkes have 1,2,3 letters correct vary
                      #but it doesn't affact past the 4th letter
                      #print(i,p2 - p1)

                      if p1[7] == p2[7]:
                      print("after {} steps, both monkeys have equal chances".format(i))
                      elif p2[7] > p1[7]:
                      print("crap, this whole arguments just got destroyed")
                      else:
                      print("after {} steps, coconut monkey has {} more ways to have won".format(i, p1[7] - p2[7]))

                      main()


                      I ran it up to:




                      after 99999 steps, both monkeys have equal chances




                      No matter the number of steps, both monkeys have equal chances. Even though one monkey is more likely to have exactly the last 1,2, or 3 letters correct.



                      (mind-boggling)






                      share|improve this answer























                        up vote
                        1
                        down vote










                        up vote
                        1
                        down vote









                        There's good arguments for both sides. Here's code that prove it definitely:



                        import numpy as np

                        def main():
                        m1 = np.array([
                        [25, 1, 0, 0, 0, 0, 0, 0],
                        [24, 1, 1, 0, 0, 0, 0, 0],
                        [25, 0, 0, 1, 0, 0, 0, 0],
                        [24, 1, 0, 0, 1, 0, 0, 0],
                        [24, 0, 0, 1, 0, 1, 0, 0],
                        [24, 1, 0, 0, 0, 0, 1, 0],
                        [24, 1, 0, 0, 0, 0, 0, 1],
                        [0, 0, 0, 0, 0, 0, 0, 26]], dtype='object')

                        m2 = np.array([
                        [25, 1, 0, 0, 0, 0, 0, 0],
                        [24, 1, 1, 0, 0, 0, 0, 0],
                        [24, 1, 0, 1, 0, 0, 0, 0],
                        [24, 1, 0, 0, 1, 0, 0, 0],
                        [24, 1, 0, 0, 0, 1, 0, 0],
                        [24, 1, 0, 0, 0, 0, 1, 0],
                        [24, 1, 0, 0, 0, 0, 0, 1],
                        [0, 0, 0, 0, 0, 0, 0, 26]], dtype = 'object')

                        p1 = np.array([1,0,0,0,0,0,0,0])
                        p2 = np.array([1,0,0,0,0,0,0,0])

                        for i in range(1000):
                        p1 = p1.dot(m1)
                        p2 = p2.dot(m2)

                        #this shows that how often the two monkes have 1,2,3 letters correct vary
                        #but it doesn't affact past the 4th letter
                        #print(i,p2 - p1)

                        if p1[7] == p2[7]:
                        print("after {} steps, both monkeys have equal chances".format(i))
                        elif p2[7] > p1[7]:
                        print("crap, this whole arguments just got destroyed")
                        else:
                        print("after {} steps, coconut monkey has {} more ways to have won".format(i, p1[7] - p2[7]))

                        main()


                        I ran it up to:




                        after 99999 steps, both monkeys have equal chances




                        No matter the number of steps, both monkeys have equal chances. Even though one monkey is more likely to have exactly the last 1,2, or 3 letters correct.



                        (mind-boggling)






                        share|improve this answer












                        There's good arguments for both sides. Here's code that prove it definitely:



                        import numpy as np

                        def main():
                        m1 = np.array([
                        [25, 1, 0, 0, 0, 0, 0, 0],
                        [24, 1, 1, 0, 0, 0, 0, 0],
                        [25, 0, 0, 1, 0, 0, 0, 0],
                        [24, 1, 0, 0, 1, 0, 0, 0],
                        [24, 0, 0, 1, 0, 1, 0, 0],
                        [24, 1, 0, 0, 0, 0, 1, 0],
                        [24, 1, 0, 0, 0, 0, 0, 1],
                        [0, 0, 0, 0, 0, 0, 0, 26]], dtype='object')

                        m2 = np.array([
                        [25, 1, 0, 0, 0, 0, 0, 0],
                        [24, 1, 1, 0, 0, 0, 0, 0],
                        [24, 1, 0, 1, 0, 0, 0, 0],
                        [24, 1, 0, 0, 1, 0, 0, 0],
                        [24, 1, 0, 0, 0, 1, 0, 0],
                        [24, 1, 0, 0, 0, 0, 1, 0],
                        [24, 1, 0, 0, 0, 0, 0, 1],
                        [0, 0, 0, 0, 0, 0, 0, 26]], dtype = 'object')

                        p1 = np.array([1,0,0,0,0,0,0,0])
                        p2 = np.array([1,0,0,0,0,0,0,0])

                        for i in range(1000):
                        p1 = p1.dot(m1)
                        p2 = p2.dot(m2)

                        #this shows that how often the two monkes have 1,2,3 letters correct vary
                        #but it doesn't affact past the 4th letter
                        #print(i,p2 - p1)

                        if p1[7] == p2[7]:
                        print("after {} steps, both monkeys have equal chances".format(i))
                        elif p2[7] > p1[7]:
                        print("crap, this whole arguments just got destroyed")
                        else:
                        print("after {} steps, coconut monkey has {} more ways to have won".format(i, p1[7] - p2[7]))

                        main()


                        I ran it up to:




                        after 99999 steps, both monkeys have equal chances




                        No matter the number of steps, both monkeys have equal chances. Even though one monkey is more likely to have exactly the last 1,2, or 3 letters correct.



                        (mind-boggling)







                        share|improve this answer












                        share|improve this answer



                        share|improve this answer










                        answered Nov 24 at 19:50









                        Jeffrey

                        1215




                        1215






















                            up vote
                            1
                            down vote













                            I have created the simulation in C# based on Marco's idea.
                            The alphabet is reduced and the performance is improved:



                            public class Monkey
                            {
                            private static readonly char alphabet = "ACNOTUXY".ToCharArray();
                            private static readonly Random random = new Random();

                            public Monkey(string Target)
                            {
                            target = Target.ToCharArray();
                            length = target.Length;
                            input = new char[length];
                            inputPosition = 0;
                            }

                            private readonly char target;
                            private readonly int length;
                            private readonly char input;
                            private int inputPosition;

                            public bool TargetReached { get; private set; }

                            public void PressKey()
                            {
                            var key = alphabet[random.Next(0, 8)];
                            input[inputPosition] = key;
                            inputPosition = ++inputPosition % length;
                            TargetReached = IsTargetReached();
                            }

                            private bool IsTargetReached()
                            {
                            for (var i = 0; i < length; i++)
                            {
                            if (input[i] != target[i]) return false;
                            }
                            return true;
                            }
                            }

                            public class Simulator
                            {
                            private int ties;
                            private int monkey1Wins;
                            private int monkey2Wins;

                            public void Run(int numberOfCompetitons)
                            {
                            for (var i = 0; i < numberOfCompetitons; i++)
                            {
                            RunCompetition();
                            }
                            }

                            private void RunCompetition()
                            {
                            var monkey1 = new Monkey("COCONUT");
                            var monkey2 = new Monkey("TUNOCOC");

                            while (true)
                            {
                            monkey1.PressKey();
                            monkey2.PressKey();

                            if (monkey1.TargetReached && monkey2.TargetReached)
                            {
                            ties++;
                            Console.WriteLine("It's a TIE!");
                            break;
                            }
                            if (monkey1.TargetReached)
                            {
                            monkey1Wins++;
                            Console.WriteLine("Monkey 1 WON!");
                            break;
                            }
                            if (monkey2.TargetReached)
                            {
                            monkey2Wins++;
                            Console.WriteLine("Monkey 2 WON!");
                            break;
                            }
                            }
                            }

                            public void ShowSummary()
                            {
                            Console.WriteLine();
                            Console.WriteLine($"{ties} TIES");
                            Console.WriteLine($
                            "Monkey 1: {monkey1Wins} WINS");
                            Console.WriteLine($"Monkey 2: {monkey2Wins} WINS");
                            }
                            }

                            internal class Program
                            {
                            private static void Main()
                            {
                            var simulator = new Simulator();
                            simulator.Run(1000);
                            simulator.ShowSummary();
                            Console.ReadKey();
                            }
                            }


                            Sample output:



                            ...
                            Monkey 1 WON!
                            Monkey 1 WON!
                            Monkey 2 WON!
                            Monkey 1 WON!
                            Monkey 2 WON!
                            Monkey 2 WON!
                            Monkey 1 WON!
                            Monkey 1 WON!
                            Monkey 2 WON!



                            0 TIES

                            Monkey 1: 498 WINS

                            Monkey 2: 502 WINS


                            So it seems pretty tied to me







                            share|improve this answer



























                              up vote
                              1
                              down vote













                              I have created the simulation in C# based on Marco's idea.
                              The alphabet is reduced and the performance is improved:



                              public class Monkey
                              {
                              private static readonly char alphabet = "ACNOTUXY".ToCharArray();
                              private static readonly Random random = new Random();

                              public Monkey(string Target)
                              {
                              target = Target.ToCharArray();
                              length = target.Length;
                              input = new char[length];
                              inputPosition = 0;
                              }

                              private readonly char target;
                              private readonly int length;
                              private readonly char input;
                              private int inputPosition;

                              public bool TargetReached { get; private set; }

                              public void PressKey()
                              {
                              var key = alphabet[random.Next(0, 8)];
                              input[inputPosition] = key;
                              inputPosition = ++inputPosition % length;
                              TargetReached = IsTargetReached();
                              }

                              private bool IsTargetReached()
                              {
                              for (var i = 0; i < length; i++)
                              {
                              if (input[i] != target[i]) return false;
                              }
                              return true;
                              }
                              }

                              public class Simulator
                              {
                              private int ties;
                              private int monkey1Wins;
                              private int monkey2Wins;

                              public void Run(int numberOfCompetitons)
                              {
                              for (var i = 0; i < numberOfCompetitons; i++)
                              {
                              RunCompetition();
                              }
                              }

                              private void RunCompetition()
                              {
                              var monkey1 = new Monkey("COCONUT");
                              var monkey2 = new Monkey("TUNOCOC");

                              while (true)
                              {
                              monkey1.PressKey();
                              monkey2.PressKey();

                              if (monkey1.TargetReached && monkey2.TargetReached)
                              {
                              ties++;
                              Console.WriteLine("It's a TIE!");
                              break;
                              }
                              if (monkey1.TargetReached)
                              {
                              monkey1Wins++;
                              Console.WriteLine("Monkey 1 WON!");
                              break;
                              }
                              if (monkey2.TargetReached)
                              {
                              monkey2Wins++;
                              Console.WriteLine("Monkey 2 WON!");
                              break;
                              }
                              }
                              }

                              public void ShowSummary()
                              {
                              Console.WriteLine();
                              Console.WriteLine($"{ties} TIES");
                              Console.WriteLine($
                              "Monkey 1: {monkey1Wins} WINS");
                              Console.WriteLine($"Monkey 2: {monkey2Wins} WINS");
                              }
                              }

                              internal class Program
                              {
                              private static void Main()
                              {
                              var simulator = new Simulator();
                              simulator.Run(1000);
                              simulator.ShowSummary();
                              Console.ReadKey();
                              }
                              }


                              Sample output:



                              ...
                              Monkey 1 WON!
                              Monkey 1 WON!
                              Monkey 2 WON!
                              Monkey 1 WON!
                              Monkey 2 WON!
                              Monkey 2 WON!
                              Monkey 1 WON!
                              Monkey 1 WON!
                              Monkey 2 WON!



                              0 TIES

                              Monkey 1: 498 WINS

                              Monkey 2: 502 WINS


                              So it seems pretty tied to me







                              share|improve this answer

























                                up vote
                                1
                                down vote










                                up vote
                                1
                                down vote









                                I have created the simulation in C# based on Marco's idea.
                                The alphabet is reduced and the performance is improved:



                                public class Monkey
                                {
                                private static readonly char alphabet = "ACNOTUXY".ToCharArray();
                                private static readonly Random random = new Random();

                                public Monkey(string Target)
                                {
                                target = Target.ToCharArray();
                                length = target.Length;
                                input = new char[length];
                                inputPosition = 0;
                                }

                                private readonly char target;
                                private readonly int length;
                                private readonly char input;
                                private int inputPosition;

                                public bool TargetReached { get; private set; }

                                public void PressKey()
                                {
                                var key = alphabet[random.Next(0, 8)];
                                input[inputPosition] = key;
                                inputPosition = ++inputPosition % length;
                                TargetReached = IsTargetReached();
                                }

                                private bool IsTargetReached()
                                {
                                for (var i = 0; i < length; i++)
                                {
                                if (input[i] != target[i]) return false;
                                }
                                return true;
                                }
                                }

                                public class Simulator
                                {
                                private int ties;
                                private int monkey1Wins;
                                private int monkey2Wins;

                                public void Run(int numberOfCompetitons)
                                {
                                for (var i = 0; i < numberOfCompetitons; i++)
                                {
                                RunCompetition();
                                }
                                }

                                private void RunCompetition()
                                {
                                var monkey1 = new Monkey("COCONUT");
                                var monkey2 = new Monkey("TUNOCOC");

                                while (true)
                                {
                                monkey1.PressKey();
                                monkey2.PressKey();

                                if (monkey1.TargetReached && monkey2.TargetReached)
                                {
                                ties++;
                                Console.WriteLine("It's a TIE!");
                                break;
                                }
                                if (monkey1.TargetReached)
                                {
                                monkey1Wins++;
                                Console.WriteLine("Monkey 1 WON!");
                                break;
                                }
                                if (monkey2.TargetReached)
                                {
                                monkey2Wins++;
                                Console.WriteLine("Monkey 2 WON!");
                                break;
                                }
                                }
                                }

                                public void ShowSummary()
                                {
                                Console.WriteLine();
                                Console.WriteLine($"{ties} TIES");
                                Console.WriteLine($
                                "Monkey 1: {monkey1Wins} WINS");
                                Console.WriteLine($"Monkey 2: {monkey2Wins} WINS");
                                }
                                }

                                internal class Program
                                {
                                private static void Main()
                                {
                                var simulator = new Simulator();
                                simulator.Run(1000);
                                simulator.ShowSummary();
                                Console.ReadKey();
                                }
                                }


                                Sample output:



                                ...
                                Monkey 1 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!
                                Monkey 2 WON!
                                Monkey 1 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!



                                0 TIES

                                Monkey 1: 498 WINS

                                Monkey 2: 502 WINS


                                So it seems pretty tied to me







                                share|improve this answer














                                I have created the simulation in C# based on Marco's idea.
                                The alphabet is reduced and the performance is improved:



                                public class Monkey
                                {
                                private static readonly char alphabet = "ACNOTUXY".ToCharArray();
                                private static readonly Random random = new Random();

                                public Monkey(string Target)
                                {
                                target = Target.ToCharArray();
                                length = target.Length;
                                input = new char[length];
                                inputPosition = 0;
                                }

                                private readonly char target;
                                private readonly int length;
                                private readonly char input;
                                private int inputPosition;

                                public bool TargetReached { get; private set; }

                                public void PressKey()
                                {
                                var key = alphabet[random.Next(0, 8)];
                                input[inputPosition] = key;
                                inputPosition = ++inputPosition % length;
                                TargetReached = IsTargetReached();
                                }

                                private bool IsTargetReached()
                                {
                                for (var i = 0; i < length; i++)
                                {
                                if (input[i] != target[i]) return false;
                                }
                                return true;
                                }
                                }

                                public class Simulator
                                {
                                private int ties;
                                private int monkey1Wins;
                                private int monkey2Wins;

                                public void Run(int numberOfCompetitons)
                                {
                                for (var i = 0; i < numberOfCompetitons; i++)
                                {
                                RunCompetition();
                                }
                                }

                                private void RunCompetition()
                                {
                                var monkey1 = new Monkey("COCONUT");
                                var monkey2 = new Monkey("TUNOCOC");

                                while (true)
                                {
                                monkey1.PressKey();
                                monkey2.PressKey();

                                if (monkey1.TargetReached && monkey2.TargetReached)
                                {
                                ties++;
                                Console.WriteLine("It's a TIE!");
                                break;
                                }
                                if (monkey1.TargetReached)
                                {
                                monkey1Wins++;
                                Console.WriteLine("Monkey 1 WON!");
                                break;
                                }
                                if (monkey2.TargetReached)
                                {
                                monkey2Wins++;
                                Console.WriteLine("Monkey 2 WON!");
                                break;
                                }
                                }
                                }

                                public void ShowSummary()
                                {
                                Console.WriteLine();
                                Console.WriteLine($"{ties} TIES");
                                Console.WriteLine($
                                "Monkey 1: {monkey1Wins} WINS");
                                Console.WriteLine($"Monkey 2: {monkey2Wins} WINS");
                                }
                                }

                                internal class Program
                                {
                                private static void Main()
                                {
                                var simulator = new Simulator();
                                simulator.Run(1000);
                                simulator.ShowSummary();
                                Console.ReadKey();
                                }
                                }


                                Sample output:



                                ...
                                Monkey 1 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!
                                Monkey 2 WON!
                                Monkey 1 WON!
                                Monkey 1 WON!
                                Monkey 2 WON!



                                0 TIES

                                Monkey 1: 498 WINS

                                Monkey 2: 502 WINS


                                So it seems pretty tied to me








                                share|improve this answer














                                share|improve this answer



                                share|improve this answer








                                edited Nov 25 at 10:19

























                                answered Nov 25 at 10:03









                                Dusan

                                1194




                                1194






















                                    up vote
                                    0
                                    down vote













                                    Monkey problem



                                    The crux the problem is the following: do repeated groups influence the expected characters needed to complete a sequence? The answer is




                                    No, repeated groups bear no influence on the result




                                    Let's reduce the problem to something similar - AAB vs ABC with only these letters being on the keyboard.



                                    The probability of obtaining AAB in a sequence is:




                                    starting at 1: P(A) * P(A) * P(B) = (1/3 * 1/3 * 1/3) +

                                    starting at 2: P(!A) * P(A) * P(A) * P(B) = 2/3 * (1/3 * 1/3 * 1/3) +

                                    starting at 3: P(!A) * P(!A) * P(A) * P(A) * P(B) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                    The probability of obtaining ABC in a sequence is:




                                    starting at 1: P(A) * P(B) * P(C) = (1/3 * 1/3 * 1/3) +

                                    starting at 2: P(!A) * P(A) * P(B) * P(C) = 2/3 * (1/3 * 1/3 * 1/3) +

                                    starting at 3: P(!A) * P(!A) * P(A) * P(B) * P(C) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                    This makes it easy to see that




                                    No matter how you shuffle the desired sequence around, the only thing that influences the probability of it occurring is the length of the sequence.




                                    Alternative proof:
                                    All of the possible sequences of 4 letters are:




                                    AAAA ABAA ACAA BAAA BBAA BCAA CAAA CBAA CCAA

                                    AAAB ABAB ACAB BAAB BBAB BCAB CAAB CBAB CCAB

                                    AAAC ABAC ACAC BAAC BBAC BCAC CAAC CBAC CCAC

                                    AABA ABBA ACBA BABA BBBA BCBA CABA CBBA CCBA

                                    AABB ABBB ACBB BABB BBBB BCBB CABB CBBB CCBB

                                    AABC ABBC ACBC BABC BBBC BCBC CABC CBBC CCBC

                                    AACA ABCA ACCA BACA BBCA BCCA CACA CBCA CCCA

                                    AACB ABCB ACCB BACB BBCB BCCB CACB CBCB CCCB

                                    AACC ABCC ACCC BACC BBCC BCCC CACC CBCC CCCC




                                    And here is the same table with the uninteresting sequences cut out:




                                    XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    AAAB XXXX XXXX BAAB XXXX XXXX CAAB XXXX XXXX

                                    XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    AABA XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    AABB XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    AABC XXXX XXXX BABC XXXX XXXX CABC XXXX XXXX

                                    XXXX ABCA XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    XXXX ABCB XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                    XXXX ABCC XXXX XXXX XXXX XXXX XXXX XXXX XXXX




                                    There are




                                    3 sequences where AAB starts at position 1: AABA, AABB, AABC

                                    3 sequences where AAB starts at position 2: AAAB, BAAB, CAAB

                                    3 sequences where ABC starts at position 1: ABCA, ABCB, ABCC

                                    3 sequences where ABC starts at position 2: AABC, BABC, CABC




                                    Thus the answer to the monkey problem is that




                                    both monkeys have the same chance of finishing first




                                    Squirrel problem



                                    Let's consider the string typed by the monkey after a long (infinite) time and make some simplifications that don't affect the outcome to understand the problem better.




                                    Replace any letter not in COCONUT with X

                                    Replace all groups of COC with A

                                    Replace all remaining C's with X

                                    Replace all remaining O's with B, N's with C, U's with D and T's with E




                                    The problem now is:




                                    in the new string will we more likely first find ABCDE or EDCBA?




                                    And the solution:




                                    If A, B, C, D and E had the same chances of occurring, the answer would be "both are equally likely", but C, D and E have normal chances of occurring, A has a very low chance of occurring, B has a slightly lower than normal chance.

                                    Considering a simplified case - 2 letters - P and Q with P occurring more often than Q, PQ has more chances of appearing first than QP.
                                    This means EDCBA is more likely to appear first than ABCDE.




                                    Looking at it from a different angle:




                                    When the unlikely event that A appears in the string, it is equally likely that it is followed by a B as it is that it is preceded by one.

                                    In the even more unlikely event that an AB or BA appears in the string, it is equally likely that it is followed by a C as it is that it is preceded by one.

                                    ...

                                    Which makes it equally likely that ABCDE and EDCBA appear in the string, so when an A appears in either of these sequences, it is equally likely that it starts the ABCDE sequence as it is that it ends the EDCBA sequence.




                                    Thus, the answer to the squirrel problem is that




                                    the second squirrel's sequence (TUNOCOC) appears first







                                    share|improve this answer

























                                      up vote
                                      0
                                      down vote













                                      Monkey problem



                                      The crux the problem is the following: do repeated groups influence the expected characters needed to complete a sequence? The answer is




                                      No, repeated groups bear no influence on the result




                                      Let's reduce the problem to something similar - AAB vs ABC with only these letters being on the keyboard.



                                      The probability of obtaining AAB in a sequence is:




                                      starting at 1: P(A) * P(A) * P(B) = (1/3 * 1/3 * 1/3) +

                                      starting at 2: P(!A) * P(A) * P(A) * P(B) = 2/3 * (1/3 * 1/3 * 1/3) +

                                      starting at 3: P(!A) * P(!A) * P(A) * P(A) * P(B) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                      The probability of obtaining ABC in a sequence is:




                                      starting at 1: P(A) * P(B) * P(C) = (1/3 * 1/3 * 1/3) +

                                      starting at 2: P(!A) * P(A) * P(B) * P(C) = 2/3 * (1/3 * 1/3 * 1/3) +

                                      starting at 3: P(!A) * P(!A) * P(A) * P(B) * P(C) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                      This makes it easy to see that




                                      No matter how you shuffle the desired sequence around, the only thing that influences the probability of it occurring is the length of the sequence.




                                      Alternative proof:
                                      All of the possible sequences of 4 letters are:




                                      AAAA ABAA ACAA BAAA BBAA BCAA CAAA CBAA CCAA

                                      AAAB ABAB ACAB BAAB BBAB BCAB CAAB CBAB CCAB

                                      AAAC ABAC ACAC BAAC BBAC BCAC CAAC CBAC CCAC

                                      AABA ABBA ACBA BABA BBBA BCBA CABA CBBA CCBA

                                      AABB ABBB ACBB BABB BBBB BCBB CABB CBBB CCBB

                                      AABC ABBC ACBC BABC BBBC BCBC CABC CBBC CCBC

                                      AACA ABCA ACCA BACA BBCA BCCA CACA CBCA CCCA

                                      AACB ABCB ACCB BACB BBCB BCCB CACB CBCB CCCB

                                      AACC ABCC ACCC BACC BBCC BCCC CACC CBCC CCCC




                                      And here is the same table with the uninteresting sequences cut out:




                                      XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      AAAB XXXX XXXX BAAB XXXX XXXX CAAB XXXX XXXX

                                      XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      AABA XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      AABB XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      AABC XXXX XXXX BABC XXXX XXXX CABC XXXX XXXX

                                      XXXX ABCA XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      XXXX ABCB XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                      XXXX ABCC XXXX XXXX XXXX XXXX XXXX XXXX XXXX




                                      There are




                                      3 sequences where AAB starts at position 1: AABA, AABB, AABC

                                      3 sequences where AAB starts at position 2: AAAB, BAAB, CAAB

                                      3 sequences where ABC starts at position 1: ABCA, ABCB, ABCC

                                      3 sequences where ABC starts at position 2: AABC, BABC, CABC




                                      Thus the answer to the monkey problem is that




                                      both monkeys have the same chance of finishing first




                                      Squirrel problem



                                      Let's consider the string typed by the monkey after a long (infinite) time and make some simplifications that don't affect the outcome to understand the problem better.




                                      Replace any letter not in COCONUT with X

                                      Replace all groups of COC with A

                                      Replace all remaining C's with X

                                      Replace all remaining O's with B, N's with C, U's with D and T's with E




                                      The problem now is:




                                      in the new string will we more likely first find ABCDE or EDCBA?




                                      And the solution:




                                      If A, B, C, D and E had the same chances of occurring, the answer would be "both are equally likely", but C, D and E have normal chances of occurring, A has a very low chance of occurring, B has a slightly lower than normal chance.

                                      Considering a simplified case - 2 letters - P and Q with P occurring more often than Q, PQ has more chances of appearing first than QP.
                                      This means EDCBA is more likely to appear first than ABCDE.




                                      Looking at it from a different angle:




                                      When the unlikely event that A appears in the string, it is equally likely that it is followed by a B as it is that it is preceded by one.

                                      In the even more unlikely event that an AB or BA appears in the string, it is equally likely that it is followed by a C as it is that it is preceded by one.

                                      ...

                                      Which makes it equally likely that ABCDE and EDCBA appear in the string, so when an A appears in either of these sequences, it is equally likely that it starts the ABCDE sequence as it is that it ends the EDCBA sequence.




                                      Thus, the answer to the squirrel problem is that




                                      the second squirrel's sequence (TUNOCOC) appears first







                                      share|improve this answer























                                        up vote
                                        0
                                        down vote










                                        up vote
                                        0
                                        down vote









                                        Monkey problem



                                        The crux the problem is the following: do repeated groups influence the expected characters needed to complete a sequence? The answer is




                                        No, repeated groups bear no influence on the result




                                        Let's reduce the problem to something similar - AAB vs ABC with only these letters being on the keyboard.



                                        The probability of obtaining AAB in a sequence is:




                                        starting at 1: P(A) * P(A) * P(B) = (1/3 * 1/3 * 1/3) +

                                        starting at 2: P(!A) * P(A) * P(A) * P(B) = 2/3 * (1/3 * 1/3 * 1/3) +

                                        starting at 3: P(!A) * P(!A) * P(A) * P(A) * P(B) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                        The probability of obtaining ABC in a sequence is:




                                        starting at 1: P(A) * P(B) * P(C) = (1/3 * 1/3 * 1/3) +

                                        starting at 2: P(!A) * P(A) * P(B) * P(C) = 2/3 * (1/3 * 1/3 * 1/3) +

                                        starting at 3: P(!A) * P(!A) * P(A) * P(B) * P(C) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                        This makes it easy to see that




                                        No matter how you shuffle the desired sequence around, the only thing that influences the probability of it occurring is the length of the sequence.




                                        Alternative proof:
                                        All of the possible sequences of 4 letters are:




                                        AAAA ABAA ACAA BAAA BBAA BCAA CAAA CBAA CCAA

                                        AAAB ABAB ACAB BAAB BBAB BCAB CAAB CBAB CCAB

                                        AAAC ABAC ACAC BAAC BBAC BCAC CAAC CBAC CCAC

                                        AABA ABBA ACBA BABA BBBA BCBA CABA CBBA CCBA

                                        AABB ABBB ACBB BABB BBBB BCBB CABB CBBB CCBB

                                        AABC ABBC ACBC BABC BBBC BCBC CABC CBBC CCBC

                                        AACA ABCA ACCA BACA BBCA BCCA CACA CBCA CCCA

                                        AACB ABCB ACCB BACB BBCB BCCB CACB CBCB CCCB

                                        AACC ABCC ACCC BACC BBCC BCCC CACC CBCC CCCC




                                        And here is the same table with the uninteresting sequences cut out:




                                        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AAAB XXXX XXXX BAAB XXXX XXXX CAAB XXXX XXXX

                                        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABA XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABB XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABC XXXX XXXX BABC XXXX XXXX CABC XXXX XXXX

                                        XXXX ABCA XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        XXXX ABCB XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        XXXX ABCC XXXX XXXX XXXX XXXX XXXX XXXX XXXX




                                        There are




                                        3 sequences where AAB starts at position 1: AABA, AABB, AABC

                                        3 sequences where AAB starts at position 2: AAAB, BAAB, CAAB

                                        3 sequences where ABC starts at position 1: ABCA, ABCB, ABCC

                                        3 sequences where ABC starts at position 2: AABC, BABC, CABC




                                        Thus the answer to the monkey problem is that




                                        both monkeys have the same chance of finishing first




                                        Squirrel problem



                                        Let's consider the string typed by the monkey after a long (infinite) time and make some simplifications that don't affect the outcome to understand the problem better.




                                        Replace any letter not in COCONUT with X

                                        Replace all groups of COC with A

                                        Replace all remaining C's with X

                                        Replace all remaining O's with B, N's with C, U's with D and T's with E




                                        The problem now is:




                                        in the new string will we more likely first find ABCDE or EDCBA?




                                        And the solution:




                                        If A, B, C, D and E had the same chances of occurring, the answer would be "both are equally likely", but C, D and E have normal chances of occurring, A has a very low chance of occurring, B has a slightly lower than normal chance.

                                        Considering a simplified case - 2 letters - P and Q with P occurring more often than Q, PQ has more chances of appearing first than QP.
                                        This means EDCBA is more likely to appear first than ABCDE.




                                        Looking at it from a different angle:




                                        When the unlikely event that A appears in the string, it is equally likely that it is followed by a B as it is that it is preceded by one.

                                        In the even more unlikely event that an AB or BA appears in the string, it is equally likely that it is followed by a C as it is that it is preceded by one.

                                        ...

                                        Which makes it equally likely that ABCDE and EDCBA appear in the string, so when an A appears in either of these sequences, it is equally likely that it starts the ABCDE sequence as it is that it ends the EDCBA sequence.




                                        Thus, the answer to the squirrel problem is that




                                        the second squirrel's sequence (TUNOCOC) appears first







                                        share|improve this answer












                                        Monkey problem



                                        The crux the problem is the following: do repeated groups influence the expected characters needed to complete a sequence? The answer is




                                        No, repeated groups bear no influence on the result




                                        Let's reduce the problem to something similar - AAB vs ABC with only these letters being on the keyboard.



                                        The probability of obtaining AAB in a sequence is:




                                        starting at 1: P(A) * P(A) * P(B) = (1/3 * 1/3 * 1/3) +

                                        starting at 2: P(!A) * P(A) * P(A) * P(B) = 2/3 * (1/3 * 1/3 * 1/3) +

                                        starting at 3: P(!A) * P(!A) * P(A) * P(A) * P(B) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                        The probability of obtaining ABC in a sequence is:




                                        starting at 1: P(A) * P(B) * P(C) = (1/3 * 1/3 * 1/3) +

                                        starting at 2: P(!A) * P(A) * P(B) * P(C) = 2/3 * (1/3 * 1/3 * 1/3) +

                                        starting at 3: P(!A) * P(!A) * P(A) * P(B) * P(C) = 2/3 * 2/3 * (1/3 * 1/3 * 1/3) + ...




                                        This makes it easy to see that




                                        No matter how you shuffle the desired sequence around, the only thing that influences the probability of it occurring is the length of the sequence.




                                        Alternative proof:
                                        All of the possible sequences of 4 letters are:




                                        AAAA ABAA ACAA BAAA BBAA BCAA CAAA CBAA CCAA

                                        AAAB ABAB ACAB BAAB BBAB BCAB CAAB CBAB CCAB

                                        AAAC ABAC ACAC BAAC BBAC BCAC CAAC CBAC CCAC

                                        AABA ABBA ACBA BABA BBBA BCBA CABA CBBA CCBA

                                        AABB ABBB ACBB BABB BBBB BCBB CABB CBBB CCBB

                                        AABC ABBC ACBC BABC BBBC BCBC CABC CBBC CCBC

                                        AACA ABCA ACCA BACA BBCA BCCA CACA CBCA CCCA

                                        AACB ABCB ACCB BACB BBCB BCCB CACB CBCB CCCB

                                        AACC ABCC ACCC BACC BBCC BCCC CACC CBCC CCCC




                                        And here is the same table with the uninteresting sequences cut out:




                                        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AAAB XXXX XXXX BAAB XXXX XXXX CAAB XXXX XXXX

                                        XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABA XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABB XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        AABC XXXX XXXX BABC XXXX XXXX CABC XXXX XXXX

                                        XXXX ABCA XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        XXXX ABCB XXXX XXXX XXXX XXXX XXXX XXXX XXXX

                                        XXXX ABCC XXXX XXXX XXXX XXXX XXXX XXXX XXXX




                                        There are




                                        3 sequences where AAB starts at position 1: AABA, AABB, AABC

                                        3 sequences where AAB starts at position 2: AAAB, BAAB, CAAB

                                        3 sequences where ABC starts at position 1: ABCA, ABCB, ABCC

                                        3 sequences where ABC starts at position 2: AABC, BABC, CABC




                                        Thus the answer to the monkey problem is that




                                        both monkeys have the same chance of finishing first




                                        Squirrel problem



                                        Let's consider the string typed by the monkey after a long (infinite) time and make some simplifications that don't affect the outcome to understand the problem better.




                                        Replace any letter not in COCONUT with X

                                        Replace all groups of COC with A

                                        Replace all remaining C's with X

                                        Replace all remaining O's with B, N's with C, U's with D and T's with E




                                        The problem now is:




                                        in the new string will we more likely first find ABCDE or EDCBA?




                                        And the solution:




                                        If A, B, C, D and E had the same chances of occurring, the answer would be "both are equally likely", but C, D and E have normal chances of occurring, A has a very low chance of occurring, B has a slightly lower than normal chance.

                                        Considering a simplified case - 2 letters - P and Q with P occurring more often than Q, PQ has more chances of appearing first than QP.
                                        This means EDCBA is more likely to appear first than ABCDE.




                                        Looking at it from a different angle:




                                        When the unlikely event that A appears in the string, it is equally likely that it is followed by a B as it is that it is preceded by one.

                                        In the even more unlikely event that an AB or BA appears in the string, it is equally likely that it is followed by a C as it is that it is preceded by one.

                                        ...

                                        Which makes it equally likely that ABCDE and EDCBA appear in the string, so when an A appears in either of these sequences, it is equally likely that it starts the ABCDE sequence as it is that it ends the EDCBA sequence.




                                        Thus, the answer to the squirrel problem is that




                                        the second squirrel's sequence (TUNOCOC) appears first








                                        share|improve this answer












                                        share|improve this answer



                                        share|improve this answer










                                        answered Nov 24 at 11:25









                                        Tibos

                                        58628




                                        58628






















                                            up vote
                                            0
                                            down vote













                                            Similar to Marco Salerno's answer, I wrote a computer program to solve the monkey problem.



                                            static void Coconuts(int iterations)
                                            {
                                            Tuple<double, double> outcomes = new Tuple<double, double>[iterations];
                                            char alphabet = new char { 'C', 'O', 'N', 'U', 'T' };
                                            string target1 = "COCONUT";
                                            string target2 = "TUNOCOC";
                                            for(int i = 0; i < iterations; i++)
                                            {
                                            outcomes[i] = new Tuple<double, double>(RandomLettersToTarget(alphabet, target1), RandomLettersToTarget(alphabet, target2));
                                            Console.WriteLine($"{outcomes[i].Item1:0.0}t{outcomes[i].Item2:0.0}");
                                            }

                                            double average1 = outcomes.Select(pair => pair.Item1).Average();
                                            double average2 = outcomes.Select(pair => pair.Item2).Average();

                                            Console.WriteLine($"Average durations are {average1:00000.0}, {average2:00000.0}");
                                            }

                                            static int RandomLettersToTarget(char alphabet, string target, int stepsToInsanity = 100000000)
                                            {
                                            string soFar = "";
                                            int steps = 0;
                                            while(steps < stepsToInsanity)
                                            {
                                            soFar += alphabet[rng.Next(0, alphabet.Length)];
                                            if(soFar.Length >= target.Length)
                                            {
                                            soFar = soFar.Substring(soFar.Length - target.Length, target.Length);
                                            }
                                            if(target.Equals(soFar))
                                            {
                                            return steps;
                                            }
                                            steps++;
                                            }
                                            return stepsToInsanity;
                                            }


                                            With the reduced alphabet (containing only 5 letters) this program does complete, and the results over thousands of iterations




                                            show no clear winner. Thus, I conclude that each monkey has equal probability of winning the first game.







                                            share|improve this answer























                                            • The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                              – Laurent LA RIZZA
                                              Nov 27 at 13:35















                                            up vote
                                            0
                                            down vote













                                            Similar to Marco Salerno's answer, I wrote a computer program to solve the monkey problem.



                                            static void Coconuts(int iterations)
                                            {
                                            Tuple<double, double> outcomes = new Tuple<double, double>[iterations];
                                            char alphabet = new char { 'C', 'O', 'N', 'U', 'T' };
                                            string target1 = "COCONUT";
                                            string target2 = "TUNOCOC";
                                            for(int i = 0; i < iterations; i++)
                                            {
                                            outcomes[i] = new Tuple<double, double>(RandomLettersToTarget(alphabet, target1), RandomLettersToTarget(alphabet, target2));
                                            Console.WriteLine($"{outcomes[i].Item1:0.0}t{outcomes[i].Item2:0.0}");
                                            }

                                            double average1 = outcomes.Select(pair => pair.Item1).Average();
                                            double average2 = outcomes.Select(pair => pair.Item2).Average();

                                            Console.WriteLine($"Average durations are {average1:00000.0}, {average2:00000.0}");
                                            }

                                            static int RandomLettersToTarget(char alphabet, string target, int stepsToInsanity = 100000000)
                                            {
                                            string soFar = "";
                                            int steps = 0;
                                            while(steps < stepsToInsanity)
                                            {
                                            soFar += alphabet[rng.Next(0, alphabet.Length)];
                                            if(soFar.Length >= target.Length)
                                            {
                                            soFar = soFar.Substring(soFar.Length - target.Length, target.Length);
                                            }
                                            if(target.Equals(soFar))
                                            {
                                            return steps;
                                            }
                                            steps++;
                                            }
                                            return stepsToInsanity;
                                            }


                                            With the reduced alphabet (containing only 5 letters) this program does complete, and the results over thousands of iterations




                                            show no clear winner. Thus, I conclude that each monkey has equal probability of winning the first game.







                                            share|improve this answer























                                            • The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                              – Laurent LA RIZZA
                                              Nov 27 at 13:35













                                            up vote
                                            0
                                            down vote










                                            up vote
                                            0
                                            down vote









                                            Similar to Marco Salerno's answer, I wrote a computer program to solve the monkey problem.



                                            static void Coconuts(int iterations)
                                            {
                                            Tuple<double, double> outcomes = new Tuple<double, double>[iterations];
                                            char alphabet = new char { 'C', 'O', 'N', 'U', 'T' };
                                            string target1 = "COCONUT";
                                            string target2 = "TUNOCOC";
                                            for(int i = 0; i < iterations; i++)
                                            {
                                            outcomes[i] = new Tuple<double, double>(RandomLettersToTarget(alphabet, target1), RandomLettersToTarget(alphabet, target2));
                                            Console.WriteLine($"{outcomes[i].Item1:0.0}t{outcomes[i].Item2:0.0}");
                                            }

                                            double average1 = outcomes.Select(pair => pair.Item1).Average();
                                            double average2 = outcomes.Select(pair => pair.Item2).Average();

                                            Console.WriteLine($"Average durations are {average1:00000.0}, {average2:00000.0}");
                                            }

                                            static int RandomLettersToTarget(char alphabet, string target, int stepsToInsanity = 100000000)
                                            {
                                            string soFar = "";
                                            int steps = 0;
                                            while(steps < stepsToInsanity)
                                            {
                                            soFar += alphabet[rng.Next(0, alphabet.Length)];
                                            if(soFar.Length >= target.Length)
                                            {
                                            soFar = soFar.Substring(soFar.Length - target.Length, target.Length);
                                            }
                                            if(target.Equals(soFar))
                                            {
                                            return steps;
                                            }
                                            steps++;
                                            }
                                            return stepsToInsanity;
                                            }


                                            With the reduced alphabet (containing only 5 letters) this program does complete, and the results over thousands of iterations




                                            show no clear winner. Thus, I conclude that each monkey has equal probability of winning the first game.







                                            share|improve this answer














                                            Similar to Marco Salerno's answer, I wrote a computer program to solve the monkey problem.



                                            static void Coconuts(int iterations)
                                            {
                                            Tuple<double, double> outcomes = new Tuple<double, double>[iterations];
                                            char alphabet = new char { 'C', 'O', 'N', 'U', 'T' };
                                            string target1 = "COCONUT";
                                            string target2 = "TUNOCOC";
                                            for(int i = 0; i < iterations; i++)
                                            {
                                            outcomes[i] = new Tuple<double, double>(RandomLettersToTarget(alphabet, target1), RandomLettersToTarget(alphabet, target2));
                                            Console.WriteLine($"{outcomes[i].Item1:0.0}t{outcomes[i].Item2:0.0}");
                                            }

                                            double average1 = outcomes.Select(pair => pair.Item1).Average();
                                            double average2 = outcomes.Select(pair => pair.Item2).Average();

                                            Console.WriteLine($"Average durations are {average1:00000.0}, {average2:00000.0}");
                                            }

                                            static int RandomLettersToTarget(char alphabet, string target, int stepsToInsanity = 100000000)
                                            {
                                            string soFar = "";
                                            int steps = 0;
                                            while(steps < stepsToInsanity)
                                            {
                                            soFar += alphabet[rng.Next(0, alphabet.Length)];
                                            if(soFar.Length >= target.Length)
                                            {
                                            soFar = soFar.Substring(soFar.Length - target.Length, target.Length);
                                            }
                                            if(target.Equals(soFar))
                                            {
                                            return steps;
                                            }
                                            steps++;
                                            }
                                            return stepsToInsanity;
                                            }


                                            With the reduced alphabet (containing only 5 letters) this program does complete, and the results over thousands of iterations




                                            show no clear winner. Thus, I conclude that each monkey has equal probability of winning the first game.








                                            share|improve this answer














                                            share|improve this answer



                                            share|improve this answer








                                            edited Nov 25 at 3:03

























                                            answered Nov 25 at 2:58









                                            Tim C

                                            76248




                                            76248












                                            • The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                              – Laurent LA RIZZA
                                              Nov 27 at 13:35


















                                            • The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                              – Laurent LA RIZZA
                                              Nov 27 at 13:35
















                                            The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                            – Laurent LA RIZZA
                                            Nov 27 at 13:35




                                            The difference in odds is so slight that a computer program will give you no reliable answer. See my answer for details.
                                            – Laurent LA RIZZA
                                            Nov 27 at 13:35










                                            up vote
                                            0
                                            down vote













                                            (a)




                                            The fact that two monkeys are typing is irrelevant to the study of the sequence. They're both typing on the same computer (see title), so there is only one sequence of letters. There is no bias, both monkeys have the exact same chance of having their word appear first. Winning, in the clear text, means "having my word come out first".




                                            We state obviously that, if less than $7$ letters are typed, both monkeys have equally $0$ chance of winning.



                                            Let's do some recurrence reasoning, trying to prove that if both monkeys have the same chance of winning after $n$ letters are typed, they have the same chance of winning after $n+1$ letters are typed.



                                            Let $n$ be greater than or equal to $6$. Given there is no winner after the $n$th letter is typed, and both monkeys have had the same chance of winning up until now, (which is trivially established for $n=6$) there are three cases:




                                            • the letters $n-5$ to $n$ say COCONU and the letter typed is T, monkey 1 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                            • the letters $n-5$ to $n$ say TUNOCO and the letter typed is C, monkey 2 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                            • all other cases, no winner. $26^7 - 2$ out of $26^7$


                                            And the monkeys have the same chance of winning upon the $n+1$th letter typed. This completes the proof by recurrence.




                                            So, reading the question correctly, the one expected to type more will be the one that stops last, assuming they type at the same frequency. There are two cases. Either the first word out is COCONUT, in which case, once monkey 1 stops typing, monkey 2 has T as a prefix for his word to come out. Or the first word out is TUNOCOC, in which case, once monkey 2 stops typing, monkey 1 has COC as a prefix for his word to come out. Therefore, monkey 2 is expected to have to type more than monkey 1 in this rare case, and then same as monkey 1 once he spoils his opportunity to finish the word.
                                            So, yes, monkey 2 is expected to type slightly more, due to these opportunities being unequal.




                                            (b)




                                            If "before" means "before in time", we are in the situation described in the clear text. Both squirrels have the same odds, if they stop watching after the first word appears. If they wait for both words to appear, well, this does not change anything, the word appearing first is the word appearing first.




                                            Whereas




                                            If "before" means "before in the lexical order", squirrel 1 wins, and the fact that a monkey is typing on a computer is irrelevant. This shows that squirrel 1 has better chances of winning anyway.




                                            Look, Ma! No Markov chains!






                                            share|improve this answer



























                                              up vote
                                              0
                                              down vote













                                              (a)




                                              The fact that two monkeys are typing is irrelevant to the study of the sequence. They're both typing on the same computer (see title), so there is only one sequence of letters. There is no bias, both monkeys have the exact same chance of having their word appear first. Winning, in the clear text, means "having my word come out first".




                                              We state obviously that, if less than $7$ letters are typed, both monkeys have equally $0$ chance of winning.



                                              Let's do some recurrence reasoning, trying to prove that if both monkeys have the same chance of winning after $n$ letters are typed, they have the same chance of winning after $n+1$ letters are typed.



                                              Let $n$ be greater than or equal to $6$. Given there is no winner after the $n$th letter is typed, and both monkeys have had the same chance of winning up until now, (which is trivially established for $n=6$) there are three cases:




                                              • the letters $n-5$ to $n$ say COCONU and the letter typed is T, monkey 1 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                              • the letters $n-5$ to $n$ say TUNOCO and the letter typed is C, monkey 2 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                              • all other cases, no winner. $26^7 - 2$ out of $26^7$


                                              And the monkeys have the same chance of winning upon the $n+1$th letter typed. This completes the proof by recurrence.




                                              So, reading the question correctly, the one expected to type more will be the one that stops last, assuming they type at the same frequency. There are two cases. Either the first word out is COCONUT, in which case, once monkey 1 stops typing, monkey 2 has T as a prefix for his word to come out. Or the first word out is TUNOCOC, in which case, once monkey 2 stops typing, monkey 1 has COC as a prefix for his word to come out. Therefore, monkey 2 is expected to have to type more than monkey 1 in this rare case, and then same as monkey 1 once he spoils his opportunity to finish the word.
                                              So, yes, monkey 2 is expected to type slightly more, due to these opportunities being unequal.




                                              (b)




                                              If "before" means "before in time", we are in the situation described in the clear text. Both squirrels have the same odds, if they stop watching after the first word appears. If they wait for both words to appear, well, this does not change anything, the word appearing first is the word appearing first.




                                              Whereas




                                              If "before" means "before in the lexical order", squirrel 1 wins, and the fact that a monkey is typing on a computer is irrelevant. This shows that squirrel 1 has better chances of winning anyway.




                                              Look, Ma! No Markov chains!






                                              share|improve this answer

























                                                up vote
                                                0
                                                down vote










                                                up vote
                                                0
                                                down vote









                                                (a)




                                                The fact that two monkeys are typing is irrelevant to the study of the sequence. They're both typing on the same computer (see title), so there is only one sequence of letters. There is no bias, both monkeys have the exact same chance of having their word appear first. Winning, in the clear text, means "having my word come out first".




                                                We state obviously that, if less than $7$ letters are typed, both monkeys have equally $0$ chance of winning.



                                                Let's do some recurrence reasoning, trying to prove that if both monkeys have the same chance of winning after $n$ letters are typed, they have the same chance of winning after $n+1$ letters are typed.



                                                Let $n$ be greater than or equal to $6$. Given there is no winner after the $n$th letter is typed, and both monkeys have had the same chance of winning up until now, (which is trivially established for $n=6$) there are three cases:




                                                • the letters $n-5$ to $n$ say COCONU and the letter typed is T, monkey 1 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                                • the letters $n-5$ to $n$ say TUNOCO and the letter typed is C, monkey 2 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                                • all other cases, no winner. $26^7 - 2$ out of $26^7$


                                                And the monkeys have the same chance of winning upon the $n+1$th letter typed. This completes the proof by recurrence.




                                                So, reading the question correctly, the one expected to type more will be the one that stops last, assuming they type at the same frequency. There are two cases. Either the first word out is COCONUT, in which case, once monkey 1 stops typing, monkey 2 has T as a prefix for his word to come out. Or the first word out is TUNOCOC, in which case, once monkey 2 stops typing, monkey 1 has COC as a prefix for his word to come out. Therefore, monkey 2 is expected to have to type more than monkey 1 in this rare case, and then same as monkey 1 once he spoils his opportunity to finish the word.
                                                So, yes, monkey 2 is expected to type slightly more, due to these opportunities being unequal.




                                                (b)




                                                If "before" means "before in time", we are in the situation described in the clear text. Both squirrels have the same odds, if they stop watching after the first word appears. If they wait for both words to appear, well, this does not change anything, the word appearing first is the word appearing first.




                                                Whereas




                                                If "before" means "before in the lexical order", squirrel 1 wins, and the fact that a monkey is typing on a computer is irrelevant. This shows that squirrel 1 has better chances of winning anyway.




                                                Look, Ma! No Markov chains!






                                                share|improve this answer














                                                (a)




                                                The fact that two monkeys are typing is irrelevant to the study of the sequence. They're both typing on the same computer (see title), so there is only one sequence of letters. There is no bias, both monkeys have the exact same chance of having their word appear first. Winning, in the clear text, means "having my word come out first".




                                                We state obviously that, if less than $7$ letters are typed, both monkeys have equally $0$ chance of winning.



                                                Let's do some recurrence reasoning, trying to prove that if both monkeys have the same chance of winning after $n$ letters are typed, they have the same chance of winning after $n+1$ letters are typed.



                                                Let $n$ be greater than or equal to $6$. Given there is no winner after the $n$th letter is typed, and both monkeys have had the same chance of winning up until now, (which is trivially established for $n=6$) there are three cases:




                                                • the letters $n-5$ to $n$ say COCONU and the letter typed is T, monkey 1 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                                • the letters $n-5$ to $n$ say TUNOCO and the letter typed is C, monkey 2 wins. $1$ chance out of $26^7$ ($26^6$ suffixes times $26$ possible letters)

                                                • all other cases, no winner. $26^7 - 2$ out of $26^7$


                                                And the monkeys have the same chance of winning upon the $n+1$th letter typed. This completes the proof by recurrence.




                                                So, reading the question correctly, the one expected to type more will be the one that stops last, assuming they type at the same frequency. There are two cases. Either the first word out is COCONUT, in which case, once monkey 1 stops typing, monkey 2 has T as a prefix for his word to come out. Or the first word out is TUNOCOC, in which case, once monkey 2 stops typing, monkey 1 has COC as a prefix for his word to come out. Therefore, monkey 2 is expected to have to type more than monkey 1 in this rare case, and then same as monkey 1 once he spoils his opportunity to finish the word.
                                                So, yes, monkey 2 is expected to type slightly more, due to these opportunities being unequal.




                                                (b)




                                                If "before" means "before in time", we are in the situation described in the clear text. Both squirrels have the same odds, if they stop watching after the first word appears. If they wait for both words to appear, well, this does not change anything, the word appearing first is the word appearing first.




                                                Whereas




                                                If "before" means "before in the lexical order", squirrel 1 wins, and the fact that a monkey is typing on a computer is irrelevant. This shows that squirrel 1 has better chances of winning anyway.




                                                Look, Ma! No Markov chains!







                                                share|improve this answer














                                                share|improve this answer



                                                share|improve this answer








                                                edited Nov 27 at 13:40

























                                                answered Nov 27 at 13:19









                                                Laurent LA RIZZA

                                                1415




                                                1415















                                                    Popular posts from this blog

                                                    Plaza Victoria

                                                    Puebla de Zaragoza

                                                    Musa