I decided to try OpenAI out. Having seen some examples of what it can do, I figured a reasonable test case is to see if the AI is able to prove that the composition of two cellular automata is a cellular automaton, with some hints. I used the abstract definition of a cellular automaton — a cellular automaton is a shift-commuting continuous map — because it makes the proof trivial if you know basic topology and basic algebra. Also, I wanted a proof that talks about two very different things to see if the AI can keep up with the context.

Getting OpenAI going took me only a few minutes, which surprised me as I’m not very good with computers. I just made an account and ran “python -m pip install openai” in random folders until I found the correct Python installation. Then I just copied the Q&A example from the website:

```
import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")
response = openai.Completion.create(
model="text-davinci-002",
prompt="...",
temperature=0,
max_tokens=100,
top_p=1,
frequency_penalty=0.0,
presence_penalty=0.0,
stop=["\n"]
)
```

and tried various prompts in place of “…”. I have no idea what all the other parameters mean. Below, I write some prompt-answer pairs in the following format:

My prompt, i.e. what I sent to OpenAI, is written in normal font, and

the AI then continues as it sees fit, and the response is shown in bold.

Often the AI produced its answer in many pieces, and I appended the response of the AI to my prompt and sent it again, and iterated until fixed point: if the AI thinks the prompt needs no additions, it gives an empty answer, and usually it satisfies itself eventually. A technical detail here is that my prompts were cut by newlines after about 80 characters, and the AI mimicked these cuts, usually sending about 80 characters back. I don’t know if it would have produced better results if I had fed in everything as one paragraph, it was just easier to edit things in the Python IDLE with the newlines. I.e. if it thinks it just needs to think 80 characters ahead, does it still think about what’ll come after? I have no idea.

My general strategy for the experiment was to write a proof, and then remove lines at the end and see if the AI can continue. I then try to humansplain what’s going on, and on some occasions try to make it explain what it’s thinking.

Let’s start by checking if the AI can finish off the proof if I give it the

shift-commutation part and leave out just the very end of the topology part.

A cellular automaton is a shift-commuting continuous function on the full shift $S^{\mathbb{Z}}$ consisting two-way infinite sequences over a finite alphabet $S$. Here is the proof that the function composition of two cellular automata is itself a cellular automaton: First, we check shift-commutation: if $f$ and $g$ are cellular automata, we of course have $f \circ g \circ \sigma = f \circ sigma \circ g = \sigma \circ f \circ g$. Second, for continuity we simply recall from basic topology that the composition of two continuous functions

is continuous.

Some variations on multiple runs were “**is again continuous.” **and** “is itself continuous**.”

A technical note here: Above, I left the LaTeX uncompiled, to show how the AI sees the prompt. Once it turned out that the AI makes fewer LaTeX errors than me, I decided to compile most of the LaTeX in later examples. Unfortunately, WordPress does not allow LaTeX in bold, so you cannot tell which formulas are part of the prompt and which are written by the AI. This should always be clear from context, or I’ve added [formula by AI] before the formula.

The AI’s answer here is perfect. Does this show an understanding of topology, or even basic logic? I asked a human who’s taken some basic math courses if they could fill in the prompt. They said no, but once I told them the answer they said they definitely should’ve been able to deduce it.

I then tried spoonfeeding the AI little less:

A cellular automaton is […] Second, for continuity

we need to show that for anyand any, there is a neighborhoodofsuch that for all,. But this is true becauseandare continuous.

I have omitted the definition and proof of shift-commutation and written […] instead, but I of course fed in the same stuff to the AI. It wrote out the definition correctly, but then realized it can’t easily do the math, so it just wrote that we’re done.

Let’s explore this, and try feeding it a proof, and see if it even recognizes we’re done if I write the entire proof. I hope I didn’t screw up the proof too bad, it’s a little hard to write with the mixed open sets and epsilons, but I went with the AI’s artistic choice.

A cellular automaton is […] Second, for continuity

we need to show that for anyand any, there is a neighborhoodofsuch that for all,.To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such that if then . We of course have some such that the -ball around is contained in . Now we use the continuity of to find open containing such that whenever we have . Namely, then whenever , we have so whichimplies.

Great. But let’s start stripping sentences from the end.

A cellular automaton is […] Second, for continuity

we need to show that for anyand any,

there is a neighborhoodofsuch that for all,.To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such that if then . We of course have some such that the -ball around is contained in . Now we use the continuity of to find open containing such that whenever we have . Namely, then [formula by AI]we haveand hence.Soas desired.

The AI was happy after finishing the first sentence, and I had to add “So” to make it finish the proof. Other than that, still going great. Let’s strip more:

A cellular automaton is […] Second, for continuity

we need to show that for anyand any, there is a neighborhoodofsuch that for all,.To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such that if then . We of course have some such that the -ball around is contained in . Now we use the continuity of to find open containing such thatifthen. We then havefor all, sofor all.

Here it writes the incorrect “” instead of “” (and I tried it a few times so it seems to mean it :P). If written by a human I would probably just assume it is a typo. Let’s strip more.

A cellular automaton is […] Second, for continuity we need to show that for any and any , there is a neighborhood of such that for all , . To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such that if then . We of course have some such that the -ball around is contained in .

Now we use the continuity ofto find an open setcontainingsuch that ifthen. We then havefor all, sofor all.

Instead of “ for all ” it should of course write for all . Again it looks like a typo. If were a homeomorphism this would be basically correct… Let’s strip more:

A cellular automaton is […] Second, for continuity we need to show that for any and any , there is a neighborhood of such that for all , . To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such that if then . We of course have some

open setcontainingsuch that ifthen. Now we can taketo be the desired open set containingsuch that ifthen.

This no longer looks meaningful. But let’s remove even more:

A cellular automaton is […] Second, for continuity we need to show that for any and any , there is a neighborhood of such that for all , . To explain in more detail, suppose the – claim is true for and separately. We first use the continuity of to find an open set containing such

that for all,. Then we use thecontinuity ofto find an open setcontainingsuch that for all,. Finally, we set. Then for all, we haveand, so.

Suddenly, we see what appears to be some triangle inequality pattern, which indeed can appear when you are considering two functions. It doesn’t make sense here though. This is starting to feel like abuse, but let’s force it to tell us the entire proof of continuity of composition from scratch, using the definition the AI chose.

A cellular automaton is […] Second, for continuity we need to show that for any and any , there is a neighborhood of such that for all , . To explain in more detail, suppose the – claim is true for and separately. We first

show that it is true for:

Errr… we show what is true for ? Reminds one of the famous “Show that are isomorphic.” joke/story. The AI thinks it’s done at this point, but I add “We” at the end to prompt it to continue, and it continues as follows:

[…]

show that it is true for:Wehavebecauseis continuous. So, if, thenbecauseis continuous. So, it is enough to show that for anyand any, there is a neighborhoodofsuch that for all,.

Ok buddy, you tried your best.

I think the AI may been kind of unlucky here. It’s perhaps outputting line by line due to my ~80 character splitting, and it definitely makes sense that when it is asked to show continuity it writes out the epsilon-delta definition, or rather the epsilon-U definition.

And then, once you write the epsilon-U definition, you have put yourself in a position where you have to start doing basic topology. The AI clearly knows a lot of the relevant things but it doesn’t know how to do this hard a proof, so you just get a lot of nonsense out. A quite human interaction in a way…

<tangent> Side note: why does the AI write an epsilon-U definition? Well, using kind of makes sense because is not obviously a metric space. Indeed, I haven’t defined its topology to the AI, and in fact the proof goes through even if we use a non-metrizable topology, just that our cellular automata would not be the standard ones. But then why isn’t it using a – definition? Maybe the AI sometimes “thinks” cellular automata are functions from to real numbers (given what happens in some of the wilder examples below). By the way, if you ask directly, it definitely knows the type signature of a cellular automaton, and more:

A cellular automaton is a shift-commuting continuous function on the full

shift consisting two-way infinite sequences over a finite alphabet . The image of a two-way infinite sequence under a cellular automaton is of coursealso a two-way infinite sequence.Or in notation, [formula by AI].In even more detail, this means that if then [formula by AI]for some(see \cite{L}).Here, \cite{L} is the referencefor the definition of cellular automata.

Note that I did not actually say what the type of a cellular automaton is: “a function on ” really only specifies the domain. In fact, human referees have complained to me sometimes that the codomain of a cellular automaton is not given, as I always forget this. I’m therefore glad to see it’s obvious even to OpenAI… Of course as the end of the paragraph shows, it even knows the combinatorial characterization of cellular automata (and even abuses notation like a champ). Of course it’s not deducing the characterization, even better, it cites the standard reference “**\cite{L}**“, which we of course all love, cherish, and have in our bookshelf.</tangent>

Anyway as mentioned, I think the AI was a little unlucky. At the outset (right after stating what we need to prove), we can simply add “simply” to clarify we just want a quick proof, and the AI delivers:

A cellular automaton is […] Second, for continuity we simply

note that the composition of two continuous functions is continuous.

This is perfect. But let’s try to get it to somehow indicate that this is something non-trivial that we simply know. I added “We” at the end, and get:

A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous.

Weconclude that the composition of two cellular automata is a cellular automaton.

Well, yes, but not what I was looking for. Another attempt:

A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous. In more detail,

letandbe cellular automata. Letand. Sinceis continuous, there is asuch thatwhenever. Sinceis continuous, there is asuch thatwhenever. Thenwhenever.

\end{proof}

This is pretty much perfect. At the end the “**\end{proof}**” is kind of funny since it didn’t start a proof, but it’s doing this line by line so kind of makes sense: at some point it realizes it’s a long proof so it should be in a proof environment. It just didn’t realize it would be long when it started… Great stuff, though here it’s not clear to me that this is not simply plagiarized from somewhere. I mean, “composition of two continuous functions is continuous” is proved in many places, and the proof that was written is very textbook.

So let’s try renaming our cellular automata:

A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous. In more detail,

letandbe cellular automata. Letandbe two-way infinite sequences over a finite alphabet. Let. Sinceis continuous, there existssuch that for allwith, we have. Sinceis continuous, there existssuch that for allwith, we have. Now, for allwith, we have. Sinceis continuous, we have.

Looks great, apart from the last sentence, where the AI suddenly starts talking about shift-commutation again. Probably it’s just getting fed up with me.

Ok, what about the algebra half of the proof? I switch the order of the defining properties and now drop the algebra part:

A cellular automaton is a continuous shift-commuting function on

the full shift consisting two-way infinite sequences over a finite alphabet . Here is the proof that the function composition of two cellular automata is itself a cellular automaton: First, we check continuity: if and are cellular automata, we simply recall from basic topology that the composition of two continuous functions is continuous. For shift-commutation, writing for the shift, we combine and to move the shift over and separately as follows: $\sigma \circ f \circ g =\sigma \circ (g \circ f) = (\sigma \circ g) \circ (\sigma \circ f) = g \circ (\sigma \circ f) = g \circ (f \circ \sigma) = (g \circ f) \circ \sigma = f \circ g \circ \sigma$.

As you can see, the response is ok except it randomly writes the formula

for distributivity in there. (I’ve left the LaTeX uncompiled here to show where my prompt ends and the answer starts, but it should compile fine.) Another attempt gives “.” which is perfect.

Let’s start stripping.

A cellular automaton is […] For shift-commutation, writing

for the shift, we combine and to move the shift over and [formula by AI].

It thinks it’s done. Very funny. Let’s ask for details:

A cellular automaton is […] For shift-commutation, writing

for the shift, we combine and to move the shift over and . Explicitly, [formula by AI]

Looks perfect. Let’s try again:

A cellular automaton is […] For shift-commutation, writing

for the shift, we combine and to move the shift over and . More explicitly, we calculate\begin{align*}

\sigma \circ (f \circ g) &= (\sigma \circ f) \circ g \

\text{(by definition of composition)} \

(f \circ \sigma) \circ g \

\text{(by shift-commutation of $f$)} \

f \circ (\sigma \circ g) \

\text{(by definition of composition)} \

f \circ (g \circ \sigma) \

\text{(by shift-commutation of $g$)} \

(f \circ g) \circ \sigma \

\text{(by definition of composition)} \

(f \circ g) \circ \sigma \

\text{(by definition of composition)} \

(f \circ g) \circ \sigma \

…

The AI seems to go into an infinite loop, but the proof is basically correct otherwise. If you add the \end, it compiles and even looks remotely reasonable when compiled (yet unreasonable enough that I can’t imagine it’s directly copied from anywhere). So this is also quite nice.

Let’s remove even more:

A cellular automaton is […] For shift-commutation, writing

for the shift, we combine $\sigma \circ f =f \circ \sigma$ with $\sigma \circ g = g \circ \sigma$ to get

$\sigma \circ (f \circ g) = (f \circ g) \circ \sigma$.

Note that it produces from a word description of shift-commutation

the correct formulas, and then states the correct conclusion.

Now it did not do the actual algebra, but this argument is again quite passable.

A cellular automaton is […] For shift-commutation, writing

for the shift, wehave

where the first and third equalities are by definition of cellular

automaton, the second and fourth equalities are by shift-commutation of and , respectively, and the last equality is by definition of composition.

This one is amazing. It writes the full computation, and at the end also explains all the steps. But it seems to be mostly garbage. Let’s ask it to explain

shift-commutation from scratch, and see if it does better.

A cellular automaton is […] For shift-commutation,

we have to show thatis shift-equivalent tofor all. This is a simple consequence of the fact thatandare shift-commuting:for alland.

This is of course nonsense, but interesting. Of course “ **is shift-equivalent to** ” is not what we should show. It would have been quite striking if it had named the shift and given an algebraic proof, and I did not expect that to happen. Instead it just writes a commutation type formula, which is really the only simple thing you can do without naming the shift, if you want to write something where the “types” are correct. And of course we need to show commutation of some kind, making this even more understandable.

The rest of the proof is just sort of random formulas. There is certainly some structure: the AI starts using subscripts with a , and it remembers to quantify afterward, and the formulas are indeed very shifty, although they do not quite parse.

Of course, the AI also uses the strange word “**shift-equivalent**“. This is a technical term in symbolic dynamics, and if the OpenAI is widely-read in this field, it would’ve seen “shift-equivalent” near objects like . Is it using this term because it knows symbolic dynamics, or is it pure chance that it guessed a term like this? I am guessing the latter, but who knows. This term doesn’t really come up in the rest of the proof. Let’s press on this term, and ask what this means:

A cellular automaton is […] For shift-commutation, we have to show that is shift-equivalent to , where we recall that shift-equivalent means

that there is a homeomorphism ofthat mapstoand commutes with the shift. To show this, we define the homeomorphismby. Thenis a homeomorphism ofthat mapstoand commutes with the shift. Thus, the composition of two cellular automata is a cellular automaton.

Wow. So a shift-equivalence between two points , means that a reversible CA maps to . Well, why not…

This is probably a good point to end the post. In conclusion, the AI seems to understand language and basic logic, i.e. what it’s supposed to prove at each point. It is also pretty good at remembering the context, knows basic algebra, and is good at LaTeX. It has also clearly read some cellular automata and topology literature. However, it is not able to make a nontrivial proof, and there are frequent hints that it does not really understand what’s going on. I don’t know what the takeaway is.