I decided to try OpenAI out. Having seen some examples of what it can do, I figured a reasonable test case is to see if the AI is able to prove that the composition of two cellular automata is a cellular automaton, with some hints. I used the abstract definition of a cellular automaton — a cellular automaton is a shift-commuting continuous map — because it makes the proof trivial if you know basic topology and basic algebra. Also, I wanted a proof that talks about two very different things to see if the AI can keep up with the context.
Getting OpenAI going took me only a few minutes, which surprised me as I’m not very good with computers. I just made an account and ran “python -m pip install openai” in random folders until I found the correct Python installation. Then I just copied the Q&A example from the website:
import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")
response = openai.Completion.create(
model="text-davinci-002",
prompt="...",
temperature=0,
max_tokens=100,
top_p=1,
frequency_penalty=0.0,
presence_penalty=0.0,
stop=["\n"]
)
and tried various prompts in place of “…”. I have no idea what all the other parameters mean. Below, I write some prompt-answer pairs in the following format:
My prompt, i.e. what I sent to OpenAI, is written in normal font, and
the AI then continues as it sees fit, and the response is shown in bold.
Often the AI produced its answer in many pieces, and I appended the response of the AI to my prompt and sent it again, and iterated until fixed point: if the AI thinks the prompt needs no additions, it gives an empty answer, and usually it satisfies itself eventually. A technical detail here is that my prompts were cut by newlines after about 80 characters, and the AI mimicked these cuts, usually sending about 80 characters back. I don’t know if it would have produced better results if I had fed in everything as one paragraph, it was just easier to edit things in the Python IDLE with the newlines. I.e. if it thinks it just needs to think 80 characters ahead, does it still think about what’ll come after? I have no idea.
My general strategy for the experiment was to write a proof, and then remove lines at the end and see if the AI can continue. I then try to humansplain what’s going on, and on some occasions try to make it explain what it’s thinking.
Let’s start by checking if the AI can finish off the proof if I give it the
shift-commutation part and leave out just the very end of the topology part.
A cellular automaton is a shift-commuting continuous function on the full shift $S^{\mathbb{Z}}$ consisting two-way infinite sequences over a finite alphabet $S$. Here is the proof that the function composition of two cellular automata is itself a cellular automaton: First, we check shift-commutation: if $f$ and $g$ are cellular automata, we of course have $f \circ g \circ \sigma = f \circ sigma \circ g = \sigma \circ f \circ g$. Second, for continuity we simply recall from basic topology that the composition of two continuous functions is continuous.
Some variations on multiple runs were “is again continuous.” and “is itself continuous.”
A technical note here: Above, I left the LaTeX uncompiled, to show how the AI sees the prompt. Once it turned out that the AI makes fewer LaTeX errors than me, I decided to compile most of the LaTeX in later examples. Unfortunately, WordPress does not allow LaTeX in bold, so you cannot tell which formulas are part of the prompt and which are written by the AI. This should always be clear from context, or I’ve added [formula by AI] before the formula.
The AI’s answer here is perfect. Does this show an understanding of topology, or even basic logic? I asked a human who’s taken some basic math courses if they could fill in the prompt. They said no, but once I told them the answer they said they definitely should’ve been able to deduce it.
I then tried spoonfeeding the AI little less:
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. But this is true because
and
are continuous.
I have omitted the definition and proof of shift-commutation and written […] instead, but I of course fed in the same stuff to the AI. It wrote out the definition correctly, but then realized it can’t easily do the math, so it just wrote that we’re done.
Let’s explore this, and try feeding it a proof, and see if it even recognizes we’re done if I write the entire proof. I hope I didn’t screw up the proof too bad, it’s a little hard to write with the mixed open sets and epsilons, but I went with the AI’s artistic choice.
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that if
then
. We of course have some
such that the
-ball around
is contained in
. Now we use the continuity of
to find open
containing
such that whenever
we have
. Namely, then whenever
, we have
so
which implies
.
Great. But let’s start stripping sentences from the end.
A cellular automaton is […] Second, for continuity we need to show that for any
and any
,
there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that if
then
. We of course have some
such that the
-ball around
is contained in
. Now we use the continuity of
to find open
containing
such that whenever
we have
. Namely, then [formula by AI]
we have
and hence
. So
as desired.
The AI was happy after finishing the first sentence, and I had to add “So” to make it finish the proof. Other than that, still going great. Let’s strip more:
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that if
then
. We of course have some
such that the
-ball around
is contained in
. Now we use the continuity of
to find open
containing
such that if
then
. We then have
for all
, so
for all
.
Here it writes the incorrect “
” instead of “
” (and I tried it a few times so it seems to mean it :P). If written by a human I would probably just assume it is a typo. Let’s strip more.
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that if
then
. We of course have some
such that the
-ball around
is contained in
. Now we use the continuity of
to find an open set
containing
such that if
then
. We then have
for all
, so
for all
.
Instead of “
for all
” it should of course write
for all
. Again it looks like a typo. If
were a homeomorphism this would be basically correct… Let’s strip more:
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that if
then
. We of course have some open set
containing
such that if
then
. Now we can take
to be the desired open set containing
such that if
then
.
This no longer looks meaningful. But let’s remove even more:
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first use the continuity of
to find an open set
containing
such that for all
,
. Then we use the continuity of
to find an open set
containing
such that for all
,
. Finally, we set
. Then for all
, we have
and
, so
.
Suddenly, we see what appears to be some triangle inequality pattern, which indeed can appear when you are considering two functions. It doesn’t make sense here though. This is starting to feel like abuse, but let’s force it to tell us the entire proof of continuity of composition from scratch, using the definition the AI chose.
A cellular automaton is […] Second, for continuity we need to show that for any
and any
, there is a neighborhood
of
such that for all
,
. To explain in more detail, suppose the
–
claim is true for
and
separately. We first show that it is true for
:
Errr… we show what is true for
? Reminds one of the famous “Show that
are isomorphic.” joke/story. The AI thinks it’s done at this point, but I add “We” at the end to prompt it to continue, and it continues as follows:
[…] show that it is true for
: We have
because
is continuous. So, if
, then
because
is continuous. So, it is enough to show that for any
and any
, there is a neighborhood
of
such that for all
,
.
Ok buddy, you tried your best.
I think the AI may been kind of unlucky here. It’s perhaps outputting line by line due to my ~80 character splitting, and it definitely makes sense that when it is asked to show continuity it writes out the epsilon-delta definition, or rather the epsilon-U definition.
And then, once you write the epsilon-U definition, you have put yourself in a position where you have to start doing basic topology. The AI clearly knows a lot of the relevant things but it doesn’t know how to do this hard a proof, so you just get a lot of nonsense out. A quite human interaction in a way…
<tangent> Side note: why does the AI write an epsilon-U definition? Well, using
kind of makes sense because
is not obviously a metric space. Indeed, I haven’t defined its topology to the AI, and in fact the proof goes through even if we use a non-metrizable topology, just that our cellular automata would not be the standard ones. But then why isn’t it using a
–
definition? Maybe the AI sometimes “thinks” cellular automata are functions from
to real numbers (given what happens in some of the wilder examples below). By the way, if you ask directly, it definitely knows the type signature of a cellular automaton, and more:
A cellular automaton is a shift-commuting continuous function on the full
shift
consisting two-way infinite sequences over a finite alphabet
. The image of a two-way infinite sequence under a cellular automaton is of course also a two-way infinite sequence. Or in notation, [formula by AI]
. In even more detail, this means that if
then [formula by AI]
for some
(see \cite{L}). Here, \cite{L} is the reference for the definition of cellular automata.
Note that I did not actually say what the type of a cellular automaton is: “a function on
” really only specifies the domain. In fact, human referees have complained to me sometimes that the codomain of a cellular automaton is not given, as I always forget this. I’m therefore glad to see it’s obvious even to OpenAI… Of course as the end of the paragraph shows, it even knows the combinatorial characterization of cellular automata (and even abuses notation like a champ). Of course it’s not deducing the characterization, even better, it cites the standard reference “\cite{L}“, which we of course all love, cherish, and have in our bookshelf.</tangent>
Anyway as mentioned, I think the AI was a little unlucky. At the outset (right after stating what we need to prove), we can simply add “simply” to clarify we just want a quick proof, and the AI delivers:
A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous.
This is perfect. But let’s try to get it to somehow indicate that this is something non-trivial that we simply know. I added “We” at the end, and get:
A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous.
We conclude that the composition of two cellular automata is a cellular automaton.
Well, yes, but not what I was looking for. Another attempt:
A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous. In more detail, let
and
be cellular automata. Let
and
. Since
is continuous, there is a
such that
whenever
. Since
is continuous, there is a
such that
whenever
. Then
whenever
.
\end{proof}
This is pretty much perfect. At the end the “\end{proof}” is kind of funny since it didn’t start a proof, but it’s doing this line by line so kind of makes sense: at some point it realizes it’s a long proof so it should be in a proof environment. It just didn’t realize it would be long when it started… Great stuff, though here it’s not clear to me that this is not simply plagiarized from somewhere. I mean, “composition of two continuous functions is continuous” is proved in many places, and the proof that was written is very textbook.
So let’s try renaming our cellular automata:
A cellular automaton is […] Second, for continuity we simply note that the composition of two continuous functions is continuous. In more detail, let
and
be cellular automata. Let
and
be two-way infinite sequences over a finite alphabet
. Let
. Since
is continuous, there exists
such that for all
with
, we have
. Since
is continuous, there exists
such that for all
with
, we have
. Now, for all
with
, we have
. Since
is continuous, we have
.
Looks great, apart from the last sentence, where the AI suddenly starts talking about shift-commutation again. Probably it’s just getting fed up with me.
Ok, what about the algebra half of the proof? I switch the order of the defining properties and now drop the algebra part:
A cellular automaton is a continuous shift-commuting function on
the full shift
consisting two-way infinite sequences over a finite alphabet
. Here is the proof that the function composition of two cellular automata is itself a cellular automaton: First, we check continuity: if
and
are cellular automata, we simply recall from basic topology that the composition of two continuous functions is continuous. For shift-commutation, writing
for the shift, we combine
and
to move the shift over
and
separately as follows: $\sigma \circ f \circ g = \sigma \circ (g \circ f) = (\sigma \circ g) \circ (\sigma \circ f) = g \circ (\sigma \circ f) = g \circ (f \circ \sigma) = (g \circ f) \circ \sigma = f \circ g \circ \sigma$.
As you can see, the response is ok except it randomly writes the formula
for distributivity in there. (I’ve left the LaTeX uncompiled here to show where my prompt ends and the answer starts, but it should compile fine.) Another attempt gives “
.” which is perfect.
Let’s start stripping.
A cellular automaton is […] For shift-commutation, writing
for the shift, we combine
and
to move the shift over
and [formula by AI]
.
It thinks it’s done. Very funny. Let’s ask for details:
A cellular automaton is […] For shift-commutation, writing
for the shift, we combine
and
to move the shift over
and
. Explicitly, [formula by AI] 
Looks perfect. Let’s try again:
A cellular automaton is […] For shift-commutation, writing
for the shift, we combine
and
to move the shift over
and
. More explicitly, we calculate
\begin{align*}
\sigma \circ (f \circ g) &= (\sigma \circ f) \circ g \
\text{(by definition of composition)} \
(f \circ \sigma) \circ g \
\text{(by shift-commutation of $f$)} \
f \circ (\sigma \circ g) \
\text{(by definition of composition)} \
f \circ (g \circ \sigma) \
\text{(by shift-commutation of $g$)} \
(f \circ g) \circ \sigma \
\text{(by definition of composition)} \
(f \circ g) \circ \sigma \
\text{(by definition of composition)} \
(f \circ g) \circ \sigma \
…
The AI seems to go into an infinite loop, but the proof is basically correct otherwise. If you add the \end, it compiles and even looks remotely reasonable when compiled (yet unreasonable enough that I can’t imagine it’s directly copied from anywhere). So this is also quite nice.
Let’s remove even more:
A cellular automaton is […] For shift-commutation, writing
for the shift, we combine $\sigma \circ f =
f \circ \sigma$ with $\sigma \circ g = g \circ \sigma$ to get
$\sigma \circ (f \circ g) = (f \circ g) \circ \sigma$.
Note that it produces from a word description of shift-commutation
the correct formulas, and then states the correct conclusion.
Now it did not do the actual algebra, but this argument is again quite passable.
A cellular automaton is […] For shift-commutation, writing
for the shift, we have

where the first and third equalities are by definition of cellular
automaton, the second and fourth equalities are by shift-commutation of
and
, respectively, and the last equality is by definition of composition.
This one is amazing. It writes the full computation, and at the end also explains all the steps. But it seems to be mostly garbage. Let’s ask it to explain
shift-commutation from scratch, and see if it does better.
A cellular automaton is […] For shift-commutation, we have to show that
is shift-equivalent to
for all
. This is a simple consequence of the fact that
and
are shift-commuting:
for all
and
.
This is of course nonsense, but interesting. Of course “
is shift-equivalent to
” is not what we should show. It would have been quite striking if it had named the shift and given an algebraic proof, and I did not expect that to happen. Instead it just writes a commutation type formula, which is really the only simple thing you can do without naming the shift, if you want to write something where the “types” are correct. And of course we need to show commutation of some kind, making this even more understandable.
The rest of the proof is just sort of random formulas. There is certainly some structure: the AI starts using subscripts with a
, and it remembers to quantify
afterward, and the formulas are indeed very shifty, although they do not quite parse.
Of course, the AI also uses the strange word “shift-equivalent“. This is a technical term in symbolic dynamics, and if the OpenAI is widely-read in this field, it would’ve seen “shift-equivalent” near objects like
. Is it using this term because it knows symbolic dynamics, or is it pure chance that it guessed a term like this? I am guessing the latter, but who knows. This term doesn’t really come up in the rest of the proof. Let’s press on this term, and ask what this means:
A cellular automaton is […] For shift-commutation, we have to show that
is shift-equivalent to
, where we recall that shift-equivalent means that there is a homeomorphism of
that maps
to
and commutes with the shift. To show this, we define the homeomorphism
by
. Then
is a homeomorphism of
that maps
to
and commutes with the shift. Thus, the composition of two cellular automata is a cellular automaton.
Wow. So a shift-equivalence between two points
,
means that a reversible CA maps
to
. Well, why not…
This is probably a good point to end the post. In conclusion, the AI seems to understand language and basic logic, i.e. what it’s supposed to prove at each point. It is also pretty good at remembering the context, knows basic algebra, and is good at LaTeX. It has also clearly read some cellular automata and topology literature. However, it is not able to make a nontrivial proof, and there are frequent hints that it does not really understand what’s going on. I don’t know what the takeaway is.