Simulating a Universal Turing Machine in Google Sheets
2024-03-21 • 9 min read
In this article I will explain how I implemented a Universal Turing Machine in Google Sheets using formulas only. A Turing machine is an abstract machine that consists of a tape of infinite length divided into discrete cells, a read/write head that can move left or right along the tape, and a finite set of states. Each cell on the tape can hold a symbol from a finite alphabet, including a blank symbol, which we will be denoting with an underscore _. A Universal Turing Machine (UTM) is a Turing machine that is capable of simulating the behavior of any other Turing machine.
You can make a copy of the spreadsheet from here and play with it. The simulator is heavily inspired by this online simulator and most of the example programs are taken from there.
Demo
Below is a demo of the machine. The program is loaded into A1:A with the following notation:
- (A1) Name of the program - [name_of_the_program]
- (A2) Initial state - init: [initial_state]
- (A3) Accepting state - accept: [accepting_state]
- (A4#) δ - [cur_state] [read_symbol] [next_state] [write_symbol] [>|<|-]
Where δ denotes the transition function.
Accepted
In this example we have a Turing machine that accepts the language {a^n.b^n.c^n | n>0}, which are all the strings consisting of 'a's followed by 'b's followed by 'c's where the number of 'a's, 'b's and 'c's is the same.
Here's a more interesting machine that sums 2 decimal numbers. (The program is taken from here and adapted to match the notation)
Rejected
Here's an example where the machine rejects the input
Max steps reached
And here's an example where the machine reaches the maximum amount of steps
Formula
The formula that simulates the machine is the following:
1. =ARRAYFORMULA(LET(
2. input,IF(E15="","_",E15),
3. ini_pos_arr,SEQUENCE(LEN(input)),
4. ini_sym_arr,MID(input,ini_pos_arr,1),
5. transitions,SPLIT(A4:A," "),
6. IMPLODE,LAMBDA(arr,TEXTJOIN(CHAR(9),1,BYROW(IFNA(arr),LAMBDA(r,TEXTJOIN(CHAR(127),1,r))))),
7. EXPLODE,LAMBDA(str,SPLIT(TOCOL(SPLIT(str,CHAR(9))),CHAR(127))),
8. ø,ARRAY_CONSTRAIN(,,),
9. cur_state_arr,INDEX(transitions,,1),
10. read_sym_arr,INDEX(transitions,,2),
11. res,REDUCE(IMPLODE(HSTACK(ini_pos_arr,ini_sym_arr,REGEXEXTRACT(A2," (.+)"),1)),SEQUENCE(S15),LAMBDA(table_,_,LET(
12. table,EXPLODE(CHOOSEROWS(table_,-1)),
13. cur_pos_arr,INDEX(table,,1),cur_sym_arr,INDEX(table,,2),cur_state,INDEX(table,1,3),cur_pos,INDEX(table,1,4),
14. new,FILTER(transitions,EXACT(cur_state_arr,cur_state),EXACT(read_sym_arr,XLOOKUP(cur_pos,cur_pos_arr,cur_sym_arr))),
15. new_state,INDEX(new,1,3),new_sym,INDEX(new,1,4),new_pos,cur_pos+SWITCH(INDEX(new,1,5),">",1,"<",-1,),
16. new_pos_arr,VSTACK(IF(new_pos<MIN(cur_pos_arr),new_pos,ø),cur_pos_arr,IF(new_pos>MAX(cur_pos_arr),new_pos,ø)),
17. new_sym_arr,VSTACK(IF(new_pos<MIN(cur_pos_arr),"_",ø),IF(cur_pos_arr=cur_pos,new_sym,cur_sym_arr),IF(new_pos>MAX(cur_pos_arr),"_",ø)),
18. new_table,HSTACK(new_pos_arr,new_sym_arr,new_state,new_pos),
19. VSTACK(table_,IMPLODE(IF(ISNA(new_state),table,new_table)))))),
20. res))
Where E15 is the input, A4:A is the array of transitions, A2 is the cell containing the initial state and S15 is the maximum amount of steps
Explanation
Step 1 - Table initialization
The idea behind the implementation is very simple, to simulate the movement of the head along the tape we won't be actually moving the head but rather we will be moving the tape and keeping the head static, the reason we do this is because if we move the head there's a risk that we move it outside the bounds of the tape.
The simulation is done by first assigning a number to each cell on the tape where by convention the number 1 is assigned to the cell that the head is initially pointing to and all the other numbers are assigned sequentially. After this we create a table with two columns: position and symbol which simulate the current state of the tape. However, this is not sufficient because we also need to know what the current state is and which position the head is currently pointing to. So in addition to the Nx2 table (position, symbol), we also create a 1x2 table (current state, current position) and stack them together.
Here's an example of how the initialized table may look like:
The 'position' column is generated with the following formula:
SEQUENCE(LEN(input))
The 'symbol' column is generated with the following formula, which is the most common way of splitting a string into its characters:
MID(input,SEQUENCE(LEN(input)),1)
Then we have the 'current state' obtained with:
REGEXEXTRACT(A2," (.+)")
And finally we have the current position which is a hardcoded 1.
Step 2 - Get next transition
Once we have the initialized table, we have to get the next transition. So we need to retrieve from the transitions array A4:A the transition that corresponds to the current configuration. This is done using the state we are currently in and the symbol we are currently reading. The state we are currently in is easy to get as we have a value in our table specifically for that, in our case it's q0. To get the symbol we are currently reading, we need to look up the current position in the position array and return the corresponding value from the symbol array, in our case the current symbol is a.
Once we have the current state and the current symbol, we can retrieve the next transition from the transition array using the FILTER() function
FILTER(transitions,
EXACT(cur_state_arr,cur_state),
EXACT(read_sym_arr,XLOOKUP(cur_pos,cur_pos_arr,cur_sym_arr)))
Where cur_state_arr and read_sym_arr are respectively the first two column of the transitions array, cur_state is the current state, cur_pos is the current position and cur_pos_arr and cur_sym_arr are the position and symbol columns of our table. In a deterministic machine, this formula is guaranteed to return a single row.
In the image above the transition, which corresponds to the transition in A4, is splitted so we can more easily work with it.
Step 3 - Iteratively update the table
Now that we have the next transition, which we will call new, we can update our table, the new state is straightforward :
new_state ← INDEX(new,1,3)
The new position is obtained as follows:
new_pos ← cur_pos + SWITCH(INDEX(new,1,5), ">", 1, "<", -1, 0)
Essentially, we are telling the formula to check the shift, which is the 5th value from the transitions array, if it's > add 1 to the current position, if it's < subtract 1 from the current position, otherwise keep the current position unchanged.
The updated position and symbol arrays are a bit more tricky, the former is obtained with the following formula:
new_pos_arr ← VSTACK(
IF(new_pos<MIN(cur_pos_arr),new_pos,ø),
cur_pos_arr,
IF(new_pos>MAX(cur_pos_arr),new_pos,ø)
)
Where ø is a 0x0 pseudo-array generated with:
ARRAY_CONSTRAIN(,,)
This formula is checking if the new position is outside the bounds of the minimum or maximum position, if it is, it respectively stacks it on the top or on the bottom of the current positions array, if it's not, it simply returns the current positions array unchanged.
The updated symbol array is obtained in a similar way:
new_sym_arr ← VSTACK(
IF(new_pos<MIN(cur_pos_arr),"_",ø),
IF(cur_pos_arr=cur_pos,new_sym,cur_sym_arr),
IF(new_pos>MAX(cur_pos_arr),"_",ø)
)
Just like the case above, we are checking if the new position is outside the bounds of the minimum or maximum position, if it is, it respectively stacks an underscore _, which denotes an empty value, on the top or on the bottom of the current symbols array, if it's not, we don't stack anything on the current symbols array, we simply update it by replacing the current symbol with the new symbol.
IF(cur_pos_arr=cur_pos, new_sym, cur_sym_arr)
Here's how the updated table would look like after 1 iteration
And here's how it would look like after 8 iterations
The iterations are performed using the REDUCE() function.
Step 4 - Handle extra iterations and store previous iterations
The way the formula is developed doesn't automatically stop iterating after reaching an accepting state, but it keeps iterating the amount of times specified in max steps, this eventually leads the formula to error out when getting the next transition, because it tries to get the next transition from a state that is final. This issue is easily addressed using the ISNA() function.
IF(ISNA(new_state),table,new_table)
Which essentially means, if the new state of the updated table is an error, return the table unchanged, otherwise return the updated table.
Another issue we have is that REDUCE() by default only keeps the last itearation, however we want to be able to access all of them. We can do this by stacking the tables on top of each other but this is not a practical solution because a number of iterations as low as 300 could easily result into tens of thousands of rows of data. The way we solve this issue is by condensing each table into a 1x1 cell by joining it vertically and horizontally.
We can do this with all the iterations and stack them on top of each other
We can now get the table corresponding to an arbitrary iteration using a function like INDEX(), explode it and display it in a nicer format, which we will be doing in the next section
Visual representation
The formula used to visually represent the tape is the following:
1. =ARRAYFORMULA(LET(
2. EXPLODE,LAMBDA(str,SPLIT(TOCOL(SPLIT(str,CHAR(9))),CHAR(127))),
3. res,EXPLODE(INDEX(AD:AD,F7+1)),
4. SUBSTITUTE(IFNA(VLOOKUP(SEQUENCE(1,23,-10)+INDEX(res,1,4)-1,res,2,)),"_",)))
Where AD:AD is the range containing all the imploded tables and F7 is the current step. Since the position pointed to by the head at the current step may not be position 1, after retrieving the specified table and exploding it, the formula is normalizing the positions to simulate the head always pointing at position 1 then it's running a VLOOKUP() to get the symbols corresponding to their respective positions.
The counter is created using a technique called ghost values devised by Shay, a fellow Google Sheets wizard.
Upping the ante
If we want to go further, we can create a multi tape Turing machine. This is done by keeping track of the state of 2 or more tapes rather than 1, so we will need more than 1 table.
Here's an example of a Turing machine with 2 tapes that checks whether the given binary number is a palindrome:
And here's an example of a Turing machine with 3 tapes that performs binary addition: