Synechepedia

Notes on TLA+

WIP

There are notes compiled while learning the basics of TLA+.

NOTE: My initial notes here have a polemical tone. Please know that it is part of my learning process, and, until this notice is removed, any critical remarks here should be read as an exclamatory rather than an assertoric utterance. I don’t know enough yet to have a justified critical opinion.

My Initial Point of View

Over the last few years, I have spent some time reading about the constructivist tradition of formal verification, which is currently flourishing in the form of theorem provers/proof assistants and languages with increasingly expressive type-theoretic type systems. Prior to that, I had a years long infatuation with Prolog. It seems to be me that the approach in Prolog (and relational and constraint programming) is similar to the model checking approach: the user writes high-level declarations and an underlying algorithm explores the problem space to verify whether certain properties hold. This is in stark contrast to the constructivist approach, in which verification can only be achieved through a construction of the objective.

The constructivist approach appeals to me because it is both philosophically satisfying (Synechism implies Structuralism which – in my (admittedly post-structuralist) view – implies Constructivism1) and demonstrably effective in real world applications (see CompCert or the success of Rust).

However, the success and popularity of the model-checking approach should not be overlooked, and recent discussions with a prospective employer have given me good reason to learn the basics. So I’m digging in now, using the noted Resources and, and remarking what I learn and figure out along the way here.

Installation

This is a Linux-centric summary of the key steps mentioned by http://lamport.azurewebsites.net/tla/toolbox.html.

Prerequisites

A java runtime >= 1.8

For my system (Arch):

sudo pacman -Syu jdk-openjdk=13.0.1.u9-1

Download the release

Unzip to your desired installation directory

E.g.,

mkdir /opt/TLAToolbox/
sudo unzip TLAToolbox-1.6.0-linux.gtk.x86_64.zip -d /opt/TLAToolbox/

Run the executable

/opt/TLAToolbox/toolbox/toolbox

The TLA+ Toolbox GUI should launch

Fixing the colors (in Manjaro with a dark global theme)

I’m on Linux, specifically using the i3 edition of the Manjaro distribution (largely a wrapper around Arch). I opted for a dark global theme, which occasionally requires manually touching up the appearance configs of an application to fix dark-on-dark text. The TLAToolbox needed just such a touch up on my system. However, for whatever reason, the appearance configs for the TLAToolbox had no observable effect on the actual appearance. I ended up working around this by disabling the dark themeing in my launcher for executable.

I created the following shell script to launch the toolbox:

$ cat /usr/bin/tlaplus
# Need to override the dark themeing since the TLAToolbox is insufficently configurable
export GTK_THEME=Adwaita
exec /opt/TLAToolbox/toolbox/toolbox

After a sudo chmod +x /usr/bin/tlaplus, I can launch the toolbox by running tlaplus.

First Impressions

Syntax

Compared with the theorem proving languages (Coq, Agda, Lean), logic programming languages (Prolog, Datalog), and algebraic specification languages (CafeOBJ and Maude) I’ve been exposed to, the syntax seems quite clunky and appears to be pulled out of a grab bag of different symbol sets.

There are no fewer than 3 different syntaxes at play: TLC, PlusCal P, and PlusCal C. The latter 2 compile to the first and are apparently alternative surface syntaxes for the same underlying language. The tutorial I’m working through uses PlusCal P, so I’ll only address the syntax of TLC and PlusCal P.

TLC

Executable code must be enclosed in ASCII boilerplate matching this pattern:

-— MODULE <file_name> -— <program> ==

There must be at minimum four preceding - and four terminating =. Anything outside of this block will be ignored. Why? No explanation is given, so likely just an artifact of arbitrary decisions during earlier development.

TLC uses a mix of ASCII symbols (e.g., /\ and \/ for conjunction and disjunction), Latex syntax (e.g., \in for set membership), and KEYWORDS (e.g., SUBSET, IF, LET). Given that users are told they must install the (JVM based) TLAToolBox IDE to use the TLA+ system effectively, we might wonder why the IDE cannot support unicode, as can, e.g., Coq and Adga (without requiring you to install a bespoke JVM IDE).

Here’s an operation from an example spec that ships with the TLAToolbox:

Move(S,b) = /\ Cardinality(S) \in {1,2} /\ LET newThisBank = who_is_on_bank[b] \ S newOtherBank == who_is_on_bank[OtherBank(b)] ∪ S IN /\ IsSafe(newThisBank) /\ IsSafe(newOtherBank) /\ bank_of_boat’ = OtherBank(b) /\ who_is_on_bank’ = [i ∈ {“E”,“W”} |-> IF i = b THEN newThisBank ELSE newOtherBank]

Some oddities to note:

  • Whereas \foo is a LaTeX-like identifier (e.g., \in and \cup) the \ alone means set difference. If we are using \cup for union, why wouldn’t we use \setminus for difference? Presumably because \ is an icon for the unicode \[\setminus\]. Using the prefix for a class of identifiers as it’s own identifier instead of using the appropriate, assigned identifier from within that class seems like a very bad idea to me.
  • In most languages I’ve used, == indicates (some sort of) equality and = indicates assignment. That is reversed here.
  • The notation [x \in D |-> exp] is an anonymous “array/function” from x, which must be in the domain D, to exp.
  • Note the early appearance of stringly typed programming in the specification of the domain: {"E", "W"}. What happens if a value is supplied that is not in the domain? In my experiment this produced a runtime error with this helpful output: No error information.

PlusCal P

Why PlusCal?

In contrast to TLA+’s action-oriented focus on distributed systems, PlusCal most resembles an imperative programming language and is better-suited when specifying sequential algorithms. … PlusCal was designed to replace pseudocode, retaining its simplicity while providing a formally-defined and verifiable language

(wikipedia)

In other words, TLC seemes to be sufficiently difficult and unintuitive for imperative programmers to write that a toy imperative language was developed that transpiles to TLC. The idea is that users will implement their algorithms in PlusCal, transpile to TLC to verify them, then reimplement the algorithms in the target language.

Compare this with the functional approach, wherein the actual implementation language is designed to replace pseudocode and be formally verifiable. In this context, verification tools generally support extraction directly into executable programs.

Comments in your algorithms in your comments

PlusCal code must be written within a comment with a new kind of adhoc delimiter:

(* --algorithm algorithm_name
\* PlusCal code
algorithm; *)

Note that there is a comment syntax, \*, for PlusCal that works inside of the comment syntax for TLC, (* ... *).

Apparently the practice is to write your PlusCal code combedded in comments, and then execute the IDE function “Translate PlusCal Algorithm” which will elaborate the PlusCal code into generated TLC code in the very same file. This means that, for PlusCal, there is no sensible division between compiled build artifacts and source code.

Contex switch, syntax switch

In a PlusCal algorithm, variables are defined before the procedure:

variables x = 1, y \in {3, 4}, z = {3, 4};
begin
\* PlusCal code
end

Note these arbitrary differences from TLC:

  • = now means assignment (but we’re still using the latex symbol for set membersip)
  • statements are now terminated with a semicolon
  • keywords (i.e., variables, begin, end) are now lower case
Manually labeling your lines

Do you (or did you) miss the tedium of manually labeling your steps and using GOTOs for control flow? PlusCal has you covered.

IDE

  • Since it’s a JVM app, it didn’t initially integrate well with my OS, and I had to employ the workaround indicated above.
  • Since it’s a bespoke IDE environment
    • none of my configurations, bindings, or preferences from my usual programming setup are available
    • the commands, interface, and UI is all idiosyncratic, and unlike anything else I’ve used

Invariants have to be input via the GUI?

I hope you like programming with checkboxes and drop down menus! Apparently, in order to actually get TLA+ to check invariants, you have to navigate to the appropriate form in the “model” interface, click “Add”, and then enter your expression. Why wouldn’t the invariants be part of the source code?

Excerpts

In complex systems, simplicity isn’t achieved by coding tricks. It’s achieved by rigorous thinking above the code level. (Lamport, A High-Level View of TLA+)

(Note that the entire impetus beyond declarative programming is to make the executable code as clear and elegant for expressing and exploring rigorous thinking as possible.)

On Model Checking vs. theorem proving

https://wiki.eecs.yorku.ca/course_archive/2016-17/W/4315/_media/public:lecture24.pdf

  • Model checking is easier to write, but what can be verified is more limited, and it does less to sharpen an understanding of the domain
  • Theorem proving is harder, but anything expressible can be verified, and verification depends on synthesizing a (relatively) complete construction of the problem to be solved (we can only prove what we can construct).

Questions

  • Do the differences between the model checking and theorem proving approaches trace back to a difference between the model-theoretic vs. proof-theoretic approaches to logic?

Resources

Footnotes:

1

A quick search indicated that this view is likely contentious, and I should note here that I do not claim any authority or expertise on the philosophy of mathematics.