All Posts
Math Classes Should Be Less Useful
A Formal Introduction to Models of Computation in Coq (Part 3)
A Formal Introduction to Models of Computation in Coq (Part 2)
A Formal Introduction to Models of Computation in Coq (Part 1)
Why Prolog?

Why Prolog?

I've been interested in trying out different programming languages for a long time, trying out bits of every interesting language I hear about. Much of the code the I write for my personal projects is in Haskell or other functional languages (some of which are on my GitHub). But the latest language I've been exploring is Prolog, and so far, I've been enjoying it a lot —this website is written almost entirely in Prolog. This post is about why I think Prolog is a good programming language, and more generally, why declarative languages are better (for certain values of "better") than imperative ones. That's not to say it's without flaws or the best programming language, and I'll always be on the lookout for something better, but I think it's earned a place at the table.

So why Prolog? Because once you define your problem, you can explore its properties in ways that you simply cannot in other paradigms.

Note: you don't have to know Java, Haskell, or Prolog to appreciate this article (but you should probably know how to program).

Imperative vs. Declarative

As computers have become more powerful over the years, a greater emphasis has been put on writing code for humans rather than for computers. Over time, many of the low level details that programmers used to have to work about are now handled by computers, and in many cases, they do a better job than all but the best human programmers (and sometimes even better than them).

Instead of writing machine code directly, we can use assemblers to do some of the transformations for us—no need to memorize opcodes. Instead of manually allocating registers, we can use comparatively high level languages like C to say what variables we want and have the computer try to figure out the most efficient way to allocate registers. Instead of treating all values as just strings of bytes, we can have the computer track what types of values are to make sure our computations make sense. Instead of repeating lots of code, we can use procedures and functions and methods to build abstractions. Instead of writing code with gotos, we can use structured programming to organize our code. Instead of managing memory ourselves, we can use garbage collection, or have the compiler check our usage of variables so we don't cause any problems (à la Rust).

Declarative styles and languages are another step in this continuing journey to abstract low level details away. In this case, we are abstracting away execution order, at least as much as possible.

The fundamental difference between the imperative and declarative approaches is that imperative programs tell the computer how to do a task, and declarative programs define the problem. As an example, let's consider the problem of checking whether a list is a palindrome, that is, it reads the same forwards and backward. For example, [1,2,1] and [2,4,6,4,2] are palindromes, but [1,2,3,4,5] is not.

An imperative solution may look like this (this algorithm is fundamentally the same as that used here so as to prevent bias against imperative programming). In this case, I'm using Java, but this basic algorithm will work for any imperative language.

public <T> boolean isPalindrome(final List<T> list) {
    for (int i = 0; i < list.size() / 2; i++) {
        if (!list.get(i).equals(list.get(list.size() - i - 1))) {
            return false;
        }
    }

    return true;
}

If we write this in Prolog, we could write:

palindrome([]).
palindrome([_]).
palindrome([H|T]) :-
    append(Middle, [H], T),
    palindrome(Middle).

That is to say, an empty list and a single element list are both palindromes. Furthermore, a list is a palindrome if the first (H) and last thing are the same, and the middle of the list (i.e., without the first or last element) is a palindrome as well.

We can see that the imperative version specifies exactly what steps to take in order to determine if a given string is a palindrome. But the declarative version is different in some subtle and not-so-subtle ways. The declarative version defines what a palindrome is, and not how to test whether a list is a palindrome. We can see this even in the naming. A method called isPalindrome checks whether a list is a palindrome In Prolog, this definition is more useful than it may first appear and this will be explored further below.

Note that neither length of code nor speed are not important to this discussion, only the usefulness of abstractions.

As a disclaimer, it is important to note that we can't completely abstract over execution order. Among other reasons, the fact that most programming languages are Turing complete makes this impossible in general.

For example, we may have some nonterminating function f and some terminating function g. Suppose g x is false. If we wish to compute f x && g x, then the program will not terminate, but if we compute g x && f x, then it will. So in this case, we see that execution order does matter, even if we were to use a non-strict language like Haskell. This is impossible to resolve because it's not possible to determine whether a given function will terminate in general.

Motivating Examples

Let's consider the following examples of programs in all three paradigms (imperative OOP, functional, and logic), so that we can get a good understanding of the advantages of Prolog.

  • CSV Parsing
  • Palindromes
  • Factorial

The primary advantage is that when we define a problem properly, we can write translate these definitions into Prolog code that is clear and very powerful. Hopefully by the end of these examples you will see and agree.

CSV Parsing

We're starting out with this one because it's a Real World Problem and people seem to like those.

So the problem is fairly straightforward: taking some input (a CSV file) as a string, parse it into rows and columns. We'll just separate things into strings, rather than trying to parse stuff as numbers for simplicity's sake.

First, in Java:

public class CSVParser {
    private final List<List<String>> values = new ArrayList<>();

    public CSVParser(final String raw) {
        for (final String line : raw.split("\n")) {
            values.add(parseLine(line));
        }
    }

    private List<String> parseLine(final String line) {
        final List<String> values = new ArrayList<>();

        boolean quoted = false;

        StringBuilder value = new StringBuilder();

        for (final char c : line.toCharArray()) {
            if (c == '"') {
                if (quoted) {
                    values.add(value.toString());
                    value = new StringBuilder();
                    quoted = false;
                } else {
                    quoted = true;
                }
            } else if (c == ',' && !quoted) {
                values.add(value.toString());
                value = new StringBuilder();
            } else {
                value.append(c);
            }
        }

        return values;
    }

    public List<List<String>> values() {
        return values;
    }
}

This is a fairly straightforward, manually written parser, and hopefully is not to difficult to grasp how it works. In true imperative style, we provide an explicit list of steps for the computer to follow. But any human reading this code must decode it to figure out what it actually is trying to say—the opposite of writing code for humans. Obviously in the real world we would use a CSV parsing library and not write it ourselves, but this article is about solving a problem in a given paradigm, not with a library.

Now, in Haskell, using monadic parsing:

type Parser a = forall s m u. Stream s m Char => ParsecT s u m a

parseCSV :: String -> Either ParseError [[String]]
parseCSV = parse csvFile ""

csvFile :: Parser [[String]]
csvFile = csvRow `sepBy` newline

csvRow :: Parser [String]
csvRow = csvValue `sepBy` char ','

csvValue :: Parser String
csvValue = between (string "\"") (string "\"") (many (noneOf "\"")) <|>
           many (noneOf ",\n")

The syntax may be unfamiliar (you can basically ignore the first line), but by looking at the the definitions without the type signatures, we can see what they are trying to convey simply by reading them. A CSV file is a (list of) CSV rows separated by a newline, and CSV row is a list of values separated by commas (what CSV standards for). A value is either between quotes, in which case it is many characters, none of which are quotes, or it's not between quotes, in which case it's many characters, but not a comma.

And finally, in Prolog:

from_codes(Codes, Atom) :- atom_codes(Atom, Codes).

csv(Str, Rows) :-
    atom_codes(Str, Codes),
    phrase(csv_file(RawRows), Codes),
    maplist(maplist(from_codes), RawRows, Rows).

csv_file([Row|Rows]) -->
    csv_row(Row), "\n", csv_file(Rows);
    csv_row(Row), eos, { Rows = [] }.

csv_row([Val|Vals]) -->
    csv_val(Val), ",", csv_row(Vals);
    csv_val(Val), { Vals = [] }.

csv_val(Val) -->
    string_without("\n,\"", Val);
    "\"", string(Val), "\"".

This definition is very similar to the Haskell version. Ignoring the plumbing in the first two rules (from_codes/2 and csv/2), we can see that it similarly describes a CSV file, rather than the steps to parse one. Again, a CSV file is either a single CSV row, or it's followed by a new line and another CSV file with more rows, and so on.

The declarative versions are much more readable because they focus on the people reading them rather than the computer executing them. Furthermore, with the description provided by the Prolog version, we can reuse this code to write a version that let's us write as well as read CSV files.

All we have to do is change csv/2 to be:

csv(Str, Rows) :-
    nonvar(Str) -> % If Str has a value, read into Rows
        atom_codes(Str, Codes),
        phrase(csv_file(RawRows), Codes),
        maplist(maplist(from_codes), RawRows, Rows);

    % Otherwise, write into Str from Rows
    maplist(maplist(from_codes), RawRows, Rows),
    phrase(csv_file(RawRows), Codes),
    atom_codes(Str, Codes).

This is basically saying that if we provide a value for Str, then parse it, but if we don't, then we simply go through the exact same steps we use to parse a file, but backwards.

Example usage:

?- csv('words,2,bonjour\n"testing,comma",8,hello world', Rows).
Rows = [[words, '2', bonjour], ['testing,comma', '8', 'hello world']] .

?- csv(Str, [[1,2,3],['hello world','with,comma',912.2]]), writeln(Str).
1,2,3
hello world,"with,comma",912.2
Str = '1,2,3\nhello world,"with,comma",912.2' .

Note that when writing a csv, this same definition can produce all possible strings that will parse to the given rows. It also "knows" that it has to surround values that contain commas with quotes, even though we didn't tell it to do that—it's just a consequence of the definition. However, because of the evaluation model of Prolog, it won't actually do this unless you ask it to, so you don't have to worry about performance because of this.

Factorial

Usually we define this mathematically as:

$$n! = n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1$$

In an imperative language (C, C++, Java, C#, etc.):

int factorial(int n) {
    int result = 1;

    for (int i = 1; i <= n; i++) {
        result *= i;
    }

    return result;
}

In a declarative language (Haskell):

factorial 0 = 1
factorial n = n * factorial (n - 1)

Here is the program in Prolog:

factorial(0, 1).
factorial(N, Result) :-
    N1 #= N - 1,
    factorial(N1, F1),
    Result #= N * F1.

Now this style is more verbose than the Haskell version, but if you just imagine substituting in N - 1 for N1 and such, then you can see it's not really any different. But, as before, this function has some advantages over the Haskell (and imperative) versions. Specifically, while I can use it to calculate \(5!\) by writing factorial(5, F)., I can also use it to solve \(x! = 120\) by writing factorial(X, 120).

Final Thoughts

Ultimately, it is true that any decent language can be used to write basically any program. You can be a good programmer or a bad programmer in any language. But on the other hand, it's obvious that there's much more to a language than Turing completeness. Being Turing complete doesn't mean that a language is a good choice, and there are good and useful languages which are not Turing complete (see: Datalog or Idris with only total functions).

It is important to consider other factors when choosing a language for a project.

Last updated: 10.02.2019
Newer

This website is written in Prolog