
“In Computing, the Notion of Trust is Essential”

Producing bug-free software that does what it is supposed to do sometimes requires a miracle. Alongside testing, tools from formal logic and mathematics are used to guarantee reliability. Xavier Leroy, an expert in semantics, reflects on programming.


Computer programs, software, are essential for operating the machines that surround us; only a good program, verified and certified, will do what we ask of it. Xavier Leroy, a world-renowned specialist in semantics and compilation, is a champion of programming. For him, a program is a mathematical object, a kind of large definition about which theorems can be proved. The formal tools that make these mathematical proofs possible, and that guarantee the reliability of software, are mature; they are used in “critical” industries such as nuclear power and aeronautics. Elsewhere, for reasons of economy or through ignorance, the right practices are not always in place, leading to safety risks. The problem remains open for artificial intelligence software, which cannot yet be formally verified. Elected to the Collège de France last year, Xavier Leroy is calling for good programming practice to be extended, but also for caution.

La Recherche: Since 2018, you have held the second permanent chair in computer science at the Collège de France, after Gérard Berry in 2012. Could we say that computer science has finally become a respected science?

Xavier Leroy: Let’s just say that it is the start of respectability. It has taken some time, but computer science (CS) is a very young discipline. In the United States, a pioneer in the field, universities started opening CS departments in the 1950s. In France, it took twenty years longer, depending on the institution. The appeal was not the same either: from the 1990s, CS was considered a noble subject in the States, and it was mathematics students who were redirected towards CS; in France, it was the reverse. The Collège de France opened up to computer science initially through annual chairs (specifically the chair in Informatics and Digital Science), then with the chair of Gérard Berry and my own. There are still some steps to be taken, such as creating an "agrégation d'informatique" (teaching certification in informatics), which is still absent from the French national education system.

How can software, or a computer program, be defined?

Perhaps we should start by saying that hardware is nothing without a program. If you have a computer containing electronic circuits, nothing happens until you tell it what to do: how to chain calculations, when to read its inputs, how to combine them, how to produce a result. On a non-programmable calculator, the human decides everything. In a programmable computer, it is the software that decides. In this sense, a program differs from the notion of an algorithm: it is an algorithm that has been materialized in a form that enables the machine to run it. There is one important step from an algorithm to executable software: the implementation phase. And for that, you need a language.
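As a minimal editorial sketch (not part of the interview), here is this distinction made concrete: Euclid’s algorithm for the greatest common divisor, written as a small program in OCaml, the language discussed later in this interview. The algorithm is the mathematical idea; the function below is one possible implementation that a machine can run.

    (* Euclid's algorithm turned into executable code. *)
    let rec gcd a b =
      if b = 0 then a else gcd b (a mod b)

    let () = Printf.printf "gcd 12 18 = %d\n" (gcd 12 18)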

Why is there no universal computer language?

Just as with the Tower of Babel, there are many computer languages. Between 2,000 and 3,000 programming languages have been identified, compared with almost 7,000 natural languages. It is extraordinary that these numbers are comparable even though the first programming language is less than a century old! There are many reasons for this proliferation. First, it is often naively believed that a new language will solve new problems. Then there are schools of thought, what are called programming paradigms. In the 1990s, Java conquered enterprise computing, and Python was successful in teaching, the sciences and engineering. In the 2010s, the language Rust, developed by Mozilla, demonstrated how low-level programming could be done, close to the hardware and with a high degree of reliability.

You mention programming paradigms. What are they?

A programming paradigm is a way of approaching a problem. There are two major approaches: imperative and declarative. The imperative approach is the most intuitive and historically the first. It describes the program as a sequence of elementary computation steps that change the data stored in the computer’s memory, just as a cooking recipe describes the steps that transform the ingredients. All of the older languages (Basic, Fortran, Pascal) use this type of programming. It is also the case for Python today. Declarative programming tries to take a higher-level view. It tries to describe the solution in a more abstract, more mathematical way, explaining what the program must do without describing how it must do it step by step. Within these categories there are other dimensions, such as the mechanisms for structuring programs: functions, objects, classes, modules, communicating processes, etc.
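To make the two paradigms concrete, here is a small editorial sketch (not part of the interview) in OCaml: the same computation, the sum of the squares of a list of numbers, written first in an imperative style and then in a declarative, functional style.

    (* Imperative style: step-by-step updates of a mutable accumulator. *)
    let sum_of_squares_imperative numbers =
      let total = ref 0 in
      List.iter (fun n -> total := !total + n * n) numbers;
      !total

    (* Declarative style: state what the result is, not how to loop over the list. *)
    let sum_of_squares_declarative numbers =
      List.fold_left (fun acc n -> acc + n * n) 0 numbers

    let () =
      assert (sum_of_squares_imperative [1; 2; 3] = 14);
      assert (sum_of_squares_declarative [1; 2; 3] = 14)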

In 1996, you and your colleagues created the OCaml language, still widely used today. What are its features?

There were two generations of Caml before I wrote a simplified version, Caml Light, which saw some success in teaching. The current version, OCaml, is a sort of mature version of Caml Light. The main feature of these languages is their strong “type system”. Typing consists of classifying data according to their types (integers, character strings, arrays, functions, theorems, etc.). A strongly typed language forbids mixing types: it is a formal expression of the adage “You can’t add up apples and pears”. Typing eliminates numerous programming errors and offers a first level of software reliability. It also influences the design and structure of programs, and even the design of the programming languages themselves.
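As an editorial illustration (the apples-and-pears code is ours, not from the interview), here is what strong typing looks like in OCaml: mixing two types is rejected by the compiler before the program ever runs.

    (* Two distinct types: the compiler will not let them be confused. *)
    type apples = Apples of int
    type pears = Pears of int

    let add_apples (Apples a) (Apples b) = Apples (a + b)

    let _ = add_apples (Apples 2) (Apples 3)      (* accepted: 5 apples *)
    (* let _ = add_apples (Apples 2) (Pears 3) *)
    (* rejected at compile time: this expression has type pears
       but an expression was expected of type apples *)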

Do you talk about these aspects in your software science class?

The software sciences, in their most immediate sense, refer to software engineering, a rather empirical approach to industrial practices. But I believe the term covers something much wider: the formalisms, ideas and concepts that underlie programming languages, their semantics, the formal verification techniques for programs, and the links with mathematical logic. It is an entire area of computer science, on a par with (though less well known than) computability, complexity and algorithmics. I want to focus on these issues, which deal with the foundations of programming, the theory of languages and their importance today.

These foundations lead to the notion of trust and verification. What message are you trying to convey?

The notion of trust is fundamental in computing. In general, we need to be able to trust the programs we are using. How can we ensure that a program will do what it was designed to do? This is a crucial question for many critical applications – nuclear, aeronautics, etc. The traditional technique to ensure trust is based on tests, but it has reached its limits. In some areas such as aeronautics, we can conduct rigorous tests. But it is extremely costly. Engineers spend far more time developing tests and checking that everything has been properly tested than they do designing or writing the program… And the tests are not enough.

Can you give us an example?

The dismal failure of the first flight of Ariane 5, on 4 June 1996, is a good example of these limits. The flight program contained a software component that had worked on Ariane 4 and had been extensively tested. The engineers were so confident that they thought no additional tests were required when switching to version 5 of the rocket. But the conditions had changed: the acceleration was stronger. While everything was fine on Ariane 4, with the acceleration parameters of Ariane 5 an “arithmetic overflow” (a value higher than predicted) occurred and caused the program to crash. The rocket changed direction abruptly, which caused it to self-destruct.
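As an editorial sketch (the Ariane 5 software was written in Ada; this is only an illustration of the general phenomenon), here is what an arithmetic overflow looks like in OCaml, where native integers silently wrap around when a value exceeds the largest representable number.

    (* Overflow: one step past the largest integer wraps around to the smallest. *)
    let () =
      Printf.printf "max_int     = %d\n" max_int;
      Printf.printf "max_int + 1 = %d\n" (max_int + 1)   (* wraps to min_int *)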

How can we go beyond testing, then?

That is the purpose of formal verification methods: using deduction and calculation to establish properties that hold for all possible executions of a program, even though these executions are infinite in number and can never be tested exhaustively. It can thus be shown that a program does what it is supposed to do, or, more modestly, that it does nothing dangerous (no “crashes”, no security breaches). There is a whole continuum of techniques, starting with strongly typed languages of the Caml family. Beyond that, we can verify elementary properties such as the absence of arithmetic overflows or of out-of-bounds array accesses. Accessing an array outside its bounds is the technique used in half of all known security attacks today, so we favor relatively automated techniques that rule it out. This is the case for static analysis tools such as Astrée, developed at the ENS, or PolySpace, designed at Inria, which are used today in the aeronautics industry. More ambitiously, we can demonstrate mathematically that the behavior of a program complies with its specification. To conduct these demonstrations, we use specialized forms of mathematical logic, called “program logics”, and automatic theorem provers, integrated into verification environments such as Frama-C, developed at the CEA.
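To illustrate the out-of-bounds access mentioned above, here is a small editorial sketch (not from the interview) in OCaml, a memory-safe language in which every array access is checked: the mistake is detected and reported instead of silently corrupting memory, as it can in C.

    (* Bounds checking: reading one element past the end is caught at run time. *)
    let () =
      let a = [| 10; 20; 30 |] in
      Printf.printf "a.(2) = %d\n" a.(2);          (* last valid index *)
      try ignore a.(3)                             (* one past the end *)
      with Invalid_argument msg ->
        Printf.printf "caught out-of-bounds access: %s\n" msg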

Does it have practical applications?

In the field of software verification, at Airbus for example, engineers use formal verification tools in place of unit tests, to verify the libraries of functions used in electronic flight controls. Using these tools is faster than redoing all the tests and checking them, so in some fields of application there is an economic motivation. That is one incentive we can use. Another way to get these ideas into industry is, instead of programming first and checking afterwards, to program in domain-specific, higher-level languages that already build in some of the verifications.

How do you see the future of software verification?

The various fields of application of software come with different customs and practices. In the small world of highly critical software, best practices are known and applied. Producing high-assurance software takes the time and resources needed to apply these methods. This world includes aeronautics, rail, nuclear power and, we hope in the near future, medical equipment and essential car components. The cost of verification is high, but that is the price of safety. In a second group, comprising infrastructure, networking, server and web browser software, some static analyzers are starting to be used. Here, the programs are not developed formally, but still with great rigor. The code is enormous, sometimes millions of lines. Some companies, like Google, are committing considerable resources, have a good culture and are responsive when problems occur. Their principle of compartmentalization, so that an issue in one part of the code does not hinder the operation of the whole, is effective (a bug on a web page doesn’t cause everything to crash). Even Microsoft, whose past practices in this field were deplorable, is now at the leading edge of some areas of cybersecurity and distributes EverCrypt, a formally verified library of cryptographic primitives.

And bad practices?

Unfortunately, there are many programs designed with no care at all. I’m thinking of enterprise management software, many Web components and numerous mobile phone applications. What matters in that case is to isolate the execution of the application so that it cannot cause too much damage.

What about artificial intelligence?

That is an important part of the software being created today. Machine learning delivers extraordinary results on many problems, especially in perception. Software designed this way can recognize images effectively, play Go better than human champions, pilot driverless vehicles, and so on. This upsurge in artificial intelligence has unlocked problems that had no traditional algorithmic solution, and we are starting to apply it everywhere. On the one hand, this is fantastic for tackling new problems. On the other hand, it worries me, because we get “black box” systems that are easily tricked. For example, by altering a few carefully chosen pixels, an artificial intelligence can be made to decide that an image we obviously recognize as a dog is in fact an ostrich. We certainly feel the need to take this dimension into account, but in general we can only say “it works well on my set of test data”. In other words, we are still unable to prove general security or operating-safety properties of a system obtained by learning, such as a deep neural network.

Are you also worried about the driverless car?

Indeed I am. Driverless cars worry me; I wonder about the behavior of the software. Even if we try to reassure ourselves by noting that there are fewer errors than with human drivers, it is not the same thing to be killed by a human as by a machine. We need to be very careful here: driverless cars are still at the prototype stage, and all the aspects of coexistence between driverless cars and human drivers remain poorly understood. Unfortunately, I don’t believe that formal methods are of much help in this area. The software systems that collect all the data from the cameras, lidars and other sensors of driverless cars still defy formal verification methods. There is some hope on the control side: alongside the artificial intelligence that makes the decisions, there would be a far simpler (and verified) program that could decide to bring everything to a halt. This works in rail, where the systems are designed to stop the train when there is a problem. But for a plane, or a car driving in the outside lane of a motorway, this is not possible. Computers, programming and machine learning are wonderful things, but we need to remain modest and careful.

A FULLY VERIFIED COMPILER

In 2008, Xavier Leroy released a fully verified compiler called CompCert. “Compilation, the automatic translation by the machine of a program into executable code, has always fascinated me. The goal is to prove that a compiler is a good translator, one that does not alter the meaning”, he explains. He and his team worked in stages, first on toy compilers, then on a realistic compiler handling the whole of the C language. “CompCert is a success that shows you can prove large (almost 100,000 lines of code) and algorithmically complex programs”, he concludes.

HIGHLY COMPLEX PROGRAMS

Mobile phone app: 10,000 lines of code.
Age of Empires (online game): 1.2 million lines.
Firefox browser (Mozilla): 10 million lines.
Boeing 787 flight computer: 14 million lines.
Debian 5.0 (Linux distribution): 68 million lines.
Luxury car software: 100 million lines.
Google’s main codebase: 2 billion lines.

> INTERVIEWEE

A specialist in the semantics of languages, Xavier Leroy holds the chair of software science at the Collège de France.

1968 • Born in Orléans.
1992 • Ph.D. under the direction of Gérard Huet.
1996 • Invented the OCaml language with colleagues from the Inria Cristal team.
2007 • Monpetit prize from the French Academy of Sciences.
2008 • First version of the formally verified compiler CompCert.
2011 • La Recherche prize.
2016 • Milner and Van Wijngaarden prizes.
2018 • Inria-Académie des Sciences grand prix; professor at the Collège de France.
