
In software engineering, it is important to ensure that a software system behaves correctly and reliably. This is especially crucial for critical systems, such as online banking, e-commerce, and real-time systems. One promising technique for verifying the properties of such systems is called proof scores, which uses a method called term rewriting.
A proof score is composed of declarations and rewritings such that if all components evaluate as desired, then the problem is solved. This method strikes a balance between automation and manual effort: machines handle routine tasks like substitution, simplification, and reduction, while humans focus on the most interesting tasks, such as deciding proof strategies. Additionally, even partially completed proofs can yield valuable feedback, often indicating what to try next.
This technique has been put into practice through algebraic specification languages, particularly the OBJ family, such as OBJ3, CafeOBJ, and Maude, which are designed to be executable via term rewriting. A key advantage of proof scores is that they use the same syntax and evaluation mechanisms as the language used to specify the system, making the verification process smooth and tightly integrated.
Hence, this method has been successfully applied to a wide range of systems and protocols. However, this method also has several disadvantages, which have limited it to mostly academic environments.
To understand this gap, a research team led by Professor Kazuhiro Ogata, along with Assistant Professor Duong Dinh Tran from the Japan Advanced Institute of Science and Technology (JAIST), conducted a study into the past, present, and future of proof scores. “Proof scores have proven their capability to verify that systems, including those we rely on every day, meet their designs.
“In this study, we analyze the past and present of proof scores to understand their current challenges and find ways to improve their applicability,” Prof. Ogata and Asst. Prof. Tran explain. Their study is published in the journal ACM Computing Surveys.
Proof scores were first proposed in the 1990s by the researcher Joseph A. Goguen. Since then, it has been implemented across multiple OBJ languages. In the study, the researchers explored the theoretical foundations of proof scores and analyzed their implementations in different OBJ languages.
The researchers also studied several cases where proof scores were successfully applied, including communication, authentication and e-commerce protocols, real-time systems, modern cryptographic protocols, in post-quantum cryptographic protocols, which are encryption methods designed to be secure against the upcoming powerful quantum computers.
This analysis revealed the strong points of proof scores. Most notably, the same syntax used to specify a system can also be used to prove the properties of the system. Unlike traditional theorem-proving methods, which can be highly abstract, this property of proof scores ensures that every step in the proof is grounded in the formal definition of the system, making the proof more transparent and accessible. Furthermore, proof scores are written as programs and, therefore, are as flexible as programs.
However, this analysis also revealed their main weak point, i.e., proof scores are programmed by humans, who must ensure that all possible cases have been addressed, making them subject to human errors. None of the previous implementations warned the users if a case had been missed, which is especially problematic with large proofs. This is one of the main reasons why proof scores have not been more widely adopted.
While proof assistants have been developed to address this weakness, they usually weaken the advantages of proof scores. However, there is one proof assistant called CiMPG for CafeOBJ, which also retains the merits of proof scores.
The researchers also highlighted other open issues, including the need for easier, human-readable proofs, accessible to a wider audience beyond researchers, as well as for more public libraries.
To solve these open issues, the researchers suggest that modern systems should provide an integrated development environment, like those used for popular programming languages, that would provide graphical, interactive support for writing and managing proof scores. They also suggest looking into the latest features of Maude.
“Proof scores will prove critical for emerging safety-critical systems that will shape our future society,” say the researchers. “From the communication protocols used in online banking and e-commerce to blockchain and post-quantum cryptography, their potential for creating reliable systems is significant.”
Overall, this study not only highlights the critical role of proof scores but also lays out a roadmap for making them more practical and widely accessible.
More information:
Adrián Riesco et al, Proof Scores: A Survey, ACM Computing Surveys (2025). DOI: 10.1145/3729166
Citation:
Making software safer and more reliable: A deep dive into proof scores (2025, May 7)
retrieved 7 May 2025
from
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
Leave a comment