Papers by Martin Ward
[Home]
[Talks]
[Publications]
[Software]
[G.K. Chesterton]
[GKC books]
[GKC pictures]
Click on any title to download a
gzipped
PostScript
file with the full text of the paper.
All papers also have PDF versions, click on the (PDF) link after the title.
Papers are listed in chronological order of publication,
so the most recent papers are at the end.
These works are licenced under a Creative Commons Licence.
Abstract
In this thesis we develop a theory of program refinement
and equivalence which can be used to develop practical tools for
program development, analysis and modification. The theory is based
on the use of general specifications and an imperative kernel language.
We use weakest preconditions, expressed as formulae in infinitary
logic to prove refinement and equivalence between programs.
The kernel language is extended by means of ``definitional
transformations'' which define new concepts in terms of those
already present. The extensions include practical programming constructs,
including recursive procedures, local variables, functions and
expressions with side-effects. This expands the language into a
``Wide Spectrum Language'' which covers the whole range of operations
from general specifications to assignments, jumps and labels.
We develop theorems for proving the termination of recursive and
iterative programs, transforming specifications into recursive programs
and transforming recursive procedures into iterative equivalents.
We develop a rigorous framework for reasoning about programs
with EXIT statements that terminate nested loops from within;
and this forms the basis for many efficiency-improving and
restructuring transformations. These are used as a tool for program
analysis and to derive algorithms by transforming their specifications.
We show that the methods of top-down design and program verification
using assertions can be viewed as the application of a small
subset of transformations.
Abstract
There has been much research in recent years on the problems of program and
system development but very little work has been done on the problems of
maintaining developed programs. This is despite the fact that for many years
it has been well-known that maintenance consumes the largest percentage of the
programming budget. We apply the techniques of program transformation to a
published program which was written in such a way that the structure and effect
of the program are very hard to discern. This will reveal the true structure
of the program and enable its effect to be summarised as a specification. The
method is language-independent and so can be used with a wide variety of
programming languages, the same method can be used to derive a program from a
specification, transform a program from one language to another, and (as
illustrated here), to transform a program into a specification.
A Knowledge-Based System
for Software Maintenance,
(PDF)
F. Callis, M. Khalil, M. Munro and M. Ward
Conference on Software Maintenance 1988, Phoenix, Arizona
October 24-27, 1988.
Abstract
A description of an Intelligent, Knowledge-Based maintenance tool, being
developed by the Centre for Software Maintenance at the University of Durham is
described. The tool is intended to help reduce the amount of time spent on
analysing code. Code analysis is performed when a programmer is familiarising
himself with a piece of code, and when the effects of a proposed modification
of the code is being assessed.
The Maintainer's Assistant
(PDF)
M. Ward, F. Callis and M. Munro
Conference on Software Maintenance 1989, Miami Beach, Florida,
October 16th-19th 1989.
Abstract
The Maintainer's Assistant is a code analysis tool aimed at helping the
maintenance programmer to understand and modify a given program. Program
transformation techniques are employed by the Maintainer's Assistant both to
derive a specification from a section of code and to transform a section of
code into a logically equivalent form. The general structure of the tool is
described and two examples of the application of program transformations are
given.
Abstract
We derive a hybrid sorting algorithm (a combination of Quicksort and insertion
sort which combines the best features of both algorithms) from a formal specification
of sorting via a sequence of correctness-preserving program transformations.
Abstract
In this paper we briefly introduce a Wide Spectrum Language and its
transformation theory and describe a recent success of the theory:
a general recursion removal theorem. This theorem includes as special
cases the two techniques discussed by Knuth and Bird. We describe
some applications of the theorem to cascade recursion,
binary cascade recursion, Gray codes, the Towers of Hanoi problem,
and an inverse engineering problem.
Abstract
This paper discusses how theoretical results from the field of program
transformations can be applied to develop a new approach to software reuse. We
describe a model for the semantics of nondeterministic programs and
specifications and use this model to show how refinements and transformations
of programs and specifications can be proved correct by reference to their
corresponding _Weakest_Preconditions_ expressed as formulae in infinitary first
order logic. We then show how this theory of program refinements and
transformations (which is further developed in [.Ward thesis.]) can be applied
to the construction of a repository of reusable components consisting of code,
specifications, documentation and development methods. These components are
linked together in such a way that specifications and their implementations can
be extracted easily.
Abstract
In this paper we briefly introduce a Wide Spectrum Language and its
transformation theory and describe a recent success of the theory: a general
recursion removal theorem. Recursion removal often forms an important step in
the systematic development of an algorithm from a formal specification. We use
semantic-preserving transformations to carry out such developments and the
theorem proves the correctness of many different classes of recursion removal.
This theorem includes as special cases the two techniques discussed by Knuth
[.Knuth goto.] and Bird [.Bird recursion removal.]. We describe some
applications of the theorem to cascade recursion, binary cascade recursion,
Gray codes, and an inverse engineering problem.
Abstracting a Specification from Code
(PDF)
M. Ward
Journal of Software Maintenance: Research and Practice,
Vol 5, 1993, pp 101-122
Abstract
Much of the work on developing program transformation systems has concentrated
on systems to assist in program development. However, the four separate
surveys carried out between 1977 and 1990, summarised in [.Foster.], show that
between 40% and 60% of all commercial software effort is devoted to software
maintenance rather than the development of new systems. In this paper we
describe a joint project between the University of Durham and CSM Ltd to
develop a method and tool for reverse engineering and software maintenance
based on program transformation theory. We present an example which
illustrates how such a tool can extract a high-level abstract specification
from the low-level source code of a program by a process of formal program
transformation based on a theory of program equivalence [.Ward thesis.]. All
the code-level reverse engineering of the example program was carried out on
the prototype tool with the resulting code pasted directly into the paper.
Abstract
This paper uses Ackerman's function as a testbed to illustrate
the operation of various program transformations which take recursive
procedures to equivalent iterative forms. The transformations are
taken from the author's DPhil thesis [.Ward89.]. In this paper we
illustrate that they can be successfully applied to even the most
convoluted recursion. For many programs a recursive function
is the most natural and clear specification while an iterative
(or tail-recursive) form is the most efficient implementation.
This paper illustrates how an efficient iterative program can
be developed and verified by starting with a simple recursive
program and using proven transformations to remove the recursion.
The resulting iterative program will be correct by construction,
so the problem of a direct verification of the iterative algorithm
is avoided. This process can also throw light on the nature
of the recursive specification. Several interesting properties
of Ackermann's function and the iterative algorithms are derived
in the course of this development.
A Practical Program Transformation System
(PDF)
M. Ward and K.H. Bennett
Working Conference on Reverse Engineering, 21st-22nd May 1993, Baltimore
Abstract
Program transformation systems provide one means of formally deriving a program
from its specification. The main advantage of this development method is that
the executable program is correct by construction. In this paper we describe a
tool called ReForm which is designed to address the inverse problem to support
the extraction of a specification from existing program code, using
transformations. This is an important activity during software maintenance.
One of the problems of transformation systems is the scarcity of practical
tools which can address industrial scale problems, rather than contrived
laboratory ``toy'' problems. The main contribution of this paper is an
analysis of the important software engineering factors that contribute to a
successful transformation based tool. Results from using the tool are also
presented.
Abstract
Reverse engineering of concurrent real-time programs with timing constraints is
a particularly challenging research area, because the functional behaviour of a
program, and the non-functional timing requirements, are implicit and can be
very difficult to discover. In this paper we present a significant advance in
this area, which is achieved by modelling real-time concurrent programs in the
wide spectrum language WSL. We show how a sequential program with interrupts
can be modelled in WSL, and the method is then extended to model more general
concurrent programs. We show how a program modelled in this way may
subsequently be ``inverse engineered'' by the use of formal program
transformations, to discover a specification for the program. (We use the term
``inverse engineering'' to mean ``reverse engineering achieved by formal
program transformations'').
Abstract
We describe a method for extracting high-level specifications
from unstructured source code. The method is based on a theory
of program refinement and transformation, which is used
as the bases for the development of a catalogue of powerful
semantics-preserving transformations. Each transformation is an operation
on a program which has a mechanically-checkable correctness condition,
and which has been rigorously proved to produce a semantically
equivalent result. The transformations are carried out in a wide
spectrum programming language (called WSL). This language includes
high-level specifications as well as low-level programming constructs.
As a result, the formal reverse engineering process (from source
code to equivalent specifications) and the redevelopment process
(refinement of specifications into source code) can both be carried
out within a single language and transformation theory.
We also discuss a tool (FermaT) which has been developed to support this
approach to reengineering. The tool is a program transformation system,
largely written in an extension to WSL called MetaWSL. Thus it is
possible to use the tool in the maintenance of its own source code,
and this is starting to be the case.
Abstract
In this paper we will take a detailed look at a larger example
of program analysis by transformation. We will be considering
Algorithm 2.3.3.A from Knuth's ``Fundamental Algorithms''
[.Knuth Fundamental Algorithms.] (P.357) which is an algorithm for
the addition of polynomials represented using four-directional links.
[.Knuth 74.] describes this as having ``a complicated structure
with excessively unrestrained \keyword{goto} statements''
and goes on to say ``I hope someday to see the algorithm
cleaned up without loss of its efficiency''. Our aim is to
manipulate the program, using semantics-preserving operations,
into an equivalent high-level specification. The transformations
are carried out in the WSL language, a ``wide spectrum language''
which includes both low-level program operations and high
level specifications, and which has been specifically designed
to be easy to transform.
Abstract
A backtracking algorithm with element order selection is presented,
and its efficiency discussed in relation both to standard examples
and to examples concerning relation preserving maps which the algorithm
was derived to solve.
The source code is available.
Abstract
This paper describes the concept of LANGUAGE ORIENTED PROGRAMMING
which is a novel way of organising the development of a large software
system, leading to a different structure for the finished product.
The approach starts by developing a formally specified,
domain-oriented, very high-level language which is designed
to be well-suited to developing ``this kind of program''. The
development process then splits into two independent stages:
(1) Implement the system using this ``middle level'' language, and
(2) Implement a compiler or translator or interpreter for
the language, using existing technology. The approach is claimed
to have advantages for domain analysis, rapid prototyping,
maintenance, portability, user-enhanceable systems, reuse of
development work, while also providing high development productivity.
We give an example where the method has been used very successfully
(in conjunction with rapid prototyping) in the development of a
large software system: the FermaT reverse engineering tool.
A major benefit of this approach to software development,
as compared to the usual sequential ``waterfall model''
is the speed with which products can be brought to market.
This is due to ``concurrent engineering'': the effective overlap
of development stages. Finally, the ``middle out'' development style
is compared and contrasted with the more usual ``top down'', ``bottom up''
and ``outside in'' development methods.
Abstract
Reverse engineering of interrupt-driven real-time programs with
timing constraints is a particularly challenging research area,
because the functional behaviour of a program, and the non-functional
timing requirements, are implicit and can be very difficult
to discover. However, in this paper we present a significant
advance in this area, which is achieved by modelling real-time
programs with interrupts in the wide spectrum language WSL. A
small example program is modelled in this way, and formal program
transformations are used to derive various timing constraints
and to ``inverse engineer'' a formal specification of the program.
(We use the term ``inverse engineering'' to mean ``reverse engineering
achieved by formal program transformations'').
Abstract
A wide spectrum language is presented, which is designed to facilitate
the proof of the correctness of refinements and transformations.
Two different proof methods are introduced and used to prove
some fundamental transformations, including a general induction
rule which enables transformations of recursive and iterative
programs to be proved by induction on their finite truncations.
A theorem for proving the correctness of recursive implementations
is presented, which provides a method for introducing a loop,
without requiring the user to provide a loop invariant. A powerful,
general purpose, transformation for removing or introducing recursion
is described and used in a case study in which we take a small,
but highly complex, program and apply formal transformations in order
to uncover an abstract specification of the behaviour of the program.
The transformation theory supports a transformation system, called FermaT,
in which the applicability conditions of each transformation
(and hence the correctness of the result) are mechanically verified.
These results together considerably simplify the construction
of viable program transformation tools; practical consequences
are briefly discussed.
Abstract
A method is described for obtaining useful information from legacy code.
The approach uses formal proven program transformations,
which preserve or refine the semantics of a construct while
changing its form. The applicability of a transformation in a
particular syntactic context is checked before application.
By using an appropriate sequence of transformations, the extracted
representation is guaranteed to be equivalent to the code.
In this paper, we focus on the results of using this approach
in the reverse engineering of medium scale, industrial software,
written mostly in languages such as assembler and JOVIAL.
Results from both benchmark algorithms and heavily modified,
geriatric software are summarised. It is concluded that the approach
is viable, for self contained code, and that useful design
information may be extracted from legacy systems at economic cost.
We conclude that formal methods have an important practical role
in the reverse engineering process.
Abstract
There is a vast collection of operational software systems
which are vitally important to their users, yet are becoming
increasingly difficult to maintain, enhance and keep up to date
with rapidly changing requirements. For many of these so
called LEGACY SYSTEMS the option of throwing the system away
an rewriting it from scratch is not economically viable.
Methods are therefore urgently required which enable these systems
to evolve in a controlled manner. The approach described in this
paper uses formal proven program transformations, which preserve
or refine the semantics of a program while changing its form.
These transformations are applied to restructure ans simplify
the legacy systems and to extract higherlevel representations.
By using an appropriate sequence of transformations, the extracted
representation is guaranteed to be equivalent to the code.
The method is based on a formal wide spectrum language, called WSL,
with accompanying formal method. Over the last ten years we
have developed a large catalogue of proven transformations,
together with mechanically verifiable applicability conditions.
These have been applied to many software development, reverse engineering
and maintenance problems.
Abstract
What does it mean to say that one program is "more abstract"
then another? What is "abstract" about an abstract data type?
What is the difference between a "high-level" program and a
"low-level" program? In this paper we attempt to answer these
questions by formally defining an abstraction relation between
programs which matches our intuitive ideas about abstraction.
The relation is based on examining the operational semantics
of the programs, expressed as a set of traces (sequences of states)
from a given initial state to a possible final state.
Note: This paper was updated on 9th April 1996
Abstract
In this paper we consider a particular class of algorithms which
present certain difficulties to formal verification. These are
algorithms which use a single data structure for two or more purposes,
which combine program control information with other data structures
or which are developed as a combination of a basic idea with an
implementation technique. Our approach is based on applying proven
semantics-preserving transformation rules in a wide spectrum language.
Starting with a set theoretical specification of ``reachability'' we
are able to derive iterative and recursive graph marking algorithms
using the ``pointer switching'' idea of Schorr and Waite. There have
been several proofs of correctness of the Schorr-Waite algorithm,
and a small number of transformational developments of the algorithm.
The great advantage of our approach is that we can derive the algorithm
from its specification using only general-purpose transformational rules:
without the need for complicated induction arguments.
Our approach applies equally well to several more complex algorithms
which make use of the pointer switching strategy, including a hybrid
algorithm which uses a fixed length stack, switching to the pointer
switching strategy when the stack runs out.
Abstract
This paper treats Knuth and Szwarcfiter's topological
sorting program, presented in their paper ``A Structured Program
to Generate All Topological Sorting Arrangements'', as a case
study for the analysis of a program by formal transformations.
This algorithm was selected for the case study because it is a
particularly challenging program for any reverse engineering method.
Besides a complex control flow, the program uses arrays to represent
various linked lists and sets, which are manipulated in various
``ingenious'' ways so as to squeeze the last ounce of performance
from the algorithm. Our aim is to manipulate the program, using
semantics-preserving operations, to produce an abstract specification.
The transformations are carried out in the WSL language, a ``wide
spectrum language'' which includes both low-level program operations
and high level specifications, and which has been specifically
designed to be easy to transform.
Abstract
The FermaT transformation system, based on research carried out over
the last twelve years at Durham University and Software Migrations Ltd.,
is an industrial-strength formal transformation engine with many
applications in program comprehension and language migration.
This paper describes one application of the system: the migration
of IBM 370 Assembler code to equivalent, maintainable C code.
We present an example of using the tool to migrate a small, but complex,
assembler module to C with no manual intervention required.
We briefly discuss a mass migration exercise where 1,925 assembler
modules were sucessfully migrated to C code.
Abstract
The transformation of a recursive program to an iterative
equivalent is a fundamental operation in Computer Science.
In the reverse direction, the task of reverse engineering
(analysing a given program in order to determine its specification)
can be greatly ameliorated if the program can be re-expressed
in a suitable recursive form. But the existing recursion removal
transformations, such as the techniques discussed by Knuth [.Knuth
goto.] and Bird [.Bird recursion removal.], can only be applied
in the reverse direction if the source program happens to match
the structure produced by a particular recursion removal operation.
In this paper we describe a much more powerful recursion removal
and introduction operation which describes its source and target
in the form of an action system (a collection of labels
and calls to labels). A simple, mechanical, restructuring operation
can be applied to a great many iterative programs which will
put them in a suitable form for recursion introduction.
Our transformation generates strictly more iterative versions than
the standard methods, including those of Knuth and Bird [.Knuth goto,
Bird recursion removal.]. With the aid of this theorem we prove a
(somewhat counterintuitive) result for programs that contain
sequences of two or more recursive calls: under a reasonable
commutativity condition, ``depth-first'' execution is more
general than ``breadth-first'' execution. In ``depth-first''
execution, the execution of each recursive call is completed,
including all sub-calls, before execution of the next call is started.
In ``breadth-first'' execution, each recursive call in the sequence
is partially executed but any sub-calls are temporarily postponed.
This result means that any breadth-first program can be reimplemented
as a corresponding depth-first program, but the converse does not hold.
We also treat the case of ``random-first'' execution, where the execution
order is implementation dependent. For the more restricted domain of tree
searching we show that breadth first search is the most general form.
We also give two examples of recursion introduction as an aid
to formal reverse engineering.
Abstract
The FermaT transformation system, based on research carried out over
the last sixteen years at Durham University, De Montfort University
and Software Migrations Ltd., is an industrial-strength formal
transformation engine with many applications in program comprehension
and language migration. This paper is a case study which uses
automated plus manually-directed transformations and abstractions
to convert an IBM 370 Assembler code program into a very
high-level abstract specification.
The FermaT Assembler Re-engineering
Workbench (PDF)
M. Ward
International Conference on Software Maintenance 2001,
6th-9th November 2001, Florence, Italy
IEEE Computer Society
Abstract
Research into the working practices of software engineers has shown the
need for integrated browsing and searching tools which include graphical
visualisations linked back to the source code under investigation.
In addition, for assembler maintenance and re-engineering there
is an even greater need for sophisticated control flow analysis,
data flow analysis, slicing and migration technology.
All these technologies are provided by the FermaT Workbench:
an industrial-strength assembler re-engineering workbench consisting
of a number of integrated tools for program comprehension, migration and
re-engineering. The various program analysis and migrations tools
are based on research carried out over the last sixteen years at
Durham University, De Montfort University and Software Migrations Ltd.,
and make extensive use of program transformation theory.
Abstract
In this paper we give a brief introduction to the foundations of WSL
transformation theory and describe some applications to program slicing.
We introduce some generalisations of traditional slicing, amorphous
slicing and conditioned slicing, which are possible in the framework
of WSL transformations. One generalisation is ``semantic slicing''
which combines slicing and abstraction to a specification.
ConSUS:
A Scalable Approach to Conditioned Slicing
(PDF)
M. Daoudi, L. Ouarbya, J. Howroyd, S. Danicic, Mark Harman,
Chris Fox, M. P. Ward
9th IEEE Working Conference on Reverse Engineering, 2002.
October 28 - November 1, 2002 Richmond, Virginia, USA
IEEE Computer Society
Abstract
Conditioned slicing can be applied to reverse engineering problems
which involve the extraction of executable fragments of code in the
context of some criteria of interest. This paper introduces ConSUS,
a conditioner for the Wide Spectrum Language, WSL. The symbolic
executor of ConSUS prunes the symbolic execution paths, and its
predicate reasoning system uses the FermaT Simplify
transformation in place of a more conventional theorem prover.
We show that this combination of pruning and simplification-as-reasoner
leads to a more scalable approach to conditioning.
Program Slicing
via FermaT Transformations
(PDF)
M. Ward
COMPSAC 2002,
26th Annual International Computer Software and Applications Conference
Oxford, England, 26th-29th August 2002
IEEE Computer Society
Abstract
In this paper we describe how the concept of program
slicing can be formalised in WSL transformation theory.
This formalism naturally lends itself to several generalisations
including amorphous slicing and conditioned slicing.
One novel generalisation is ``semantic slicing'' which combines
slicing and abstraction to a specification. Interprocedural semantic
slicing has been implemented in the FermaT transformation system:
an industrial-strength transformation system designed for forward
and reverse engineering, re-engineering and program comprehension.
Publisher's Blurb
In today's fast-changing, competitive environment, having an up-to-date
information system (IS) is critical for all companies and institutions.
Rather than creating a new system from scratch, reengineering is
an economical way to develop an IS to match changing business needs.
Using detailed examples, this practical book gives you methods and
techniques for reengineering systems for flexibility and reliability.
It helps you reengineer a system to continue to provide for business
critical missions as well as achieve a smooth transformation
to an up-to-date software technology environment. What's more,
it shows you how to redevelop a flexible system that can evolve
to meet future business objectives, reduce start time and save
money in the reengineering process.
This unique resource places particular emphasis on the reengineering
of legacy assembler systems using the FermaT workbench, and describes
other widely available general techniques. The book poses
the key questions that you need to address before reengineering
can provide an integrated framework that reveals the answers.
A built workbench is used to illustrate the approach with the application
of real case studies.
Contents:
Constant Software Changes - Legacy Systems. Business Changes.
Software Evolution.
Software Engineering and Software Maintenance - Software and
Software Engineering. Software Maintenance. Summary.
Software Reengineering - Introduction. Taxonomy of
Software Reengineering. Reverse Engineering. Issues of
Reverse Engineering. Current State of Formal Methods in Reengineering.
Classification of Formal Methods. Criteria and Results.
Analysis and Summary.
An Integrated Reengineering Framework - Characteristics of
Legacy Systems. The Reengineering Approach. Wide Spectrum Language.
Process for Reengineering - A system Architecture for Implementing
the Process of Reengineering. Abstracting. Translation of
Source Code to an Intermediate Representation. Restructuring.
Elicitating Business Rules. Abstraction and Abstraction Patterns.
Reusing Components. Retargeting. Measuring Reegineering.
Reengineering Workbench - FermaT Workbench.
Case Studies. Conclusions. References.
ConSUS:
A Light-Weight Program Conditioner
(PDF)
Sebastian Danicic, Mohammed Daoudi, Chris Fox,
Mark Harman, Rob M. Hierons, John R. Howroyd,
Lahcen Ourabya and Martin Ward
Journal of Systems and Software (Elsevier),
Volume 77, Issue 3, September 2005, Pages 241-262.
doi:10.1016/j.jss.2004.03.034
Abstract
Program conditioning consists of identifying and removing a set of
statements which cannot be executed when a condition of interest holds
at some point in a program. It has been applied to problems in
maintenance, testing, re-use and re-engineering. Program
conditioning relies upon both symbolic execution and reasoning about
symbolic predicates. Automation of the process therefore requires
some form of automated theorem proving. However, the use of a
full-power `heavyweight' theorem prover would impose unrealistic
performance constraints.
This paper reports on a lightweight approach to theorem proving
using the FermaT simplify decision procedure. This is used as a
component to ConSUS, a program conditioning system for the Wide
Spectrum Language WSL. The paper describes the symbolic execution
algorithm used by ConSUS, which prunes as it conditions.
The paper also provides empirical evidence that conditioning
produces a significant reduction in program size and, although
exponential in the worst case) the conditioning system, has low degree
polynomial behaviour in many cases, thereby making it scalable to unit
level applications of program conditioning.
Slicing the SCAM Mug:
A Case Study in Semantic Slicing
(PDF)
Martin Ward
Third IEEE International Workshop on Source Code Analysis and Manipulation,
27th September 2003, Amsterdam, Netherlands.
IEEE Computer Society
Abstract
In this paper we describe an improved formalisation of slicing in WSL
transformation theory and apply the result to a particularly
challenging slicing problem: the SCAM mug. We present both syntactic
and semantic slices of the mug program and give semantic slices for
various generalisations of the program. Although there is no
algorithm for constructing a minimal syntactic slice, we show
that it is possible, in the WSL language, to derive a minimal
semantic slice for any program and any slicing criteria.
Abstract
Software reengineering has been described as being ``about as easy as
reconstructing a pig from a sausage'' [.Eastwood 1992.]. But the
development of program transformation theory, as embodied in the
FermaT transformation system, has made this miraculous feat into a
practical possibility. This paper describes the theory behind the
FermaT system and describes a recent migration project in which over
544,000 lines of assembler ``sausage'' (part of a large embedded
system) were transformed into efficient and maintainable structured C
code.
Legacy Assembler Reengineering
and Migration
(PDF)
Martin Ward, Hussein Zedan and Tim Hardcastle
ICSM2004, The 20th IEEE International Conference
on Software Maintenance, 11th-17th Sept Chicago Illinois, USA.
IEEE Computer Society
Abstract
In this paper we describe the legacy assembler problem and describe how
the FermaT transformation system is used to reengineer assembler
systems and migrate from assembler to C and COBOL.
Conditioned Semantic Slicing
via Abstraction and Refinement in FermaT
(PDF)
Martin Ward, Hussein Zedan and Tim Hardcastle
9th European Conference on Software Maintenance and Reengineering (CSMR)
Manchester, UK, March 21-23 2005
Abstract
In this paper we describe an improved formalisation of slicing
in WSL (Wide Spectrum Language) transformation theory and apply
the result to give syntactic and semantic slices for some challenging
slicing problems. Although there is no algorithm for constructing
a minimal syntactic slice, we show that it is possible, in the WSL
language, to derive a minimal semantic slice for any program and any
slicing criteria. We describe the Representation Theorem and show
how it is (partially) implemented in the FermaT transformation system.
The theorem has applications to semantic (or conditioned) slicing, and we
use a combination of abstraction (via the representation theorem),
simplification and refinement plus other program transformations
to develop a powerful conditioned slicing algorithm.
MetaWSL and Meta-Transformations
in the FermaT Transformation System
(PDF)
Martin Ward and Hussein Zedan
The 29th Annual International Computer Software and Applications Conference
COMPSAC 2005, Edinburgh, Scotland, July 26-28, 2005
Abstract
A program transformation is an operation which can be applied to
any program (satisfying the transformations applicability conditions)
and returns a semantically equivalent program. In the FermaT
transformation system program transformations are carried out
in a wide spectrum language, called WSL, and the transformations
themselves are written in an extension of WSL called MetaWSL
which was specifically designed to be a domain-specific language
for writing program transformations . As a result, FermaT is capable
of transforming its own source code via meta-transformations.
This paper introduces MetaWSL and describes some applications
of meta-transformations in the FermaT system.
Abstract
The aim of this paper is to provide a unified mathematical
framework for program slicing which places all slicing work,
for sequential programs, on a sound theoretical foundation.
The main advantage to a mathematical approach is that it is not
tied to a particular representation. In fact the mathematics
provides a sound basis for any particular representation.
We use the WSL (Wide Spectrum Language) program transformation
theory as our framework. Within this framework we define a new
semantic relation, semi-refinement which lies between semantic
equivalence and semantic refinement. Combining this semantic relation,
a syntactic relation (called reduction) and WSL's remove statement,
we can give mathematical definitions for backwards slicing,
conditioned slicing, static and dynamic slicing and semantic slicing
as program transformations in the WSL transformation theory. A novel
technique of "encoding" operational semantics within a denotational
semantics allows the framework to handle "operational slicing".
The theory also enables the concept of slicing to be applied to
nondeterministic programs. These transformations are implemented
in the industry-strength FermaT transformation system.
Abstract
One of the most challenging tasks a programmer can face is
attempting to analyse and understand a legacy assembler system.
Many features of assembler make analysis difficult, and these are
the same features which make migration from assembler to a high
level language difficult. In this paper we discuss the application
of program transformation technology to assist with analysing
and understanding legacy assembler systems. We briefly introduce
the fundamentals of our program transformation theory and program
slicing which generalises to conditional semantic slicing.
These transformations are applied to a large commercial assembler
system to automatically generate high-level abstract descriptions
of the behaviour of each assembler module, with error handling code
sliced away. The assembler system was then migrated to C. The result
is a dramatic improvement in the understandability of the programs:
on average a 6,000 line assembler listing is condensed down to a
132 line high level language abstraction. A second case study,
involving over one million lines of source code from many different
assembler systems showed equally dramatic results.
Properties of Slicing Definitions
(PDF)
Martin Ward
Ninth IEEE International Working Conference on Source Code Analysis and Manipulation
IEEE Computer Society Press, September 2009
Abstract
Weiser's original papers on slicing defined the concept in an
informal way. Since then there have been several attempts to formalise
slicing using various formal methods and semantics of programs.
In this paper we start by defining some properties that a definition
of slicing might reasonably be expected to satisfy and then compare
different definitions of slicing to see which properties are satisfied.
Properties are classified into "floor" requirements:
all slices satisfying the property must be considered as valid,
and "ceiling" requirements: slices which do not satisfy the property
must not be considered valid. Any slicing relation which lies above
a "floor" requirement, or below a "ceiling" requirement, satisfies the
property in question.
The main result of the paper is the proof that, given a certain
property of the programming language (informally: it is possible
to write an infinite loop in the language), two of the most basic
and fundamental properties of slicing are sufficient to completely
characterise the semantic part of the slicing relation.
These properties are: behaviour preservation and truncation.
Abstract
One of the most challenging tasks a programmer can face is attempting
to analyse and understand a legacy assembler system. Many features
of assembler make analysis difficult, and these are the same features
which make migration from assembler to a high level language difficult.
In this paper we describe some of the methods used in the FermaT
transformation system for analysing and migrating assembler systems.
One technique we discuss in detail is to combine a simple dynamic slice,
computed with virtually no overhead, and a static slice implemented
using program transformation technology, to generate very concise
high-level descriptions of the sliced code.
Using Software Metrics to Evaluate
Static Single Assignment Form in GCC
(PDF)
Jeremy Singer, Christos Tjortjis and Martin Ward
International Workshop on GCC Research Opportunities, GROW 2010
5th International Conference on High-Performance Embedded Architectures
and Compilers (HiPEAC), Pisa, Italy, 23rd January 2010
Abstract
Over the past 20 years, static single assignment form (SSA)
has risen to become the compiler intermediate representation of choice.
Compiler developers cite many qualitative reasons for choosing SSA.
However in this study, we present clear quantitative benefits
of SSA, by applying several standard software metrics to
compiler intermediate code in both SSA and non-SSA forms.
The average complexity reduction achieved by using SSA in the GCC
compiler is between 32% and 60% according to our software metrics,
over a set of standard SPEC benchmarks.
Abstract
The transformational programming, method of algorithm derivation starts
with a formal specification of the result to be achieved (which provides
no indication of how the result is to be achieved), plus some informal
ideas as to what techniques will be used in the implementation.
The formal specification is then transformed into an implementation,
by means of correctness-preserving refinement and transformation steps.
The informal ideas are used to guide the selection of transformations
to apply: since they only guide the selection of valid transformations,
the ideas do not themselves have to be formalised.
At any stage in the process, sub-specifications can be extracted
and transformed separately. The main difference between this
approach and the invariant based programming approach (and similar
stepwise refinement methods) is that loops can be introduced and
manipulated while maintaining program correctness and with no need
to derive loop invariants. Another difference is that at every
stage in the process we are working with a correct program:
there is never any need for a separate "verification" step.
These factors help to ensure that the method is capable of scaling
up to the dvelopment of large and complex software systems.
Abstract
In this paper we present a case study in deriving an algorithm
from a formal specification via FermaT transformations.
The general method (which is presented in a separate paper)
is extended to a method for deriving an implementation of a program
transformation from a specification of the program transformation.
We use program slicing as an example transformation, since this
is of interest outside the program transformation community.
We develop a formal specification for program slicing, in the form
of a WSL specification statement, which is refined into a simple
slicing algorithm by applying a sequence of general purpose program
transformations and refinements. Finally, we show how the same
methods can be used to derive an algorithm for semantic slicing.
The main novel contributions of this paper are: (1) Developing a
formal specification for slicing. (2) Expressing the definition
of slicing in terms of a WSL specification statement. (3) By applying
correctness preserving transformations to the specification we can
derive a simple slicing algorithm.
Abstract
The FermaT transformation system has proved to be a very successful
tool for migrating from assembler to high level languages,
including C and COBOL. One of the more challenging aspects facing
automated migration, specifically when the aim is to produce
maintainable code from unstructured "spaghetti" code, is to restructure
assembler subroutines into semantically equivalent high level
language procedures. In this paper we describe some of the many
varieties of assembler subroutine structures and the techniques used
by the migration engine to transform these into structured code.
These transformations require a deep analysis of both control flow
and data flow in order to guarantee the correctness of the result.
Two separate case studies, involving over 10,000 assembler modules
from commercial systems, demonstrate that these techniques
are able to restructure over 99% of hand-written assembler,
with no human intervention required.
Abstract
The transformational programming method of algorithm derivation starts
with a formal specification of the result to be achieved, plus some
informal ideas as to what techniques will be used in the implementation.
The formal specification is then transformed into an implementation,
by means of correctness-preserving refinement and transformation steps,
guided by the informal ideas. The transformation process will
typically include the following stages: (1) Formal specification (2)
Elaboration of the specification, (3) Divide and conquer to handle
the general case (4) Recursion introduction, (5) Recursion removal,
if an iterative solution is desired, (6) Optimisation, if required.
At any stage in the process, sub-specifications can be extracted
and transformed separately. The main difference between this
approach and the invariant based programming approach (and similar
stepwise refinement methods) is that loops can be introduced and
manipulated while maintaining program correctness and with no need
to derive loop invariants. Another difference is that at every
stage in the process we are working with a correct program:
there is never any need for a separate "verification" step.
These factors help to ensure that the method is capable of scaling
up to the development of large and complex software systems.
The method is applied to the derivation of a complex linked list
algorithm and produces code which is over twice as fast as the code
written by Donald Knuth to solve the same problem.
Abstract
Combining formal and agile techniques in software development has the
potential to minimize change-related problems.
Abstract
Since the original development of program slicing in 1979 there have
been many attempts to define a suitable semantics, which will precisely
define the meaning of a slice. Particular issues include handling
termination and nontermination, slicing nonterminating programs,
and slicing nondeterministic programs. In this paper we review
and critique the main attempts to construct a semantics for slicing
and present a new operational semantics, which correctly handles
slicing for nonterminating and nondeterministic programs.
We also present a modified denotational semantics, which we prove to be
equivalent to the operational semantics. This provides programmers
with two different methods to prove the correctness of a slice or a
slicing algorithm and means that the program transformation theory
and FermaT transformation system, developed last 25 years of research,
and which has proved so successful in analyzing terminating programs,
can now be applied to nonterminating interactive programs.
Abstract
Software maintenance takes up a disproportionately large amount
of time in the modern software life cycle. One of the common problems
is understanding the original code that is being restructured
and improved and this is especially true with low-level code.
This paper investigates the results and properties of an
automated process that can raise the abstraction level
of code from low-level operations to high-level structures.
The process is made of independent components and can be adapted
to different scenarios. The automated improvements implementation
relies on the program transformation system FermaT and its catalogue
of semantics-preserving transformations. The process uses hill
climbing and a metric for the fitness function of the programs.
This component was made to work on general inputs, without explicit
knowledge of the type of origin of the program. The paper explores
how different inputs are actually handled by the system, what are
the properties and how these can be used for further improvements.
Two main types of inputs are shown, x86 assembly and MicroJava bytecode.
These two have many operational differences, and the translator
tools introduce some more, but nonetheless, the same process can
handle all of these and, on average, improve the Structure metric
(a good approximation of the complexity of the code) by around 85%.
Back to my home page.
Last modified: 17th April 2024
Martin Ward,
Email: martin@gkc.org.uk