How to Quantify Quality: Finding Scales of Measure

By Tom Gilb

Overload, 13(70), December 2005


'Scales of measure' are fundamental to the definition of all scalar system attributes; that is, to all the performance attributes (such as reliability, usability and adaptability), and to all the resource attributes (such as financial budget and time). A defined scale of measure allows you to numerically quantify such attributes.

'Scales of measure' form a central part of Planguage, a specification language and set of methods, which I have developed over many years.

This paper describes how you can develop your own tailored scales of measure for the specific system attributes that are important to your organization or system. You cannot rely on being 'given the answer' about how to quantify. You will lose control over your current vital system performance concerns if you cannot, or do not, quantify your critical attributes.

Scales of Measure and Meters

Scales of measure (Scales) are essential to quantify system attributes. A Scale specifies an operational definition of 'what' is being measured and it states the units of measure. All estimates or measurements are made with reference to the Scale.

The practical ability to measure where you are on a Scale (that is, to be able to establish the numeric level) is also important. A Meter (sometimes known as a 'Test') is a practical method for measuring. A Scale can have several Meters.

Tag: <assign a tag name to this Scale>.
Version: <date of the latest version or change>.
Owner: <role/email of who is responsible for updates/changes>.
Status: <Draft, SQC Exited, Approved>.
Scale: <specify the Scale with defined [qualifiers].>.
Alternative Scales: <reference by tag or define other Scales of interest as alternatives and supplements>.
Qualifier Definitions: <define the scale qualifiers, like 'for defined [Staff]', and list the options, like {CEO, Finance Manager, Customer}.>.
Meter Options: <suggest Meter(s) appropriate to the Scale>.
Known Usage: <reference projects & specifications where this Scale was actually used in practice with designers' names>.
Known Problems: <list known or perceived problems with this Scale>.
Limitations: <list known or perceived limitations with this Scale>.

Figure 1. Draft template

Finding and Developing Scales of Measure and Meters

The basic advice for identifying and developing scales of measure (Scales) and meters (Meters) for scalar attributes is as follows:

  1. Try to re-use previously defined Scales and Meters.

  2. Try to modify previously defined Scales and Meters.

  3. If no existing Scale or Meter can be reused or modified, use common sense to develop innovative, homegrown quantification ideas.

  4. Whatever Scale or Meter you start off with, you must be prepared to learn. Obtain and use early feedback, from colleagues and from field tests, to redefine and improve your Scales and Meters.

Tag: Ease of Access.
Version: August 11, 2003.
Owner: Rating Model Project (Bill).
Scale: Speed for a defined [Employee Type] with defined [Experience] to get a defined [Client Type] operating successfully from the moment of a decision to use the application.
Alternative Scales: None known yet.
Qualifier Definitions:
Employee Type: {Credit Analyst, Investment Banker, …}.
Experience: {Never, Occasional, Frequent, Recent}.
Client Type: {Major, Frequent, Minor, Infrequent}.
Meter Options: EATT: Ease of Access Test Trial. "This tests all frequent combinations of qualifiers at least twice. Measure speed for the combinations."
Known Usage: Project Capital Investment Proposals [2001, London].
Known Problems: None recorded yet.
Limitations: None recorded yet.

Figure 2. Use of the template

Reference Library for Scales of Measure

'Reuse' is an important concept for sharing experience and saving time when developing Scales. You need to build reference libraries of your 'standard' scales of measure. Remember to maintain details supporting each 'standard' Scale, such as Source, Owner, Status and Version (Date). If the name of a Scale's designer is also kept, you can probably contact them for assistance and ideas.

Figure 1 is a draft template with <hints>, for specification of scales of measure in a reference library. Figure 2 is an example of the use of this template.
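
A reference library of this kind could also be held in software. The following Python sketch is only an illustration under stated assumptions: the class name ScaleEntry, its field names, the dictionary used as the library and the find_scale helper are all hypothetical and not part of Planguage. The field values are transcribed from Figure 2.

```python
from dataclasses import dataclass, field


@dataclass
class ScaleEntry:
    """One reference-library record, mirroring the template fields of Figure 1."""
    tag: str
    version: str
    owner: str
    scale: str
    status: str = "Draft"
    alternative_scales: list = field(default_factory=list)
    qualifier_definitions: dict = field(default_factory=dict)
    meter_options: list = field(default_factory=list)
    known_usage: list = field(default_factory=list)
    known_problems: list = field(default_factory=list)
    limitations: list = field(default_factory=list)


# Values transcribed from Figure 2. Status is not given in the figure, so the
# "Draft" default is an assumption of this sketch. The figure's Employee Type
# list is open-ended; only the two options it names are included here.
ease_of_access = ScaleEntry(
    tag="Ease of Access",
    version="August 11, 2003",
    owner="Rating Model Project (Bill)",
    scale=("Speed for a defined [Employee Type] with defined [Experience] "
           "to get a defined [Client Type] operating successfully from the "
           "moment of a decision to use the application."),
    qualifier_definitions={
        "Employee Type": ["Credit Analyst", "Investment Banker"],
        "Experience": ["Never", "Occasional", "Frequent", "Recent"],
        "Client Type": ["Major", "Frequent", "Minor", "Infrequent"],
    },
    meter_options=["EATT: Ease of Access Test Trial"],
    known_usage=["Project Capital Investment Proposals [2001, London]"],
)

# The library is simply a mapping from tag to entry (hypothetical design).
library = {ease_of_access.tag: ease_of_access}


def find_scale(tag):
    """Return the stored entry for a tag, or None if nothing can be reused."""
    return library.get(tag)
```

A lookup that returns None corresponds to the case where no existing Scale can be reused and a new one has to be developed.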

Reference Library for Meters

Another important standards library to maintain is a library of 'Meters.' 'Off the shelf' Meters from standard reference libraries can save time and effort since they are already developed and are more or less 'tried and tested' in the field.

It is natural to reference suggested Meters within definitions of specific scales of measure (as in the template and example above). Scales and Meters belong intimately together.

Managing 'What' You Measure

It is a well-known paradigm that you can manage what you can measure. If you want to achieve something in practice, then quantification, and later measurement, are essential first steps for making sure you get it. If you do not make critical performance attributes measurable, then it is likely to be less motivating for people to find ways to deliver necessary performance levels. They have no clear targets to work towards, and there are no precise criteria for judgment of failure or success.

Practical Example: Scale Definition

'User-friendly' is a popular term. Can you specify a scale of measure for it?

Here is my advice on how to tackle developing a definition for this quality.

  1. If we assume there is no 'off-the-shelf' definition that could be used, then you need to start describing the various aspects of the quality that are of interest.

    There are always many distinct dimensions to qualities such as usability, maintainability, security, adaptability and their like [ Gilb3 ]. (Suggestion: Try listing about 5 to 15 aspects of some selected quality that is critical to your project.)

    For this example, let's select 'environmentally friendly' as the one of the many aspects that we are interested in, and we shall work on this below.

  2. Invent and specify a Tag: 'Environmentally Friendly' is sufficiently descriptive. Ideally, it could be shorter, but it is very descriptive left as it is. We indicate a formally defined concept by capitalizing the tag.

    Note, we usually don't explicitly specify 'Tag: ' but this sometimes makes the tag identity clearer.

  3. Check there is an Ambition statement, which briefly describes the level of requirement ambition. 'Ambition' is one of the defined Planguage parameters.

  4. Ensure there is general agreement by all the involved parties with the Ambition definition. If not, ask for suggestions for modifications or additions to it. Here is a simple improvement to my initial Ambition statement. It actually introduces a 'constraint'.

  5. Using the Ambition description, define an initial Scale that is somehow quantifiable (meaning - you can meaningfully attach a number to it). Consider what will be sensed by the stakeholders if the level of quality changes. What would be a visible effect if the quality improved? My initial, unfinished attempt at finding a suitable Scale captured the ideas of change occurring, and of things getting better or worse:

    However, I was not happy with it, so I made a second attempt. I refined the Scale by expanding it to include the ideas of specific things being affected in specific places over given times:

    Scale: The percentage (%) destruction or reduction of defined [Thing] in defined [Place] during a defined [Time Period] as caused by defined [Environmental Changes].

    This felt better. In practice, I have added more [qualifiers] into the Scale, to indicate the variables that must be defined by specific things, places and time periods whenever the Scale is used.

  6. Determine if the term needs to be defined with several different scales of measure, or whether one like this, with general parameters, will do. Has the Ambition been adequately captured? To determine what's best, you should list some of the possible sub-components of the term (that is, what can it be broken down into, in detail?). For example:

    Thing: {Air, Water, Plant, Animal}.

    This example means: 'Thing' is defined as the set of things: Air, Water, Plant and Animal (which, since they are all four capitalized, are themselves defined elsewhere).

    Or alternatively, instead of just the colon after the tag, '=' or the more explicit Planguage parameter, 'Consists Of' can be used to make this notation more immediately intelligible to novices in reading Planguage:

    Then consider whether your defined Scale enables the performance levels for these sub-components to be expressed. You may have overlooked an opportunity, and may want to add one or more qualifiers to that Scale. For example, we could potentially add the scale qualifiers '… under defined [Environmental Conditions] in defined [Countries] …' to make the scale definition even more explicit and more general.

    Scale qualifiers (like '… defined [Place] …') have the following advantages:

    • they add clarity to the specifications

    • they make the Scales themselves more reusable in other projects

    • they make the Scale more useful in this project: specific benchmarks, targets and constraints can then be specified for any interesting combination of scale variables (such as, 'Thing = Air').

  7. Start working on a Meter - a specification of how we intend to test or measure the performance of a real system with respect to the defined Scale. Remember, you should first check there is not a standard or company reference library Meter that you could use. Try to imagine a practical way to measure things along the Scale, or at least sketch one out. My example is only an initial rough sketch defined by a {set} of three rough measurement concepts. These at least suggest something about the quality and costs of such a measuring process.

    Meter: {scientific data where available, opinion surveys, admitted intuitive guesses}.

    The Meter must always explicitly address a particular Scale specification. It will help confirm your choice of Scale as it will provide evidence that practical measurements can feasibly be obtained on the given scale of measure.

  8. Now try out the Scale specification by trying to use it to specify some useful levels on the Scale. Define some reference points from the past (Benchmarks) and some future requirements (Targets and Constraints). See Figure 3 for an example.

    Environmentally Friendly:
    Ambition: A high degree of protection, compared to competitors, over the short-term and the long-term,
    in near and remote environments for health and safety of living things, which does not reduce the protection
    already present in nature.

    Scale: The percentage (%) destruction or reduction of defined [Thing] in defined [Place] during a defined
    [Time Period] as caused by defined [Environmental Changes].
    ============= Benchmarks =================
    Past [Time Period = Next Two Years, Place = Local House, Thing = Water]: 20% <- intuitive guess.
    Record [Last Year, Cabin Well, Thing = Water]: 0% <- declared reference point.
    Trend [Ten to Twenty Years From Now, Local, Thing = Water]: 30% <- intuitive. "Things seem to be getting worse."
    ============ Scalar Constraint ==========
    Fail [End Next Year, Thing = Water, Place = Eritrea]: 0%. "Not get worse."
    =============== Targets ===================
    Wish [Thing = Water, Time = Next Decade, Place = Africa]: <3% <- Pan African Council Policy.

    Goal [Time = After Five Years, Place = <our local community>, Thing = Water]: <5%.

    Figure 3. Benchmarks, targets and constraints

    If this seems unsatisfactory, then maybe I can find another, more specific, scale of measure? Maybe use a 'set' of different Scales to express the measured concept better? See examples below.

    Here is an example of a single more-specific Scale:

    Figure 4 shows an example of another, more specific set of Scales for the 'Environmentally Friendly' example. They are perhaps a complementary set for expressing a complex Environmentally Friendly idea.

    Environmentally Friendly:
    Ambition: A high degree of protection, compared to competitors, over the short-term and the long-term,
    in near and remote environments for health and safety of living things, which does not reduce the protection
    already present in nature.
    ----Some scales of measure candidates - they can be used as a complementary set ---
    Air: Scale: % of days annually when <air> is <fit for all humans to breathe>.
    Water: Scale: % change in water pollution degree as defined by UN Standard 1026.
    Earth: Scale: Grams per kilo of toxic content.
    Predators: Scale: Average number of <free-roaming predators> per square km, per day.
    Animals: Scale: The percentage (%) reduction of any defined [Living Creature] who has a defined [Area]
    as their natural habitat.

    Figure 4. Alternative scales

    Many different scales of measure can be candidates to reflect changes in a single critical factor.

    Environmentally Friendly is now defined as a 'Complex Attribute,' because it consists of a number of 'elementary' attributes: {Air, Water, Earth, Predators, Animals}. A different scale of measure now defines each of these elementary attributes. Using these Scales we can add corresponding Meters, benchmarks (like Past), constraints (like Fail), and target levels (like Goal), to describe exactly how Environmentally Friendly we want to be.
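
The decomposition into elementary attributes can be pictured as a small data structure. The Python sketch below is a minimal illustration, not a Planguage tool: the class names ElementaryAttribute and ComplexAttribute, their fields and the missing helper are hypothetical. The Ambition and the Scales are taken from Figure 4; no levels are filled in, because Figure 4 does not specify any.

```python
from dataclasses import dataclass, field


@dataclass
class ElementaryAttribute:
    """An attribute defined by exactly one Scale."""
    tag: str
    scale: str
    meters: list = field(default_factory=list)
    levels: dict = field(default_factory=dict)  # e.g. {"Past": ..., "Goal": ...}


@dataclass
class ComplexAttribute:
    """An attribute defined as a set of elementary attributes."""
    tag: str
    ambition: str
    parts: list = field(default_factory=list)

    def missing(self, parameter):
        """Tags of the elementary attributes with no level for `parameter`."""
        return [part.tag for part in self.parts if parameter not in part.levels]


environmentally_friendly = ComplexAttribute(
    tag="Environmentally Friendly",
    ambition=("A high degree of protection, compared to competitors, over the "
              "short-term and the long-term, in near and remote environments "
              "for health and safety of living things, which does not reduce "
              "the protection already present in nature."),
    parts=[
        ElementaryAttribute(
            "Air",
            "% of days annually when <air> is <fit for all humans to breathe>."),
        ElementaryAttribute(
            "Water",
            "% change in water pollution degree as defined by UN Standard 1026."),
        ElementaryAttribute(
            "Earth",
            "Grams per kilo of toxic content."),
        ElementaryAttribute(
            "Predators",
            "Average number of <free-roaming predators> per square km, per day."),
        ElementaryAttribute(
            "Animals",
            "The percentage (%) reduction of any defined [Living Creature] "
            "who has a defined [Area] as their natural habitat."),
    ],
)
```

Calling environmentally_friendly.missing("Goal") lists all five tags, which is a reminder that each elementary attribute still needs its own benchmarks, constraints and targets.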

Level of Specification Detail. How much detail you need to specify depends on what you want control over, and how much effort it is worth. The basic paradigm of Planguage is that you should only elect to do what pays off for you. You should not build a more detailed specification than is meaningful in terms of your project and economic environment. Planguage tries to give you sufficient power of articulation to control both complex and simple problems. You need to scale up, or down, as appropriate. This is done through common sense, intuition, experience and organizational standards (reflecting experience). But, if in doubt, go into more detail. History says we have tended in the past to specify too little detail about requirements. The result consequently has often been to lose control, which costs a lot more than the extra investment in requirement specification.

Language Core: Scale Definition

Now let's discuss the specification of Scales in more detail, particularly the use of qualifiers.

The Central Role of a Scale within Scalar Attribute Definition. The specified Scale of an elementary scalar attribute is used (re-used!) within all the scalar parameter specifications of the attribute (that is, within all the benchmarks, the constraints and the targets). In other words, a Scale parameter specification is the heart of a specification. Scale is essential to support all the related scalar level parameters: for example Past, Record, Trend, Goal, Budget, Stretch, Wish, Fail and Survival.

Each time a different scalar level parameter is specified, the Scale specification dictates what has to be defined numerically and in terms of Scale Qualifiers (like 'Staff = Financial Manager'). And then later, each time a scalar level parameter definition is read, the Scale specification itself has to be referenced to 'interpret' the meaning of the corresponding scale level specification. So the Scale is truly central to a scalar definition. For example, 'Goal [Staff = Financial Manager]: 23%.' only has meaning in the context of the corresponding scale: for example 'Scale: % of defined [Staff] attending the meeting'. Well-defined scales of measure are well worth the small investment to define them, to refine them, and to re-use them.
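
The dependence of a level on its Scale can be sketched in a few lines of Python. This is an illustrative sketch only: the function names embedded_qualifiers and interpret, and the wording of the returned sentence, are assumptions of the example and not part of Planguage. The Scale and the Goal are the ones quoted above.

```python
import re

# Finds the name inside each 'defined [Name]' phrase of a Scale text.
QUALIFIER_PATTERN = re.compile(r"defined \[([^\]:]+)")


def embedded_qualifiers(scale_text):
    """Return the qualifier names declared as 'defined [Name]' in a Scale."""
    return [name.strip() for name in QUALIFIER_PATTERN.findall(scale_text)]


def interpret(parameter, qualifiers, value, scale_text):
    """Spell out a level by substituting its qualifier values into the Scale.

    Raises ValueError if the level leaves an embedded qualifier undefined,
    because the number would then have no clear meaning.
    """
    missing = [q for q in embedded_qualifiers(scale_text) if q not in qualifiers]
    if missing:
        raise ValueError(f"{parameter} does not define qualifier(s): {missing}")
    text = scale_text
    for name, chosen in qualifiers.items():
        text = text.replace(f"defined [{name}]", chosen)
    return f"{parameter}: {value} on the Scale '{text}'"


scale = "% of defined [Staff] attending the meeting"
print(interpret("Goal", {"Staff": "Financial Manager"}, "23%", scale))
```

Without the Scale text, the call has nothing to substitute into, which mirrors the point that '23%' alone carries no meaning.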

Specifying Scales using Qualifiers. The scalar attributes (performance and resource) are best measured in terms of specific times, places and events. If we fail to do this, they lose meaning. People wrongly guess other times, places and events than you intend, and cannot relate their experiences and knowledge to your numbers. If we don't get more specific by using qualifiers, then performance and resource continue to be vague concepts, and there is ambiguity (which times? which places? which events?).

Further, it is important that the set of different performance and resource levels for different specific times, places and events is identified. It is likely that the levels of the performance and resource requirements will differ across the system depending on such things as time, location, role and system component.

Embedded Qualifiers within a Scale. A Scale specification can set up useful 'scale qualifiers' by declaring embedded scale qualifiers, using the format 'defined [<qualifier>]'.

Decomposing complex performance and resource ideas, and finding market-segmenting qualifiers for differing target levels is a key method of competing for business.

A Scale specification can also declare default qualifier values, which apply if not overridden, using the format 'defined [<qualifier>: default: <User-defined Variable or numeric value>]'. For example, […default: Novice].
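
How defaults might be resolved can be sketched as follows. This Python fragment is an assumption-laden illustration: the function names declared_qualifiers and resolve are hypothetical, and so is the example Scale with its qualifiers [User Type] and [Task], which does not come from any real specification. Only the 'defined [<qualifier>: default: <value>]' format is taken from the text above.

```python
import re

# Matches 'defined [Name]' and 'defined [Name: default: Value]'.
QUALIFIER = re.compile(
    r"defined \[([^\]:]+?)(?:\s*:\s*default\s*:\s*([^\]]+))?\]"
)


def declared_qualifiers(scale_text):
    """Map each embedded qualifier name to its default (None if it has none)."""
    declared = {}
    for name, default in QUALIFIER.findall(scale_text):
        declared[name.strip()] = default.strip() or None
    return declared


def resolve(scale_text, given):
    """Combine the values given in a level with the defaults of the Scale."""
    resolved = {}
    for name, default in declared_qualifiers(scale_text).items():
        if name in given:
            resolved[name] = given[name]      # explicit value overrides default
        elif default is not None:
            resolved[name] = default          # fall back to the declared default
        else:
            raise ValueError(f"qualifier [{name}] has no value and no default")
    return resolved


# Hypothetical Scale, invented for this sketch.
example_scale = ("Time for a defined [User Type: default: Novice] "
                 "to complete a defined [Task]")
```

Here resolve(example_scale, {"Task": "Login"}) would fill in 'Novice' for [User Type], while leaving [Task] out would raise an error because that qualifier has no default.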

Additional Qualifiers. However, embedded qualifiers should not stop you adding any other useful additional qualifiers later, as needed, during scale-related specification (such as Goal or Meter). But, if you do find you are adding the same type of parameters in almost all related specifications, then you might as well design the Scale to include those qualifiers. A Scale should be built to ensure that it forces the user to define the critical information needed to understand and control a critical performance or resource attribute. This implies that scale qualifiers serve as a checklist of good practice in defining scalar level specifications, such as Past and Goal.

Here is an example of how locally defined qualifiers (see the Goal specification below) can make a quality specification more specific. In this example we are going to see how a requirement can be conditional upon an event. If the event is not true, the requirement does not apply.

First, some basic definitions are required (Note that 'Basis', 'Source' and 'State' are Planguage parameters):

Assumption A: Basis [This Financial Year]: Norway is still
not a full member of the European Union.
EU Trade: Source: Euro Union Report "EU Trade in Decade
Positive Trade Balance: State [Next Financial Year]:
Norwegian Net Foreign Trade Balance has Positive
Total to Date.

Now we apply those definitions below:

Quality A:
Type: Quality Requirement.
Scale: The percentage (%) by value of Goods delivered that
are returned for repair or replacement by consumers.
Meter [Development]: Weekly samples of 10,
[Acceptance]: 30 day sampling at 10% of representative cases,
[Maintenance]: Daily sample of largest cost case.
Fail [European Union, Assumption A]: 40% <- European
Economic Members.
Goal [EU and EEU members, Positive Trade Balance]:
50% <- EU Trade.

The Fail and the Goal requirements are now defined partly with the help of qualifiers. The Goal to achieve 50% (or more, is implied) is only a valid plan if 'Positive Trade Balance' is true. The Fail level requirement of 40% (or worse, less, is implied) is only valid if 'Assumption A' is true. All qualifier conditions must be true for the level to be valid.
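
The rule that all qualifier conditions must be true can be expressed as a simple check. The Python sketch below is illustrative only: the Level class, the is_valid function and the 'situation' set are hypothetical names, and the situation shown is invented for the example. The Fail and Goal values and their qualifiers are taken from the Quality A specification above.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """A scalar level with the qualifier conditions it depends on."""
    parameter: str
    conditions: list
    value: float


def is_valid(level, true_conditions):
    """A level applies only when every one of its qualifier conditions holds."""
    return all(condition in true_conditions for condition in level.conditions)


fail = Level("Fail", ["European Union", "Assumption A"], 40.0)
goal = Level("Goal", ["EU and EEU members", "Positive Trade Balance"], 50.0)

# Hypothetical situation: Assumption A holds, but the trade balance
# condition 'Positive Trade Balance' is not (yet) true.
situation = {"European Union", "Assumption A", "EU and EEU members"}

for level in (fail, goal):
    state = "applies" if is_valid(level, situation) else "does not apply"
    print(f"{level.parameter} {level.value}% {state}")
```

In this invented situation the Fail level applies and the Goal does not, because one of the Goal's conditions is not true.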

Principles: Scale Specification

  1. The Principle of 'Defining a Scale of Measure'

    If you can't define a scale of measure, then the goal is out of control.

    Specifying any critical variable starts with defining its units of measure.

  2. The Principle of 'Quantification being Mandatory for Control'

    If you can't quantify it, you can't control it. [ 1 ]

    If you cannot put numbers on your critical system variables, then you cannot expect to communicate about them, or to control them.

  3. The Principle of 'Scales should control the Stakeholder Requirements'

    Don't choose the easy Scale, choose the powerful Scale.

    Select scales of measure that give you the most direct control over the critical stakeholder requirements. Choose the Scales that lead to useful results.

  4. The Principle of 'Copycats Cumulate Wisdom'

    Don't reinvent Scales anew each time - store the wisdom of other Scales for reuse.

    Most scales of measure you will need will be found somewhere in the literature, or can be adapted from existing literature.

  5. The Cartesian Principle

    Divide and conquer said René - put complexity at bay.

    Most high-level performance attributes need decomposition into the list of sub-attributes that we are actually referring to. This makes it much easier to define complex concepts, like 'Usability', or 'Adaptability,' measurably.

  6. The Principle of 'Quantification is not Measurement'

    You don't have to measure in order to quantify!

    There is an essential distinction between quantification and measurement. "I want to take a trip to the moon in nine picoseconds" is a clear requirement specification without measurement. The well-known problems of measuring systems accurately are no excuse for avoiding quantification. Quantification allows us to communicate about how good scalar attributes are or can be - before we have any need to measure them in the new systems.

  7. The Principle of 'Meters Matter'

    Measurement methods give real world feedback about our ideas.

    A 'Meter' definition determines the quality and cost of measurement on a scale; it needs to be sufficient for control and for our purse.

  8. The Principle of 'Horses for Courses' [ 2 ]

    Different measuring processes will be necessary for different points in time, different events, and different places. [ 3 ]

  9. The Principle of 'The Answer always being '42' ' [ 4 ]

    Exact numbers are ambiguous unless the units of measure are well-defined and agreed.

    Formally defined scales of measure avoid ambiguity. If you don't define scales of measure well, the requirement level might just as well be an arbitrary number.

  10. The Principle of 'Being Sure About Results'

    If you want to be sure of delivering the critical result - then quantify the requirement.

    Critical requirements can hurt you if they go wrong - and you can always find a useful way to quantify the notion of 'going right' to help you avoid that outcome.


This paper has tried to show how to define scales of measure for system attributes. It has also introduced the pragmatic detail available in Planguage for such specification, and for exploiting scales of measure to define benchmarks, targets and constraints.

Scales of measure are an essential means towards quantifying and getting control of your critical system attributes.


[Gilb] Gilb, Tom, Principles of Software Engineering Management. Addison-Wesley, 1988, 442 pages, ISBN 0-201-19246-2. See particularly page 150 (Usability) and Chapter 19 Software Engineering Templates.

[Gilb-] Gilb, Tom and Graham, Dorothy, Software Inspection. Addison-Wesley, 1993, 471 pages, ISBN 0-201-63181-4.

[Gilb2] Gilb, Tom, Competitive Engineering. Elsevier, 2005. (This book defines Planguage.)

[Gilb3] Gilb, Tom. Various free papers, slides, and manuscripts on . The manuscripts include: (1) Quantifying Quality (Book manuscript draft Summer 2004, available from by request if not on website yet.) (2) Requirements Engineering (about 500 slides giving examples and theory.) Version April 15 2003 for INCOSE June Wash DC, updated Dec 14 2004. Paper accepted as a talk at INCOSE 2003, Washington DC, and published in the CD Proceedings.

[ 1 ] Paraphrasing a well-known old saying.

[ 2 ] 'Horses for courses' is a UK expression indicating something must be appropriate for use, fit for purpose.

[ 3 ] There is no universal static scale of measure. You need to tailor them to make them useful.

[ 4 ] Concept made famous by Douglas Adams in The Hitchhiker's Guide to the Galaxy.
