4.1: What Is Test and Evaluation?

Difficulty Level: At Grade Created by: CK-12

Lesson Objectives

  • Define Test & Evaluation (T&E), and explain why it is conducted on a new product
  • Understand the difference between a test and an evaluation
  • Describe and give examples of technical T&E
  • Describe and give examples of user T&E
  • Describe effectiveness and suitability
  • Define risk as the term is used in the T&E process
  • Discuss the need for test planning


developer
A person who designs and manufactures a product or system. May also be the one selling the product to a customer. For example, the developer may be a car manufacturer or the maker of electronic games and software.
customer
The person who buys the product and actually uses it. In the context of this lesson, the customer can also be called the user.
effectiveness
The extent to which the goals of the system are attained, or the degree to which a system can be used to achieve its intended purpose or a specific set of requirements.1
empirical information
Information gained by means of observation or experimentation. Empirical data are data produced by an experiment or observation. A central concept in modern science and the scientific method is that all evidence must be empirical, or empirically based. Evidence or consequences must be observable and repeatable.2
environment
The range of conditions under which a system can be expected to operate. This can mean adverse weather, heat or cold, rough terrain, or poor lighting conditions. The environment can also include various systems and components that are needed to complete a larger system or product, such as a satellite that is needed to send a GPS signal to a navigation system in your car. In the context of T&E, environment will include the actual conditions in which and under which a product will be used and the interrelationships that exist among them.3
integrated testing
A process meant to put the product in a "real-world" environment as soon as possible and emphasize the early integration of user testing into technical test planning and execution.4
suitability
The qualities of a product or system that make it appropriate for use by a customer. Some examples would be initial cost, reliability, ease of use, cost to operate, cost to maintain, and length of time it is able to perform its stated task, among others. Suitability for a customer should be considered as part of a product’s overall effectiveness.
system under test (SUT)
The actual thing that is to be tested. It can be new hardware, updated software, or a component of a larger system. When it is an integral part of a larger system or family of systems, the overall interaction as a unit is referred to as a system-of-systems. We will discuss more of this system-of-systems concept in the lesson "Introduction to T&E of System-of-Systems and Interoperability," below.
qualitative data
Information based on some characteristic rather than on some quantity or measured value. In T&E, qualitative data may be based on subjective inputs or, in some cases, opinions; for example, ease of use, comfort, and aesthetics.
quantitative data
Information based on quantities or quantifiable inputs. Quantitative data has objective properties and is based on specific and observable measurements; for example, dimensions, weight, and capacity.

Check Your Understanding

Test and Evaluation (T&E) is a fundamental aspect of developing a new product. T&E is conducted in some form by the developers of almost all the products we use. T&E can be a simple and inexpensive process, or it can be complicated and expensive. Most of us are exposed to the effects of T&E, but many do not understand why or how it is conducted. To understand how live, virtual, and constructive models and simulations can be used for T&E, the student will first need to understand the basics of T&E.


This lesson will introduce the student to the basics of Test and Evaluation (T&E). It is not meant to provide detailed knowledge of the T&E process, but rather to expose students to the necessary aspects of T&E, so they understand the relationships of T&E and live, virtual, and constructive simulations. We will discuss live, virtual, and constructive simulations in lesson two of this chapter.

Lesson Content


Test and Evaluation (T&E) is generally conducted by the developer of a product on behalf of the customer, that is, the person who will actually buy or use the product. The developer does not want to sell the customer a product that does not work, so the developer tests the product to ensure that it works as intended in the conditions, or environment, in which the customer will actually use it. Most developers will follow a defined T&E process so they can identify deficiencies and problems with their products and then correct those problems before the product is released or sold to the customer.

What Is a Test?

Technically speaking, a test is a procedure designed to obtain, verify, or provide data for the evaluation of the performance and suitability of systems, sub-systems, components, and other equipment items. In short, a test is an event that obtains raw data to be used to measure specific or individual performance factors. A test can be very resource intensive, in that it can require a large amount of manpower and equipment to obtain adequate and credible data.

Discovering problems when building complex products is a normal part of any development process, and testing is perhaps the most effective tool for discovering such problems. Testing is the main instrument used to gauge the progress being made when an idea or concept is translated into an actual product. Ideally, testing progresses from early laboratory testing of technologies, to component and sub-system testing, through testing of a complete system, and finally to trial use in the customer’s hands.5

http://www.youtube.com/watch?v=5xlObdXF8VE Go to this link to view the T&E process for a new commercial jet aircraft engine. GE90-115B Gas Turbine Jet Engine Testing & Evaluation.

http://www.bfbs.com/news/england/military-testing-facility-where-bullets-meet-their-match-45508.html Go to this link to view the process for testing new and upgraded body armor used by British troops in combat.

What Is an Evaluation?

An evaluation refers to what is learned from a test. An evaluation denotes the process whereby raw data obtained during a test are logically assembled, analyzed, and compared to expected performance to aid in systematic decision-making. An evaluation results in analyzed information and may involve the review and analysis of qualitative or quantitative data obtained from design reviews, hardware inspections, modeling and simulation (M&S), hardware and software testing, metrics review, and actual use of equipment. An evaluation can be intellectually intensive and is used to draw conclusions by analyzing data to determine how the data from tests, models and simulations relate and interact. In general, most evaluations result in a formal written report on findings and recommendations to the developer.6

The Federal Aviation Administration evaluated data from its own tests and from Boeing’s tests of the 777-200 airliner to decide whether to issue the certificate of airworthiness needed to sell an aircraft on the commercial market. Automobile manufacturers evaluate data from tests of their products before they decide to release those products for sale. They want to avoid recalls such as those experienced by some companies over the last few years.
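
To make the distinction between a test and an evaluation concrete, here is a minimal sketch in Python. It is not drawn from any actual T&E program: the light bulb lifetimes, the 1,000-hour requirement, and the function name `evaluate` are all hypothetical values chosen for illustration. The list of measurements stands in for the raw data a test produces, and the function stands in for the evaluation that analyzes those data against expected performance.

```python
from statistics import mean

# Hypothetical raw data from a test: hours each bulb lasted before failing.
raw_test_data = [1040, 980, 1105, 1010, 995]

# Hypothetical expected performance: bulbs should average at least 1,000 hours.
REQUIRED_MEAN_HOURS = 1000


def evaluate(measurements, required_mean):
    """Analyze raw test data and compare it with expected performance."""
    observed_mean = mean(measurements)
    return {
        "observed_mean": observed_mean,
        "shortest_life": min(measurements),
        "meets_requirement": observed_mean >= required_mean,
    }


report = evaluate(raw_test_data, REQUIRED_MEAN_HOURS)
print(report)
```

In this sketch the test by itself says nothing about whether the bulbs are good enough; only the evaluation step, which compares the analyzed data with the requirement, supports a decision. A real evaluation would usually weigh several measures and end in a written report rather than a single pass/fail flag.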

What Is Test and Evaluation (T&E)?

When test and evaluation are put together, they become a process by which a system or components are exercised and the results analyzed to provide performance-related information. Test and Evaluation (T&E) is used at a variety of levels, including basic technology, components and subsystems, a complete system or product, and even several systems working together. The information has many uses, including design decisions, production decisions, risk identification, risk mitigation, and gathering of empirical data to validate models and simulations. T&E is a process that ensures a product or system meets its designed capability by enabling an assessment of technical performance, specifications, and system maturity. This allows the developer to determine whether the product or system performs correctly and is appropriate for use by the customer. The T&E process is often repeated as the system evolves from models to components to production articles and complete systems.7

It is important that the T&E is conducted using the intended operating environment for the product as early in the design process as feasible. That will present challenges for those who plan and conduct the T&E, but if T&E is not done under realistic conditions, then problems with the product may not surface until the product is in the hands of the customer. If this occurs, the developer will have the additional expense of correcting the problem after the product is on the market for sale to the customer.

Many commercial companies have found innovative ways of conducting T&E that help them avoid being surprised by problems late in a product’s development. In general, most product testing is conducted by organizations separate from those responsible for managing product development. The intent is that the separate or independent organization will be more objective in T&E of the product than the developing organization.8, 9

What Is the Purpose of T&E?

The fundamental purpose of T&E is to provide knowledge to assist in managing the risks involved in developing, producing, operating, and sustaining systems and capabilities.10 T&E is generally done by the developer, but sometimes can be done on behalf of the customer who will actually use the product. T&E provides knowledge of system capabilities and limitations to the developer for optimizing product performance and suitability in real world use. T&E enables the product’s developers to learn about limitations (technical or operational) of the system under development, so that they can be resolved prior to production and deployment of the product or system to the customer.

Commercial companies have learned to make T&E an integral part of their product development process. For example, Boeing experienced significant problems with the 747-400 airliner due to ineffective T&E planning. Adequate T&E was not accomplished early in the development of the new aircraft. As a result, the company delivered the aircraft late and had to assign 300 engineers to solve problems that were not found earlier in development. For the 777-200 airliner (Figure below), Boeing included aggressive T&E from the very beginning of the aircraft’s design and development. This approach was so much more effective that the 777-200 program reduced design changes, errors, and systems rework by more than 60 percent. In addition, the Federal Aviation Administration certified the initial aircraft for overseas flight on the basis of Boeing’s T&E results. The certification normally requires two years of actual flight service.11

After a flaw in the original Pentium® microprocessor cost Intel about $500 million to replace products for customers, the firm approached the T&E of subsequent microprocessors differently. The quality of these microprocessors, such as the Pentium® Pro and Pentium® III, has significantly improved, yet they were developed in the same amount of time as the original Pentium® microprocessor, despite being many times more complex.12

In sum, the ultimate goal of T&E is to make sure the product works as intended before it is provided to customers. This saves the company time and money in the long run and makes a better product for the customer.

Designing and building the Boeing 777-200 included many innovations new to commercial aircraft. It has the most aerodynamically efficient airfoil ever developed for subsonic commercial aviation. Weight-saving advanced composite materials, including carbon fibers, are embedded in toughened resins in the vertical and horizontal tail. The advanced wing enhances the airplane's ability to climb quickly and cruise at higher altitudes than competing airplanes, while achieving higher cruise speeds. It also allows the airplane to carry full passenger payloads out of many high-elevation, high-temperature airfields. (Courtesy Boeing)

What Types of Test and Evaluation Are There?

There are various types of T&E depending on the system being developed, its use, and the intended customers. T&E activities might be called technical specification testing, acceptance trials, advanced technology demonstrations, verification testing, or product quality testing. Some developers conduct what they call certification testing to "certify" that the product is ready for customer use, and some companies use the term "user testing" for the same purpose. The popular magazine Consumer Reports conducts what it calls "consumer product testing." It does not develop or sell the products; it conducts T&E and reports the evaluation to the consumer via the magazine. Sometimes, more than one type of testing is done on the same product or system. For example, many large commercial airplane builders will conduct in-house testing, followed by the Federal Aviation Administration conducting separate certification testing. At times, models and simulations may be required to test certain capabilities, and testers will have to decide whether those simulations should be live, virtual, or constructive, or a combination of those three. We will discuss live, virtual, and constructive simulations further in the next lesson.

While there are many names for the various types of T&E, companies will generally think in terms of using T&E to ensure that a product is working as intended or maturing in accordance with an expected schedule. Most products will have three basic levels of maturity: components working individually, components working together as a system in a controlled setting, and components working together as a system in a realistic setting. Thus, the focus is on attaining the knowledge necessary to ensure that their products meet a basic set of standards at a given point in time.13

Each type of T&E has its purpose in the development of a new product. In order to understand how live, virtual, and constructive models and simulations are used for T&E, we will discuss two types of T&E: technical T&E and user T&E.

Technical Test and Evaluation

Technical T&E (Figure below) supports the design and developmental process used to build new systems and products and is generally used in the first two levels of product maturity: components work individually and components work together as a system in a controlled setting.

Technical T&E is done to provide information about risk and risk mitigation and to assess the technical performance parameters of the system under test. Technical T&E also provides empirical data to validate models and simulations and information to support periodic technical performance and system maturity evaluations. Robust technical T&E reduces technical risk by discovering problems early during the product's development, when they are less costly to correct. This in turn increases the probability of a successful program. During early technical testing, the product developer will focus on testing technical performance specifications. Follow-on technical testing events should advance to robust, system-level and system-of-systems level testing to ensure that the system has matured to a point where it can meet the requirements of use in a realistic environment.

A Consumer Reports technician conducting "consumer product testing" on a frozen pizza in 1972. While not a real system in the strictest sense, the system under test (SUT) is the pizza. To verify that the SUT meets technical standards, the technician is painstakingly removing meat and other ingredients with tweezers and magnifying glass so they can be weighed. This is a form of technical testing, as it will assess whether the pizza meets the advertised standards. Once the technician begins to conduct a taste test on the pizza, it would be considered a "user test." (Courtesy of Consumer Reports magazine)

User Test and Evaluation

User T&E is a test of any system or key component, conducted under realistic conditions by representatives of typical customers or users, to determine how the system or product performs in real-life use by typical users. A user test will generally focus on the third of the three levels of product maturity: components working together as a system in a realistic setting, also called the product’s environment.

A user type of testing should generally include interfacing systems and sub-systems to provide as real an environment as possible for the system to perform in. User T&E can be conducted early in a product’s development to provide insight into potential real-world problems and progress toward meeting performance and suitability requirements. Real-world problems might be those caused by operating a product in harsh weather or driving a vehicle over rough terrain. Another real-world example is a product being used by a customer who lacks detailed technical knowledge of how a product works. Most companies will delay a product’s release until the developer is confident the product or system is effective and suitable for use by the customer.

User T&E will generally include using only production representative systems, operated by representative users (Figure below). The early planning for user testing should consider any special user requirements, such as the need for large or unique test areas, supporting capabilities and systems, necessary models and simulations (to include simulators), or other unique requirements to put the system under test in as realistic an environment and as close to realistic conditions as feasible. An environment is simply the surroundings in which a system or product must operate, and can include weather, terrain, dust and dirt, heat, cold, air quality, darkness, high humidity, etc. The product should be tested in an environment as near to the actual conditions it will be used in as possible.

This is an early permanent wave machine. Technical testing might find that it met all technical specifications. User testing might find that it was effective and worked just fine, doing a wonderful job of making wavy hair. But, would it be found suitable for use at home? (Courtesy of Consumer Reports magazine)

This link demonstrates one company’s process for what it calls "usability testing," which is part of user testing. http://www.youtube.com/watch?v=l9xYeP0z78k

What Is Risk Reduction?

Risk is a measure of future uncertainties in achieving a system’s or product’s performance goals and objectives within a set of defined cost, schedule, and performance constraints.14

"Late-cycle churn" is a phrase one commercial company uses to describe the scramble to fix significant problems or flaws that are discovered late in a product’s development. The "churn" refers to the additional, unanticipated time, money, and effort that must be invested to overcome the problem. Problems are most devastating when they delay product delivery, increase product cost, or "escape" to the customer. Usually, it is a test that reveals the problem, hopefully early in the product’s design and development when the problem is easier and cheaper to correct.15

Risk reduction is the activity that examines identified risks at various stages of a system’s or product’s development to isolate the cause and allow the developer to take the most appropriate corrective action to mitigate that risk (Figure below). Mitigating and reducing risk has obvious consequences in terms of a system’s performance, development schedule, and cost. Risk reduction can be applied to all aspects of a program (e.g., technology, maturity, supplier capability, design maturation, performance against plan) and should be of great interest to T&E planners. T&E is a vital tool not only in mitigating known risks to a product, but also in identifying unanticipated risks and discovering previously unknown problems. The proof of successful risk reduction in T&E can be found in the degree to which a product experiences "late-cycle churn."

The underside of the Boeing 777-200 with the landing gear and flaps fully extended. To ensure the safe operation of the airplane, all these complicated parts need to work each time they are used, without fail, under all conditions! So the T&E of these systems will have a very low threshold of acceptable risk. (Courtesy of Wikipedia)
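
This lesson does not prescribe how risk should be measured. One common convention, assumed here purely for illustration, is to rate each identified risk on its likelihood and on the severity of its consequence, then multiply the two ratings to rank the risks. The 1-to-5 scales, the example risks, and the function name `risk_score` in the sketch below are all hypothetical.

```python
# Hypothetical risks for an imaginary product; every value is illustrative.
# Likelihood and consequence are each rated from 1 (lowest) to 5 (highest).
risks = [
    {"name": "battery overheats in hot weather", "likelihood": 2, "consequence": 5},
    {"name": "menu confusing to new users", "likelihood": 4, "consequence": 2},
    {"name": "supplier delivers parts late", "likelihood": 3, "consequence": 3},
]


def risk_score(risk):
    """Score a risk as likelihood times consequence (an assumed convention)."""
    return risk["likelihood"] * risk["consequence"]


# Rank the risks so T&E planners can focus test events on the highest scores.
ranked = sorted(risks, key=risk_score, reverse=True)
for risk in ranked:
    print(risk_score(risk), risk["name"])
```

A ranking like this is only a starting point for discussion. A system such as landing gear, where the acceptable level of risk is very low, may deserve heavy testing even when the likelihood rating is small.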

What Is Integrated Testing?

A significant method for minimizing surprises and "late-cycle churn" in developing new products is to integrate technical and user testing for product maturity early in a product’s development. If the risks to "components working together as a system in a realistic setting" are identified early in the developmental cycle, they are much easier and cheaper to correct. Consequently, many companies and product developers place a high value on conducting user T&E early, often embedded into technical testing events, and using the results to save time, money, and assets as well as to make the product better. Examples of integrated testing will be discussed in the next lesson, "Live, Virtual, and Constructive Simulations in Test and Evaluation."

Integrated testing is not an event or separate test phase, nor is it a new type of test. Integrated testing is a process meant to put the product in a "real-world" environment as soon as possible and emphasize the early integration of user testing into technical test planning. The goal of integrated testing is to conduct a seamless T&E program that produces credible qualitative and quantitative data useful to all evaluators, and to address developmental, suitability, and customer use issues. Integrated testing allows for the collaborative planning of test events, where a single test point or mission can provide data to satisfy multiple objectives without compromising the test objectives of participating test organizations.16

There is no single implementation of integrated testing that will be optimum for all products, but planning and conducting the T&E program in a collaborative manner will result in a more effective and efficient test effort. Such a structured approach also ensures that all test activities are necessary, that duplication is eliminated, and that no areas are missing in the overall T&E effort. If done correctly, the enhanced operational realism in integrated testing provides greater opportunity for early identification of system design improvements, and may even change the course of system development during its early stages. Integrated testing can increase the statistical confidence and power of all T&E activities. Most obviously, integrated testing can also reduce the number of T&E resources needed in user T&E.

How Do You Plan for T&E?

Planning for T&E should be within the product’s overall development strategy and allow for a realistic period of time to accomplish the planned test events, evaluations, and report preparation. Planning for T&E should first identify technological capabilities and limitations of alternative concepts and design options under consideration to support cost-performance tradeoffs. For instance, what is the minimum number of test runs needed to collect the right amount of data necessary to conduct an adequate evaluation of a system? Given that light bulbs cost money and that it takes time to test new ones, how long do you need to leave a particular light bulb on to evaluate its reliability? Also, how many light bulbs do you need for an individual test? Those are cost-performance factors a test planner must consider. Test too many light bulbs for too long, and you waste time and money. If you test too few or for too short a time, your test data may be unreliable. There are several T&E planning strategies used by various developers. Most will include a process that begins with identifying what facets or capabilities of the product need to be tested. Then T&E planners will continue by identifying how they will do that: what resources will be needed to test, how the data will be gathered and analyzed, and how the data will be reported.17
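
One way to reason about "how many light bulbs" is the zero-failure (or "success-run") calculation used in reliability engineering. It is not described in this lesson and is offered only as a sketch under stated assumptions: each bulb passes or fails independently, a bulb passes if it survives the full planned test duration, and the test as a whole is passed only if no bulb fails. The function name and the target values are hypothetical.

```python
import math


def zero_failure_sample_size(reliability, confidence):
    """Smallest number of bulbs that must all survive the test duration
    to demonstrate `reliability` at the given `confidence`.

    Assumes independent pass/fail outcomes. If the true reliability were
    only `reliability`, the chance that all n bulbs pass is
    reliability ** n, and that chance must be at most 1 - confidence.
    """
    if not (0 < reliability < 1 and 0 < confidence < 1):
        raise ValueError("reliability and confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(reliability))


for target in (0.90, 0.95):
    n = zero_failure_sample_size(target, confidence=0.90)
    print(f"reliability {target:.2f} at 90% confidence: test {n} bulbs")
```

Under these assumptions, demonstrating 90 percent reliability at 90 percent confidence takes 22 bulbs, while raising the target to 95 percent takes 45. The sketch shows the cost-performance tradeoff in numbers: a modest increase in the reliability claim roughly doubles the number of bulbs that must be bought and tested.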

T&E planning should also:

  • Identify and describe design technical risks to the system.
  • Develop a test that will stress the system under test to at least the expected operating limits of the product (and, for some systems, beyond the normal operating limits) to ensure the robustness of the design. This testing will reduce the risk of poor performance in the expected operational environments.
  • Consider using early test activities, where appropriate, prior to conducting full-up, system-level testing, such as flight-testing, in realistic environments.
  • Include technical and manufacturing risks in the required assessment of technical progress in order to mitigate technical risk.18

Part of the T&E planning process should consider modeling and simulation. Test planners should collaborate early on the planned use of models and simulation to support or supplement their test activities or analyze test results. When actual system testing is not possible to support a T&E event, test planners may use computer modeling and simulations (preferably with real operators involved, or "in the loop"). Should these capabilities be live, virtual, or constructive? We will discuss more about live, virtual, and constructive simulations in the next lesson.

It is important to identify gaps in planning that will prevent the test from achieving the desired results. Test planners must ensure that each T&E event will satisfy the objectives the developers need to assess the product maturity level for the system under test.

Lesson Summary

  • A test is an event that obtains raw data to be used to measure specific or individual performance factors.
  • An evaluation denotes the process of analyzing data obtained during a test. Simply put, an evaluation refers to what is learned from a test.
  • Test and Evaluation (T&E) is a process by which a system or components are exercised and the results analyzed to provide performance-related information.
  • There are many types of T&E, depending on the developer, the product, and the product’s use. The two most recognized types are described here as technical and user T&E.
  • Risk reduction can be applied to all aspects of a program (e.g., technology, maturity, supplier capability, design maturation, performance against plan) and should be of great interest to T&E planners.
  • The ultimate goal of T&E is to make sure the product works as intended before it is provided to customers. This saves the company time and money and makes a better product for the customer.

Review Questions

1. Describe the differences between test and evaluation.

2. What are two attributes of technical T&E?

3. What are two attributes of user T&E?

4. What are some points to consider when planning a T&E on a new product?

5. How can an effective T&E reduce risk in developing a new product?

6. What is the ultimate goal of T&E?

Further Reading/Supplemental Links

http://www.youtube.com/watch?v=VdlaQTS76VU This link provides a tongue-in-cheek demonstration of the lengths some companies will go in user testing of their product.

Points to Consider

For this lesson, points to consider are provided as an aid to the instructor to stimulate critical thinking among the students. These questions have no right or wrong answers, but may help further the student’s understanding of the material. Suggested responses are provided to help the instructor guide the discussion.

T&E is a proven and accepted methodology used to make sure new products work as they are intended to and are ready for use by customers.

  • Based on the picture of the landing gear and flaps of the Boeing 777-200, shown in Figure above, what risk would the developer incur if adequate technical and user T&E were not conducted on this complex system? (Suggested responses: The safety consequences of a failure of the gear or flaps to extend or retract are obvious. However, additional risk incurred by inadequate T&E might include cost overruns in developing, or delays in fielding, the new airplane; expensive corrections to the landing gear after it is fielded; or excessive maintenance required of the user to keep the landing gear working properly. Aggressive testing might show that there was no problem with the landing gear itself but that the light bulb that indicates the gear is down is faulty and unreliable. So, early integrated testing might help avoid an unnecessary re-design of the system.)
  • Based on what was learned in the "Introduction to Modeling and Simulation" chapter, how can M&S techniques be applied to T&E? (Suggested responses: Modeling and Simulation can be used to replicate the operational environment for the product and thus allow for early input of user testing. Testing of actual hardware can be very expensive. M&S can be used to help reduce some costs by replicating all or part of a system under test. M&S techniques could simulate components of a system that has not been completed yet, so T&E can be done before actual components are built.)


1 Defense Acquisition University Glossary 14th edition July 2011 (DODI 5000.02)

2 Wikipedia article, Scientific Method, http://en.wikipedia.org/wiki/Scientific_method

3 Defense Acquisition University Glossary 14th edition July 2011 (DODI 5000.02)

4 Defense Acquisition Guidebook, July 2011, https://acc.dau.mil/CommunityBrowser.aspx?id=315922

5 United States General Accounting Office, Best Practices: A More Constructive Test Approach is Key to Better Weapon System Outcomes, July 2000, http://www.gao.gov/archive/2000/ns00199.pdf

6 Defense Acquisition Guidebook, July 2011, https://acc.dau.mil/CommunityBrowser.aspx?id=315922

7 ACQNOTES.COM (http://imap.acqnotes.com/Attachments/Lesson%2018%20Test%20and%20Evaluation%20Overview.pdf)

8 Defense Acquisition Guidebook, July 2011, https://acc.dau.mil/CommunityBrowser.aspx?id=315922

9 DoD 5000.59-M (DoD Modeling and Simulation Glossary), December 1997

10 Defense Acquisition Guidebook, July 2011, https://acc.dau.mil/CommunityBrowser.aspx?id=315922

11 United States General Accounting Office, Best Practices: A More Constructive Test Approach is Key to Better Weapon System Outcomes, July 2000, http://www.gao.gov/archive/2000/ns00199.pdf

12 United States General Accounting Office, Best Practices: A More Constructive Test Approach is Key to Better Weapon System Outcomes, July 2000, http://www.gao.gov/archive/2000/ns00199.pdf

13 United States General Accounting Office, Best Practices: A More Constructive Test Approach is Key to Better Weapon System Outcomes, July 2000, http://www.gao.gov/archive/2000/ns00199.pdf

14 Defense Acquisition University, Risk Management Guide for Defense Acquisition, 6th ed., Aug 2006, http://www.dau.mil/pubs/gdbks/docs/RMG%206Ed%20Aug06.pdf

15 Francis, Paul L., Testimony Before the Subcommittee on Tactical Air and Land Forces, Committee on Armed Services, House of Representatives United States Government Accountability Office (GAO), DEFENSE ACQUISITIONS: Future Combat Systems Challenges and Prospects for Success, Mar 2005, http://www.gpo.gov/fdsys/pkg/GAOREPORTS-GAO-05-428T/pdf/GAOREPORTS-GAO-05-428T.pdf

16 Defense Acquisition Guidebook, July 2011

17 Defense Acquisition University Glossary 14th edition, July 2011 (DODI 5000.02)

18 Defense Acquisition Guidebook, July 2011, https://acc.dau.mil/CommunityBrowser.aspx?id=315924

Date Created: Aug 06, 2012
Last Modified: Jan 30, 2016