TestingDream

Dream World of Testing

The Product Quality Measures:

Written By: Isha Arora Bhatia - Apr• 10•12

1. Customer satisfaction index- This index is surveyed before product delivery and after product delivery (and on an ongoing, periodic basis, using standard questionnaires). The following are analyzed:

  • Number of system enhancement requests per year
  • Number of maintenance fix requests per year
  • User friendliness: call volume to customer service hotline
  • User friendliness: training time per new user
  • Number of product recalls or fix releases (software vendors)
  • Number of production re-runs (in-house information systems groups)

2. Delivered defect quantities- These are normalized per function point (or per LOC), measured at product delivery (first 3 months or first year of operation) or on an ongoing basis (per year of operation), and broken down by level of severity and by category or cause, e.g.: requirements defect, design defect, code defect, documentation/on-line help defect, defect introduced by fixes, etc.

3. Responsiveness (turnaround time) to users

  • Turnaround time for defect fixes, by level of severity
  • Time for minor vs. major enhancements; actual vs. planned elapsed time

4. Product volatility

  • Ratio of maintenance fixes (to repair the system & bring it into compliance with specifications), vs. enhancement requests (requests by users to enhance or change functionality)

5. Defect ratios

  • Defects found after product delivery per function point.
  • Defects found after product delivery per LOC
  • Ratio of pre-delivery defects to annual post-delivery defects
  • Defects per function point of the system modifications

6. Defect removal efficiency

  • Number of post-release defects (found by clients in field operation), categorized by level of severity
  • Ratio of defects found internally prior to release (via inspections and testing), as a percentage of all defects
  • All defects include defects found internally plus externally (by customers) in the first year after product delivery
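
As a minimal sketch of the ratio described above, the percentage can be computed as follows. The function name and the sample counts are hypothetical and only illustrate the arithmetic.

```python
def defect_removal_efficiency(found_internally: int, found_by_customers: int) -> float:
    """Percentage of all defects that were found internally before release.

    "All defects" means internal defects plus defects reported by customers
    in the first year after delivery, as described in the list above.
    """
    total = found_internally + found_by_customers
    if total == 0:
        raise ValueError("no defects recorded, efficiency is undefined")
    return 100.0 * found_internally / total


# Hypothetical counts: 180 defects found by inspections and testing,
# 20 more reported by customers during the first year.
print(defect_removal_efficiency(180, 20))  # 90.0
```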

7. Complexity of delivered product

  • McCabe’s cyclomatic complexity counts across the system
  • Halstead’s measure
  • Card’s design complexity measures
  • Predicted defects and maintenance costs, based on complexity measures

8. Test coverage

  • Breadth of functional coverage
  • Percentage of paths, branches or conditions that were actually tested
  • Percentage by criticality level: perceived level of risk of paths
  • The ratio of the number of detected faults to the number of predicted faults.

9. Cost of defects

  • Business losses per defect that occurs during operation
  • Business interruption costs; costs of work-arounds
  • Lost sales and lost goodwill
  • Litigation costs resulting from defects
  • Annual maintenance cost (per function point)
  • Annual operating cost (per function point)
  • Measurable damage to your boss’s career

10. Costs of quality activities

  • Costs of reviews, inspections and preventive measures
  • Costs of test planning and preparation
  • Costs of test execution, defect tracking, version and change control
  • Costs of diagnostics, debugging and fixing
  • Costs of tools and tool support
  • Costs of test case library maintenance
  • Costs of testing & QA education associated with the product
  • Costs of monitoring and oversight by the QA organization (if separate from the development and test organizations)

11. Re-work

  • Re-work effort (hours, as a percentage of the original coding hours)
  • Re-worked LOC (source lines of code, as a percentage of the total delivered LOC)
  • Re-worked software components (as a percentage of the total delivered components)

12. Reliability

  • Availability (percentage of time a system is available, versus the time the system is needed to be available)
  • Mean time between failure (MTBF).
  • Mean time to repair (MTTR)
  • Reliability ratio (MTBF / MTTR)
  • Number of product recalls or fix releases
  • Number of production re-runs as a ratio of production runs
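
The reliability figures above can be sketched roughly as below. This assumes MTBF is operating time divided by the number of failures and MTTR is total repair time divided by the number of failures; definitions vary between organizations, and all names and numbers here are hypothetical.

```python
def availability(available_hours: float, required_hours: float) -> float:
    """Percentage of the required time that the system was actually available."""
    return 100.0 * available_hours / required_hours


def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures, assuming operating time / number of failures."""
    return operating_hours / failures


def mttr(repair_hours: float, failures: int) -> float:
    """Mean time to repair, assuming total repair time / number of failures."""
    return repair_hours / failures


def reliability_ratio(mtbf_hours: float, mttr_hours: float) -> float:
    """Reliability ratio as listed above: MTBF / MTTR."""
    return mtbf_hours / mttr_hours


# Hypothetical month: 720 hours required, 3 failures, 6 hours of repair in total.
required, failures, repair = 720.0, 3, 6.0
uptime = required - repair

print(round(availability(uptime, required), 2))                           # 99.17
print(mtbf(uptime, failures))                                             # 238.0
print(mttr(repair, failures))                                             # 2.0
print(reliability_ratio(mtbf(uptime, failures), mttr(repair, failures)))  # 119.0
```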

 

Load Testing Metrics

Written By: Isha Arora Bhatia - Apr• 05•12

There are many measurements that you can use when load testing. The following metrics are key performance indicators for your web application or web site.

Average Response Times
Peak Response Times
Error Rates
Throughput
Requests per Second
Concurrent Users

Average Response Time: When you measure every request and every response to those requests, you will have data for the round trip of what is sent from a browser and how long it takes the target web application to deliver what was needed.

For example, one request will be a web page…let’s say the home page of the web site. The load testing system will simulate the user’s browser in sending a request for the “home.html” resource. On the target’s side, the request is received by the web server, it makes further requests of the application to dynamically build the page, and when the full HTML document is compiled, the web server returns that document along with a response header.

The Average Response Time takes into consideration every round trip request/response cycle up until that point in time of the load test and calculates the mathematical mean of all response times.

The resulting metric is a reflection of the speed of the web application being tested – the BEST indicator of how the target site is performing from the users’ perspective. The Average Response Time includes the delivery of HTML, images, CSS, XML, Javascript files, and any other resource being used. Thus, the average will be significantly affected by any slow components.

Response times can be measured as either:

Time to First Byte
Time to Last Byte
Some people like to know when the first byte of the response is received by the load generator (simulated browser). This shows how long the request took to get there and how long the server took to start replying. However, that is only part of the real equation. It seems to be much more valuable to know the entire cycle of response that encompasses the duration of download for the resource. Meaning, why would I want to know only part of the response time? What is most important is what the user experiences, and that includes the delivery of the full payload from the server. A user wants to see the HTML page – which requires receipt of the full document. So the Time to Last Byte would be preferred as a Key Performance Indicator (KPI) over Time to First Byte.

Peak Response Time: Similar to the previous metric, Peak Response Time measures the round trip of a request/response cycle. However, the peak tells us the LONGEST cycle up to this point in the test.

For example, if we are looking at a graph that is showing 5 minutes into the load test that the Peak Response Time is 12 seconds, then we now know one of our requests took that long. The average may still be sub-second because our other resources had speedy response.
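
To make the difference between the two metrics concrete, here is a small sketch with made-up numbers: 99 fast requests and one slow one. The sample data is hypothetical; a real load testing tool would collect these timings for you.

```python
from statistics import mean

# Hypothetical response times in seconds for every request completed so far:
# 99 quick responses and a single 12 second outlier.
response_times = [0.2] * 99 + [12.0]

average_response_time = mean(response_times)  # mathematical mean of all cycles
peak_response_time = max(response_times)      # the single longest cycle

print(f"average: {average_response_time:.3f} s")  # average: 0.318 s
print(f"peak: {peak_response_time:.1f} s")        # peak: 12.0 s
```

The average stays sub-second even though one request took 12 seconds, which is why both numbers are worth watching.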

The Peak Response Time shows us that at least one of our resources is potentially problematic. It can reflect an anomaly in the application where a specific request was mishandled by the target system. Usually, though, an “expensive” database query is involved in fulfilling a certain request, such as a page, and that makes the request take much longer. This metric is great for exposing those issues.

Typically images and stylesheets are not the slowest (although they can be when a mistake is made like using a BMP file). In a web application, the process of dynamically building the HTML document from application logic and database queries is usually the most time intensive part of the system. It is less common, yet occurs more often with open source apps, to have very slow Javascript files because of their enormous size. Large files can produce slow responses that will show up in Peak Response Time, so be careful when using big images or calling big JS libraries. Many times, you really only need less than 20% of the Javascript inside those libraries. Lazy coders won’t take the trouble to clean out the other 80%, and that will hurt their system performance.

Error Rate: it is to be expected that some errors may occur when processing requests, especially under load. Most of the time you will see errors begin to be reported when the load has reached a point that exceeds the web application’s ability to deliver what is necessary.

The Error Rate is the mathematical calculation that produces a percentage of problem requests to all requests. The percentage reflects how many responses are HTTP status codes indicating an error on the server, as well as any request that never gets a response.

The web server will return an HTTP Status Code in the response header. Normal codes are usually 200 (OK) or something in the 3xx range indicating a redirect on the server. A common error code is 500, which means the web server knows it has a problem with fulfilling that request. That of course doesn’t tell you what caused the problem, but at least you know that the server knows there is a definitive technical defect in the functioning of the system somewhere.

It is much trickier to measure something you never receive, so an error code can be reported by the load testing tool for a condition not indicated by the server. Specifically, the tool must wait for some period of time before it quits “listening” for a response. The tool must determine when it will “give up” on a request and declare a timeout condition. A timeout is not a code received from a web server, so the tool must choose a code such as 408 to represent the timeout error.

Other errors can be hard to describe because they do not occur at the HTTP level. A good example is when the web server refuses a connection at the TCP network layer. There is no way to receive an HTTP Status Code for this, thus the load testing tool must choose some error code to use for reporting this condition back to you in the load testing results. A code of 417 is what LoadStorm reports.
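
A rough sketch of the Error Rate calculation follows. It assumes the tool has already recorded one status code per request, using 408 for timeouts and 417 for refused connections as described above, and it treats every code of 400 or higher as an error. That threshold is an assumption for illustration; your tool may classify codes differently.

```python
TIMEOUT_CODE = 408             # chosen by the tool when no response arrives in time
CONNECTION_REFUSED_CODE = 417  # the code the text says LoadStorm reports


def error_rate(status_codes: list[int]) -> float:
    """Percentage of problem requests out of all requests.

    Assumption: any status code >= 400 counts as a problem request.
    """
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 400)
    return 100.0 * errors / len(status_codes)


# Hypothetical results: 95 OK responses, 2 redirects, one server error,
# one timeout and one refused connection.
codes = [200] * 95 + [302] * 2 + [500, TIMEOUT_CODE, CONNECTION_REFUSED_CODE]
print(error_rate(codes))  # 3.0
```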

Error Rate is a significant metric because it measures “performance failure” in the application. It tells you how many failed requests are occurring at a particular point in time of your load test. The value of this metric is most evident when you can easily see the percentage of problems increase significantly as the higher load produces more errors. In many load tests, this climb in Error Rate will be drastic. This rapid rise in errors tells you where the target system is stressed beyond its ability to deliver adequate performance.

No one can define the tolerance for Error Rate in your web application. Some testers consider less than 1% Error Rate successful if the test is delivering greater than 95% of the maximum expected traffic. However, other testers consider any errors to be a big problem and work to eliminate them. It is not uncommon to have a few errors in web applications – especially when you are dealing with thousands of concurrent users.

Throughput: Throughput is the measurement of bandwidth consumed during the test. It shows how much data is flowing back and forth from your servers.

Throughput is measured in units of Kilobytes Per Second.

Requests per Second: RPS is the measurement of how many requests are being sent to the target server. It includes requests for HTML pages, CSS style sheets, XML documents, JavaScript libraries, images and Flash/multimedia files.

RPS will be affected by how many resources are called from the site’s pages. Some sites can have 50-100 images per page, and as long as these images are small in size (e.g. <25k), then the RPS will be higher than for long text pages with few images that are dynamically generated from database queries. The reason for this is that images and other static resources are served by the web server or a Content Delivery Network, and there is virtually no expensive processing that must take place before that resource is sent to the browser (i.e. LoadStorm).
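
Both numbers are simple rates over the test interval. The sketch below assumes one kilobyte is 1024 bytes and uses hypothetical totals for a 60 second window.

```python
def throughput_kb_per_second(total_bytes: int, duration_seconds: float) -> float:
    """Data transferred per second, in kilobytes (assuming 1 KB = 1024 bytes)."""
    return total_bytes / 1024 / duration_seconds


def requests_per_second(request_count: int, duration_seconds: float) -> float:
    """Number of requests sent to the target server per second."""
    return request_count / duration_seconds


# Hypothetical 60 second window: 6,000 requests moving 153,600,000 bytes.
print(throughput_kb_per_second(153_600_000, 60))  # 2500.0
print(requests_per_second(6_000, 60))             # 100.0
```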

Concurrent Users: Concurrent users is the most common way to express the load being applied during a test. This metric is measuring how many virtual users are active at any particular point in time. It does not equate to RPS because one user can generate a high number of requests, and each vuser will not constantly be generating requests.

A virtual user does what a “real” user does as specified by the scenarios and steps that you have created in the load testing tool. If there are 1,000 vusers, then there are 1,000 scenarios running at that particular time. Many of those 1,000 vusers may be spawning requests at the same time, but there are many vusers that are not because of “think time”. Simply put, think time is the pause between vuser actions that simulates what happens with a real user as he or she reads the page received before clicking again.
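
To see why concurrent users and RPS are different numbers, here is a back-of-the-envelope estimate. It is only an approximation under simplifying assumptions (every vuser repeats the same step, and each step costs its response time plus its think time); the formula and all the numbers are illustrative, not something a load testing tool reports.

```python
def estimated_rps(vusers: int, requests_per_step: int,
                  response_seconds: float, think_seconds: float) -> float:
    """Rough requests per second generated by a number of virtual users.

    Each vuser is assumed to issue `requests_per_step` requests, wait for the
    responses, then pause for the think time before the next step.
    """
    cycle_seconds = response_seconds + think_seconds
    return vusers * requests_per_step / cycle_seconds


# Hypothetical scenario: 1,000 vusers, 10 requests per page,
# 2 seconds to load the page and 18 seconds of think time.
print(estimated_rps(1000, 10, 2.0, 18.0))  # 500.0
```

Halving the think time in this sketch noticeably raises the estimated RPS without changing the number of concurrent users.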

Other Thoughts on Load Testing Metrics
The SOA Testing blog lists the most important load testing metrics in its context as:

* Response time: It’s the most important parameter to reflect the quality of a Web Service. Response time is the total time it takes after the client sends a request till it gets a response. This includes the time the message remains in transit on the network, which can’t be measured exclusively by any load-testing tool. So we’re restricted to testing Web Services deployed on a local machine. The result will be a graph measuring the average response time against the number of virtual users.
* Number of transactions passed/failed: This parameter simply shows the total number of transactions passed or failed.
* Throughput: It’s measured in bytes and represents the amount of data that the virtual users receive from the server at any given second. We can compare this graph to the response-time graph to see how the throughput affects transaction performance.
* Load size: The number of concurrent virtual users trying to access the Web Service at any particular instant in an interval of time.
* CPU utilization: The amount of CPU time used by the Web Service while processing the request.
* Memory utilization: The amount of memory used by the Web Service while processing the request.
* Wait Time (Average Latency): The time it takes from when a request is sent until the first byte is received.

User Acceptance testing

Written By: Isha Arora Bhatia - Mar• 21•12

User Acceptance testing is the formal testing done on the system to ensure that it satisfies the acceptance criteria before the system is put into production. [Most of the time it is done by users/clients.]

It is also the incremental process of approving or rejecting the system during development and maintenance.

Acceptance Testing checks the system against the requirements of the user. It is done by real people using real data and real documents to ensure ease of use and functionality of systems. Users who understand the business functions run the tests as given in the acceptance test plans, including installation and online help. Hard copies of user documentation are also reviewed for usability and accuracy. The testers/users formally document the results of each test, and provide error reports and correction requests to the developers.

User Acceptance testing Myth – Passing the UAT acknowledges that the system is fit for use and also that the process of development was adequate.

Nowadays we use Agile and incremental software development models, so acceptance testing should be an ongoing activity. It needs to be involved in the development process, and appropriate corrections need to be made whenever the system fails the acceptance criteria.

Ongoing Software Acceptance Testing enables:

Early detection of software problems.
Early consideration of user needs during software development.
Involvement of users in defining the system and acceptance criteria.
Decisions based on the results.

Characteristics of Effective Test Metrics

Written By: Isha Arora Bhatia - Feb• 08•12

Ideally, identifying test metrics takes place at the beginning of the project, so incorporation into the appropriate activities is easy. The test metrics you wish to collect need to be:

  • Quantifiable
  • Easy to collect
  • Simple
  • Meaningful
  • Non-threatening

Quantifiable Measurements

To ensure consistent comparison of findings, the method of measurement needs to be standard, concise, and quantifiable. For example, to determine the density of defects, you need to identify what metrics provide this information and a standard of measurement. Here, the test metric to gather is the number of defects and the method of measurement is lines of code (LOC), i.e., “x” number of defects per “y” LOC.

Definitions must be clear and concise. For example, the definition of defect must state what constitutes a defect and the definition of lines of code must state the number of lines of code to be used as the standard of measure, (e.g., 1000). The definitions must also provide any other information necessary to ensure consistency, (e.g., if the lines of code are commented or not commented).
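
As an illustration of why the definitions matter, the sketch below computes defect density per 1000 lines of code and makes the comment-counting rule an explicit choice. The counting rules (skip blank lines, treat lines starting with `#` as comments) and all names are assumptions for this example, not a standard.

```python
def count_loc(source: str, include_comments: bool = False) -> int:
    """Count non-blank lines; comment-only lines are counted only on request."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are never counted in this sketch
        if stripped.startswith("#") and not include_comments:
            continue  # comment-only line, excluded by default
        count += 1
    return count


def defect_density(defects: int, loc: int, per_loc: int = 1000) -> float:
    """Defects per `per_loc` lines of code ("x" defects per "y" LOC)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return defects * per_loc / loc


sample = "# set up\nx = 1\n\ny = 2\n"
print(count_loc(sample))                         # 2
print(count_loc(sample, include_comments=True))  # 3
print(defect_density(12, 8000))                  # 1.5
```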

Easy to collect

The information collection process must not take too much of the collector’s time, or the information will not be collected. The amount of test metrics gathered from any one group needs to be kept at a minimum, collecting only that which is most useful. Whenever possible, automate the data collection process.

Simple Information

The information collected should be simple to gather. If it is hard for the collector to determine what to measure or report, the information is likely to be inaccurate.

Meaningful Purpose

The information gathered must have a specific purpose, (or purposes). For example, the information will be used to determine the number of defects and time used for each testing phase, in order to determine the most cost effective ways to minimize errors.

The information to collect must be understandable and viewed as relevant to the collector, or the information will not be collected. For example, to make the information in the previous example relevant, explain that the findings will highlight the testing methods that work and methods that don’t work, so that employee effort is focused on productive activities.

Non-Threatening Use

Avoid using test metrics for employee evaluation purposes. Collection of information that is perceived as a threat to the employee’s job status is frequently reported inaccurately or incompletely.

Methods for Identifying Test Metrics

Start the process of identifying test metrics by listing the problems to be solved and objectives first. Then determine the items to measure and the standards of measurement to use, to achieve the objectives.

Various methods can be used to complete the test metrics identification process, (e.g., brainstorming, use of a committee composed of representatives from management and the groups that will help with the collection process).

 

Why do we need Metrics?

Written By: Isha Arora Bhatia - Feb• 03•12

“We cannot improve what we cannot measure.”

“We cannot control what we cannot measure”

Test metrics help us to:
* Take decisions for the next phase of activities
* Provide evidence for a claim or prediction
* Understand the type of improvement required
* Take decisions on process or technology change

Test Metrics

Written By: Isha Arora Bhatia - Feb• 02•12

Test metrics help in analyzing the current level of maturity in testing and give a projection on how to go about testing activities by allowing us to set goals and predict future trends.

Metrics- “Metrics are a system of parameters or ways of quantitative and periodic assessment of a process that is to be measured, along with the procedures to carry out such measurement and the procedures for the interpretation of the assessment in the light of previous or comparable assessments. Metrics are usually specialized by the subject area, in which case they are valid only within a certain domain and cannot be directly benchmarked or interpreted outside it.”

Metrics are measurements. It is as simple as that. We use them all the time in our everyday lives. Entangling them in wordy definitions is just intended to make them seem more mysterious and technical than they really are.

So what sorts of things do we measure in our daily lives and how do we use them? Shopping for food is a good place to start. At the meat counter, there is a choice of cuts of different kinds of meat, all at different prices. If we just look at the total price, we may be misled. A nice round steak might cost $10.00 while a round roast might cost $8.00 even though it weighs the same as the steak. So to get the best value for our money we tend to look at the price per unit weight. This is a microcosm of the field of metrics.

There are two basic types of metrics. The first type is the elemental or basic measurement such as weight, length, time, volume, and in this example, cost. The second type is derived, normally from the elemental measurements. At the meat counter, the derived metric is dollars/weight (e.g., $7.49/kg). This is called a normalized metric. Generally speaking, normalized metrics are the most useful because they allow us to make comparisons between things that are different. Some other examples are miles/gallon, dollars/gallon, dollars/share, dollars/hr, and dollars/square foot, to give but a few.

We also see metrics in sports. In hockey it’s shots on goal and plus/minus ratio. In baseball it’s batting average and errors per game. All of these numbers are provided in newspapers and sports magazines, and if they disappeared there would be a great uproar among fans.

“When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of the meager and unsatisfactory kind.” – Lord Kelvin

Now Lord Kelvin wasn’t right about everything he spoke about. He predicted that heavier than air flight was impossible. But about metrics, he was dead right. We as shoppers apply this principle whenever we go to the market. If a cut of meat is marked $10.00 but has no weight assigned, we are likely to look for something else. The same would apply if the weight were given but no price. This is just plain old ordinary common sense. Yet we may go through our professional lives without using metrics to guide us in our work. Maintaining a “meager and unsatisfactory” knowledge about the way you earn your living is probably not the best approach.
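
The step from elemental measurements to a normalized metric can be shown in a few lines. The two cuts, their prices and their weights below are made up purely for illustration.

```python
def price_per_kg(price_dollars: float, weight_kg: float) -> float:
    """Derived (normalized) metric built from two elemental measurements."""
    return price_dollars / weight_kg


# Hypothetical cuts: cut A has the higher sticker price but the lower unit price.
cut_a = price_per_kg(10.00, 1.25)
cut_b = price_per_kg(8.00, 0.5)

print(cut_a)  # 8.0
print(cut_b)  # 16.0
```

Looking only at the total price, cut B seems cheaper; the normalized metric shows it costs twice as much per kilogram.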

Types:-

  1. Process Metrics
  2. Product Metrics
  3. Project Metrics

Process Metrics- Until a year ago, many of the communications and information metrics of Air Force Space Command (AFSC) were taken because they had been collected for years, and people thought those metrics must have a purpose. At that time, many metrics were not being used to make a decision based on fact, but fulfilled a headquarters’ requirement to report on information by a certain date every month. After a fairly extensive study, the AFSC Senior Communicator (SC) changed the format and collection of many of these metrics, while deleting the requirement for many that had little value. Like many discoveries, the process for metrics collection and analysis in this directorate was the result of a change in leadership. Communications metrics at AFSC seemed to provide good information, since senior leaders did not complain about content or format of the 30 metrics collected at the headquarters level. Haphazard metrics collection continued until a number of new senior leaders asked why these metrics were being collected and if they were the right measurements for their organizations. These questions sparked a complete review of the metrics collection, analysis, and reporting process.

Product Metrics- Product metrics describe characteristics of the product such as its size, complexity, features, and performance. Several common product metrics are mean time to failure, defect density, customer problem, and customer satisfaction metrics.

1. Mean time to failure metric is, to put it plainly, the average time the product runs before experiencing a crash. It is important for systems like air traffic control that are required to have no more than a few seconds of down time in a full year.

2. Defect density metric refers to the number of imperfections per:

  • Lines of code
  • Function definitions
  • Lines on input screens

3. Customer problem metric is a measure of problems customers have encountered with the product over the total usage of the product. This metric takes into account that multiple instances of the product can be used at the same time, which effectively multiplies the length of time the product has been in operation by the number of product licenses.

4. Customer satisfaction metric is generally a survey asking customers to rate their satisfaction with the product and/or its features on a five-point scale.
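
A minimal sketch of the customer problem metric, assuming usage is counted in license-months (licenses multiplied by months in operation). The function name and the figures are hypothetical.

```python
def problems_per_license_month(problems: int, licenses: int, months: int) -> float:
    """Customer problems divided by total usage of the product.

    Total usage is the time in operation multiplied by the number of
    licenses, since many copies of the product run at the same time.
    """
    usage = licenses * months
    if usage <= 0:
        raise ValueError("usage must be positive")
    return problems / usage


# Hypothetical year: 150 reported problems across 1,000 licenses over 12 months.
print(problems_per_license_month(150, 1000, 12))  # 0.0125
```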

Project Metrics- Unlike software process metrics that are used for strategic purposes, software project metrics are tactical. That is, project metrics and the indicators derived from them are used by a project manager and a software team to adapt project workflow and technical activities.

The first application of project metrics on most software projects occurs during estimation. Metrics collected from past projects are used as a basis from which effort and time estimates are made for current software work. As a project proceeds, measures of effort and calendar time expended are compared to original estimates. The project manager uses these data to monitor and control progress.

As technical work commences, other project metrics begin to have significance. Production rates represented in terms of models created, review hours, function points, and delivered source lines are measured. In addition, errors uncovered during each software engineering task are tracked. As the software evolves from requirements into design, technical metrics are collected to assess design quality and to provide indicators that will influence the approach taken to code generation and testing.

The intent of project metrics is twofold. First, these metrics are used to minimize the development schedule by making the adjustments necessary to avoid delays and mitigate potential problems and risks. Second, project metrics are used to assess product quality on an ongoing basis and, when necessary, modify the technical approach to improve quality. As quality improves, defects are minimized, and as the defect count goes down, the amount of rework required during the project is also reduced. This leads to a reduction in overall project cost.

 

Development vs. Independent Testing

Written By: Isha Arora Bhatia - Jan• 23•12

Development Testing denotes the aspects of test design and implementation most appropriate for the team of developers to undertake. In most cases test execution initially occurs with the developer testing group who designed and implemented the test, but it is good practice for developers to create their tests in such a way as to make them available to independent testing groups for execution.

Independent Testing denotes the test design and implementation most appropriately performed by someone who is independent from the team of developers. In most cases test execution initially occurs with the independent testing group that designed and implemented the test, but independent testers should create their tests so as to make them available to the developer testing groups for execution.

Development Testing- The developer himself or herself tests the software.
Independent Testing- A group of people who are not involved in the development work tests the software.

Bug Life Cycle

Written By: Isha Arora Bhatia - Jan• 04•12

Bug Cycle: The bug life cycle, or defect life cycle, comprises all the status changes a defect undergoes from the time it is logged until it is closed or cancelled. The bug life cycle indicates the flow in which defects are analyzed, assigned, fixed, verified, and closed or cancelled.

Description of each defect status-

1) New – The defect has been logged and is yet to be assigned to a developer. Usually the Project Manager or Dev Lead decides which defects are assigned to which developer.

2) Assigned – Indicates that the developer who will fix the defect has been identified and has started analyzing and working on the defect fix.

3) Duplicate – Manager or Developer will update the status of a defect as “Duplicate” if this defect was already reported.

4) Rejected / Not Reproducible – This status indicates that the developer does not consider the defect valid for one of the following reasons:

a) Not able to reproduce

b) Not a valid defect; the behavior is as per the requirement

c) Test data used was invalid

d) The requirement the defect refers to has been de-scoped from the current release, and the tester was not aware of this late change.

5) Deferred – The defect fix has been held back because of time or budget constraints, and the project team has approval from the customer to defer the defect until the next or a future release.

6) Fixed – The developer has fixed the defect and has unit tested the fix. The code changes are deployed in the test environment for verifying the defect fix.

7) Reopen – The status is changed to “Reopen” by a tester when the tester finds that the defect is not fixed or only partially fixed. The developer who fixed the defect looks into the comment that the tester provided when reopening the defect. The developer will change the status to “Assigned” and start working on the fix again. If the developer instead wants the tester to re-verify the defect, he/she will add a comment and change the defect status to “Fixed”.

8) Closed – Tester verifies the defects that are in “Fixed” status and once they find the defect is fixed, they change the status to “Closed”. This is the last status of Defect Life Cycle.

9) Cancelled – This status indicates that the tester realized that the defect logged by him or her was invalid and agreed to cancel it.
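
The status flow above can be sketched as a small state machine. The post does not spell out every allowed transition, so the table below is one plausible reading: for example, treating Duplicate, Rejected, Closed and Cancelled as final, and letting a Deferred defect return to Assigned, are assumptions.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    DUPLICATE = "Duplicate"
    REJECTED = "Rejected / Not Reproducible"
    DEFERRED = "Deferred"
    FIXED = "Fixed"
    REOPEN = "Reopen"
    CLOSED = "Closed"
    CANCELLED = "Cancelled"


# Assumed transition table; statuses missing from the keys are treated as final.
ALLOWED = {
    Status.NEW: {Status.ASSIGNED, Status.DUPLICATE, Status.REJECTED,
                 Status.DEFERRED, Status.CANCELLED},
    Status.ASSIGNED: {Status.FIXED, Status.DUPLICATE, Status.REJECTED,
                      Status.DEFERRED},
    Status.DEFERRED: {Status.ASSIGNED},
    Status.FIXED: {Status.CLOSED, Status.REOPEN},
    Status.REOPEN: {Status.ASSIGNED, Status.FIXED},
}


def transition(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the move is not in the table."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


# A defect that is fixed, reopened by the tester, fixed again and closed.
status = Status.NEW
for step in (Status.ASSIGNED, Status.FIXED, Status.REOPEN,
             Status.ASSIGNED, Status.FIXED, Status.CLOSED):
    status = transition(status, step)
print(status.value)  # Closed
```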

Bug Template-
1. Bug ID
2. Test Case ID
3. Bug Description
4. Steps to Reproduce
5. Expected Result
6. Actual Result
7. Status
8. Priority
9. Severity
10. Logged By
11. Environment
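
If you keep bug reports in code rather than in a spreadsheet, the template maps naturally onto a record type. This is only a sketch: the field types and the sample values are assumptions, and a real defect tracker will have its own schema.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """One record per defect, with the same fields as the template above."""
    bug_id: str
    test_case_id: str
    description: str
    steps_to_reproduce: list[str]
    expected_result: str
    actual_result: str
    status: str
    priority: str
    severity: str
    logged_by: str
    environment: str


# Hypothetical example record.
bug = BugReport(
    bug_id="BUG-001",
    test_case_id="TC-042",
    description="Login button does nothing on first click",
    steps_to_reproduce=["Open the login page", "Enter valid credentials",
                        "Click Login once"],
    expected_result="User is taken to the dashboard",
    actual_result="Page stays on the login form",
    status="New",
    priority="High",
    severity="Major",
    logged_by="tester1",
    environment="Test server, Firefox",
)
print(bug.bug_id, bug.status)  # BUG-001 New
```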

Software Integration and Interation Testing Techniques

Written By: Isha Arora Bhatia - Dec• 23•11

After the related programs have been written and unit tested, the programmers interconnect those programs to form the software.

There are 4 approaches to integrating programs:
1. Top-down Approach
2. Bottom up Approach
3. Hybrid Approach
4. System Approach

Top-down Approach- In this approach the programmers interconnect the main program and some of the sub programs. In place of the remaining sub programs that are still under construction, the programmers use temporary programs called Stubs.

Bottom Up Approach- In this approach the programmers interconnect the sub programs without the main program. In place of the main program that is still under construction, the programmers use a temporary program called a Driver.

Hybrid Approach- It is a combination of the top-down and bottom-up approaches. This approach is also known as the sandwich approach. In place of the sub programs and main program that are still under construction, the programmers use temporary programs called Stubs and Drivers.

System Approach- In this approach the programmers integrate the programs only after 100% of the coding is complete. This approach is also called the Big Bang approach.
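
Here is a small sketch of what a stub and a driver look like in practice. The invoice and tax functions are hypothetical names invented for the example; the point is only which piece is real and which is temporary in each approach.

```python
def tax_stub(amount: float) -> float:
    """Stub: temporary stand-in for a tax sub program still under construction."""
    return 0.0  # canned answer, just enough for the main program to run


def invoice_total(amount: float, tax_fn) -> float:
    """Main program: the tax sub program is passed in so it can be swapped."""
    return amount + tax_fn(amount)


def compute_tax(amount: float) -> float:
    """Finished sub program (hypothetical flat 25% rate)."""
    return amount * 0.25


def tax_driver() -> float:
    """Driver: temporary caller that exercises the sub program
    while the real main program is not yet available."""
    return compute_tax(200.0)


# Top-down: real main program, stubbed sub program.
print(invoice_total(100.0, tax_stub))     # 100.0
# Bottom-up: real sub program, called from a driver.
print(tax_driver())                       # 50.0
# Once both are finished, the real pieces are integrated.
print(invoice_total(100.0, compute_tax))  # 125.0
```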

Test Scenario and Template of Test Scenario

Written By: Isha Arora Bhatia - Dec• 19•11

Test Scenario- A document specifying a sequence of actions for the execution of a test. Also known as test script or manual test script.

Template for Test Scenario-
* Test Scenario ID
* Requirement ID
* Test Scenario Description
* Precondition
* Test Step
* Expected Result
* Actual Result
* Status
* Build Num
* Author