Monday, June 04, 2007

Tester's Quiz: What is Equivalence Partitioning Technique?

Most of you have heard this term "Equivalence Partitioning Technique". It is one of the common techniques associated with black box test design techniques (under Domain Testing techniques). Today's Quiz on my blog is related to this...

Explain, in 4-5 sentences or fewer, the "Equivalence Partitioning Test design Technique" and provide an example that illustrates it (avoid examples like the "login" screen).

I would also appreciate lots of "context"-specific questions like "Who is asking?", "Why 4-5 sentences?" etc.

I will consolidate all comments and questions and make another post later....
I am also looking for "real life" experiences of applications of this technique....

Some background:
One motivation for presenting this to my blog readers is my own experience of discussing it in various job interviews (where I was the interviewer) and in lots of informal discussions with people. In my experience, lots of testers (especially in India) seem to have a very narrow view of this technique. Most of them have admitted that "it is a theoretical technique - useful only in interviews... etc".

What would happen next?

I would like to de-mystify this stuff (with your views, questions and opinions) and present to you my version of EQ partitioning technique in a blog post....



Madhukar Jain said...

Equivalence Partitioning Test design Technique----

This is a technique which basically aims at dividing and conquering the data so that we can have more test coverage with less data to be passed.
Between 2 equivalence classes we have boundary values which represent a transition from one state to another. These are critical placeholders, as important bugs might revolve around such places.
In Equivalence Partitioning the context of testing is very important, as otherwise we are sure to miss some important bugs. In fact, equivalence partitioning should lean towards the context-driven side, and only then should values be picked and derived. Good equivalence partitioning is not simply dividing the data set into valid and invalid partition classes and then picking some data from each of them; it's more of a context-driven approach.
For example...
If I have a set of integers which form a valid equivalence class and the program multiplies the input by, say, 100, then it would be wrong to pick a random value from my equivalence partition, as it all depends on the context. If I multiply a number in the lower range by 100, the result might still be a valid 16-bit integer value. However, there might be some data values from the same set which, when multiplied by 100, lead to an integer which crosses the 16-bit value range, and the developer might not have allocated enough space for holding it. Another example might be a set of values where adding the middle 2 values takes the result out of range. Sometimes data values of the same equivalence class might be treated differently inside the system based on further conditions (white box testing context). Hence it is important that the context is judged carefully and that values are derived based on it.
We can cover it better by identifying multiple sets of different values from equivalence classes and running different sets at each test run.
This method should be supplemented by boundary value analysis to get more effective sets of data.
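
The multiply-by-100 example can be made concrete with a small sketch. It assumes the result is stored in a signed 16-bit integer; the function names and the three class labels are hypothetical, chosen only for illustration.

```python
# Assumption: inputs and results are stored as signed 16-bit integers.
INT16_MIN, INT16_MAX = -32768, 32767


def fits_int16(value: int) -> bool:
    """True when the value can be held in a signed 16-bit integer."""
    return INT16_MIN <= value <= INT16_MAX


def classify_input(x: int, factor: int = 100) -> str:
    """Split the 'valid' inputs by what happens after multiplication."""
    if not fits_int16(x):
        return "invalid input"
    if fits_int16(x * factor):
        return "valid input, result fits"
    return "valid input, result overflows"


# 327 and 328 are both valid inputs, yet they land in different classes.
for sample in (5, 327, 328, -328, 40000):
    print(sample, "->", classify_input(sample))
```

Under these assumptions the single "valid" class splits in two around 327/328 (and -327/-328), which is why picking an arbitrary member of the valid class can miss the overflow.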

Shrini Kulkarni said...

Good reply Madhukar ---

Nice attempt I would say ...

One notable part of your answer is
recognizing the fact that having just two EQ classes, valid and invalid, is not going to help the situation. This is the most common mistake I have seen people make while thinking of EQ classes ...
Good to see that you are not making that mistake.

I am not very clear about your example. You start by saying "I have set of integers that form a valid Eq class" - what does this mean? Why don't you state the testing problem to which you are attempting to apply the EQP technique? What is "valid" here?

One more thing to note in EQP is there appears to be nothing called "Random" selection.

You have used the word "context" liberally in your post - what would that mean? Give an example.

>>> We can cover it better by identifying multiple sets of different values from equivalence classes and running different sets at each test run.

The above is totally confusing ...
You are talking of sets (more than one, I guess), and multiple ones at that. Then you are talking about different values from EQ classes.
What is it that you are trying to say?
Consider more than one parameter or variable at a time?

Let me wait for some more responses

Keep checking this blog for updates


Anonymous said...

Equivalence class partitioning is a test design technique, as the label says, used for getting the test data by partitioning the input domain of the AUT... From the finite number of partitions or equivalence classes that result, any member of an equivalence class can be selected as a representative of that class. So we make an assumption here that all the members of an EQ class will behave in the same manner.
Generally we prepare one equivalence class for valid domain and two classes for Invalid Domain.
So, if my specification says that the value of "this" field should be between 0 and 100, then I would create a valid equivalence class that includes all values from 0 to 100, and two EQ classes for invalid values: one consisting of values greater than 100 and the other consisting of values less than 0.

Anonymous said...

Here's (more or less) how I explain eq classes to my students.

ECP is a method of reducing test cases by deciding to treat groups of values the same. For example, if I am testing a function that can accept values between 1 and 1000, I can probably treat most of those values the same (e.g. 2-999).

That, however, is only the start. Proper ECP must analyze *all* potential inputs and classify which input can be grouped into valid and invalid classes, along with all other special or unique values.

I often use the example of a filename in Windows (Bj wrote about this recently). Alphanumeric characters can be safely classified together as one valid eq class. Invalid punctuation characters (e.g. {}\; ) may also be grouped into an invalid eq class. Some characters may even fall into both classes (e.g. a filename can contain a space, but not as the first or last character).

While EQP can reduce test cases, in my opinion, the benefit comes from forcing the tester to carefully analyze all types of input that the application can handle.

btw - why do you have BOTH a captcha AND moderation. Seems overkill?

Anonymous said...

I'll add another comment while I'm waiting (don't worry Shrini - I have nothing against you - I'm snippy when commenting in every single blog that has moderation. I think it goes against the principles of blogging).

Anyway - while checking to see if you read my comment yet, I read the previous comments more closely, and wanted to add a few notes on the comment by Ashish.

I think this is a good example of *almost* getting ecp. There certainly aren't always one valid and two invalid classes for a (seemingly) sequential range of numbers. At a minimum, I'd break down the tests for values between 0 and 100 like this:

1-99: valid range within the boundaries
0 and 100: valid input at the boundary cases
-1 and 101: invalid input at the boundaries
alphanumeric characters: invalid input
punctuation chars: invalid input
(other invalid input omitted - I would have to know more about the application, or perhaps review the source in order to correctly define the rest of the invalid input classes)

I *could* be done, but what if there are special values within that range (e.g. perhaps an input of 50 generates unique results). In that case, I'd have another valid case, and perhaps additional boundaries to test.

To reiterate my last post - ecp forces you to analyze the data your application consumes. If you skip that step, you may as well guess.
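
A minimal sketch of the breakdown listed above, assuming the field arrives as raw text. The function name `classify_field` and the class labels are hypothetical, and application-specific special values (such as the 50 mentioned above) are left out because they depend on the application.

```python
import re


def classify_field(raw: str) -> str:
    """Assign raw text typed into a 0..100 field to one equivalence class."""
    text = raw.strip()
    if re.fullmatch(r"-?[0-9]+", text):
        value = int(text)
        if value < 0:
            return "invalid: below range"
        if value > 100:
            return "invalid: above range"
        if value in (0, 100):
            return "valid: boundary"
        return "valid: interior"
    if text.isalnum():
        return "invalid: alphanumeric"
    return "invalid: punctuation or other"


# One representative per class from the list above.
REPRESENTATIVES = ["50", "0", "100", "-1", "101", "abc", "!?"]
for sample in REPRESENTATIVES:
    print(f"{sample!r:>6} -> {classify_field(sample)}")
```

Writing the classes down as code makes the analysis step explicit: every input has to land in exactly one labelled class, so gaps in the model show up quickly.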

Anonymous said...

The equivalence partitions are usually derived from the specification of the component's behaviour. An input has certain ranges which are valid and other ranges which are invalid. This may be best explained by the following example of a function which takes the "month" of a date as a parameter. The valid range for the month is 1 to 12, standing for January to December. This valid range is called a partition. In this example there are two further partitions of invalid ranges. The first invalid partition would be <= 0 and the second invalid partition would be >= 13.

1) .... -2 -1 0 (invalid partition 1)
2) 1 .... 12 (valid partition)
3) 13 14 15 ....(invalid partition 2)
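
The three month partitions can be written as a small table-driven check. This is only a sketch: `is_valid_month` is a hypothetical stand-in for the function under test, and the representative values are arbitrary picks from each partition.

```python
def is_valid_month(month: int) -> bool:
    """Hypothetical function under test: accepts months 1..12."""
    return 1 <= month <= 12


# (partition name, representative values, expected result)
PARTITIONS = [
    ("invalid partition 1 (<= 0)", [-2, -1, 0], False),
    ("valid partition (1..12)", [1, 6, 12], True),
    ("invalid partition 2 (>= 13)", [13, 14, 15], False),
]


def run_partition_checks():
    """Return (partition, value) pairs where the result was unexpected."""
    failures = []
    for name, representatives, expected in PARTITIONS:
        for month in representatives:
            if is_valid_month(month) != expected:
                failures.append((name, month))
    return failures


print(run_partition_checks())
```

An empty list means every representative behaved as its partition predicts; it says nothing about members that were not tried.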

James Marcus Bach said...

Equivalence class partition, as it is normally practiced, is so simple that to call it a technique seems like overkill.

Every human knows how to do it. You don't need to learn how. We use it in the normal course of perceiving the world.

This is the principle of ECP: "Some distinctions don't make a difference."

This is the method of ECP: "Notice when a set of different tests or test data would teach you the same thing, and elect not to try ALL of that data or those tests."

Doing ECP is trivial. Doing it well can be *very* hard, because the whole point of testing is that we don't know what will happen. To say that two tests or bits of data will teach me the same thing, without actually performing those tests or trying that data, is to say I can know the results of a test without doing it. If that were true, why do ANY testing?

So, ECP involves a lot of modeling and technical reasoning, if you want to do it well. But those things don't actually belong to the ECP "technique". They are generic testing skills.

Cem Kaner has written a marvelous article about different views of ECP. You will find it here.

Shrini Kulkarni said...

Hi Ashish,

>>>Generally we prepare one equivalence class for valid domain and two classes for Invalid Domain.

Why a one-to-two ratio between valid and invalid? How are valid and invalid decided? What would you achieve by making such a classification?

What is the problem that you are trying to solve?

>>>I would create a valid equivalence class, that include all value from 0 and 100

What does it mean to include all the values between 0 and 100 - how many values? How do you include them?

This is precisely the problem that I am trying to highlight - we need to be very clear about the problem that we are trying to solve by applying ECP.

Think about it -- come back with more ..


Shrini Kulkarni said...

Alan -

>> ECP is a method of reducing test cases by deciding to treat groups of values the same

What is the risk of reducing the test cases? How does ECP address this?

>> For example, if I am testing a function that can accept values between 1 and 1000, I can probably treat most of those values the same (e.g. 2-999).

Most of or all the values (2-999)?
Can you be specific? How do you arrive at the conclusion that the values between 2-999 are treated equally by the application? What assumptions are you making with respect to the function here?

>>>classify which input can be grouped into valid and invalid classes,

I always wondered why most people feel that valid and invalid are the best ways to model EQ classes.
Valid values, I think, represent a class of values that the application *should* accept (and process appropriately), and invalid values are the ones that the application should reject.
I do not see any real reason to believe "universally" that the application will treat all rejected values the *same*.

I would not consider the classification of valid and invalid as necessarily *useful* ones though they could be good starting points. What do you think?

Also, special values and unique values are more discrete and as such do not fall in the category of EQ classes. Right?

>>>Alphanumeric characters can be safely classified together as one valid eq class.

Would it be sufficient to use one (or any) alphanumeric character string to represent a file name? Don't you think that the domain or data space of alphanumeric strings is itself huge and that you are oversimplifying? What if there are certain business/program logic rules that might cause certain combinations of characters to behave differently (like "con")? You might call them special values - but what if there are just way too many such special values?

I am also surprised that you did not mention the role of application processing logic.
I believe that is the key. If I am asked to give EQ classes for a date field without knowledge of the application processing logic, I would model the date field only in terms of valid/invalid calendar dates. That would leave out a large scope of EQ class modeling.

>>the benefit comes from forcing the tester to carefully analyze all types of input that the application can handle.

I think you wanted to say "All possible values that can be considered (can possibly be supplied as input/can possibly be output by the app) for A GIVEN VARIABLE or PARAMETER or a FIELD of a feature of the application under test".

I believe one of the main characteristics of ECP modeling is - it considers "ONE" parameter/field/attribute at a TIME.

For any modeling involving more than one parameter at a time, one would additionally use "Combinatorial" techniques. Right? I would like to emphasize:
one parameter - EQP; more than one - use combinatorial techniques.
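
As a rough sketch of that split: equivalence classes are modelled one parameter at a time, and a combinatorial step then mixes the representatives. The parameters (month, day), their classes and the helper name `combine` are assumptions made for illustration; the full cross product is only the simplest combinatorial technique.

```python
from itertools import product

# Step 1 (EQP): one representative per class, one parameter at a time.
CLASSES_BY_PARAMETER = {
    "month": {"below range": 0, "valid": 6, "above range": 13},
    "day": {"below range": 0, "valid": 15, "above range": 32},
}


def combine(classes_by_parameter):
    """Step 2 (combinatorial): cross product of the representatives."""
    names = list(classes_by_parameter)
    value_lists = [list(classes_by_parameter[name].values()) for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


test_inputs = combine(CLASSES_BY_PARAMETER)
print(len(test_inputs))
for test_input in test_inputs:
    print(test_input)
```

With three classes per parameter the cross product already yields nine combinations, which is why techniques such as pairwise selection are usually preferred once there are more parameters.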

More to come later -- Stay tuned.
Thanks for your views...

Waiting for Google EQP exercise on your blog Alan ...


Shrini Kulkarni said...

Nidhi -

>>The equivalence partitions are usually derived from the specification of the component's behaviour

Which component? Can you be a bit more generic? What would you do if there are no specifications? Would you not apply EQP? How would you model a component's behaviour by EQP ...

>>An input has certain ranges which are valid and other ranges which are invalid.

As I said earlier in this post, I am not sure that classifying a complex and very large data domain into valid and invalid classes is going to help. By the usual meaning of EQP you would then use only 2 values (one from valid and one from invalid) to represent the entire data domain - irrespective of application logic and type of data.

>>> The valid range for the month is 1 to 12, standing for January to December.

Depending upon how the date is represented, you can have the date as one variable or as a collection of 3 variables (day, month, year). And then there is the application processing logic - the notion of equivalence depends on it a great deal. Any EQP modeling without mention of any application processing logic would be largely incomplete.

Nidhi - here is an exercise for you -- think of EQP classes for a field that takes 16 digit credit card numbers. What are your EQ classes?

Stay tuned to this ...


Rahul Verma said...

Hi Shrini,

You have taken a good initiative. I will eagerly wait for your final compilation of all thoughts and analysis.

Following is what I think about EQP:

EQP is a technique to analyze and split the inputs to a function into classes of input. Each member of such a class is treated as equivalent in terms of the output the function generates when that member is given as input. This treatment of equivalence is based on a calculated risk (as we are not testing the functionality with ALL inputs from all classes). Similarly, the number and types of such classes and the choice of their representative members need careful analysis and sometimes turn out to be subjective in nature (varying from one tester to another). There are cases where none of the members can represent a class; they all have to be tested separately.

I will take the example of testing a command line interface, which has 5 predefined switches, which can be executed from the DOS prompt as:

> cmd.exe /switchn

To keep the situation simple, I will proceed with the following assumptions:
1. None of the switches has any argument passed to it.
2. Exactly one switch must be used per invocation: one is compulsory, and only one can be used at a time. (This is to avoid discussing tests related to their combinations and tests where more than one switch is passed to test error messages)

For such a situation, at a high level, when we think of equivalence classes – they are Valid and Invalid, as the kind of inputs in each category can be treated as equivalent in terms of the outcome. The Valid and Invalid classes can be further split based on purpose and what comprises the input.

Here, the Valid class includes all the predefined switches. This is a typical situation, because you can not have a representative member of the valid class. You will have to test ALL the switches, one at a time.

The Invalid equivalence classes include:
* Undefined single char switch – Blank, /, /0, /space, /&, /*, /., and other special chars of choice
* Undefined 3 char switch --> Alphabetical switch: /abc, Numeric switch: /123, alphanumeric switch: a12, 1a2 (note the placement of the letter a)
* An arbitrarily long switch -- alphabetical switch /aaaaaaaaa…, numeric switch /1111111….., zeroes switch /0000000……….., spaces / ….., & switch /&&&&&&&&&&&&&&&……...., * switch /*************....., . switch /..........
* Predefined switch with prefixed chars: /0switch1, / switch1 etc.

Above are some of the classes I could think of as of now. The characters which I have used above have some practical significance from my experience.

In the above test, treating a 3-letter length as nominal is based on the context (I found it a good number when I was testing such a product; you can define your own). Similarly, an arbitrarily long switch is used to touch the undefined boundaries which cause a crash. I start by entering a very long string. If it causes a failure, I use 50% of it for the next test; if it doesn't, I double the present length. Done iteratively, this gave me a sense of whether I was getting anywhere. The observations were really interesting. I found some serious crashes in the interface.
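
The halve-or-double probing can be sketched as a search for the shortest switch length that fails. This sketch swaps in a plain doubling phase followed by bisection so that the loop is guaranteed to terminate; it assumes that once some length fails, every longer length fails too. The function name and the `fails` callback are hypothetical; in practice `fails(n)` would run the program with a switch of length n and report whether it crashed.

```python
def find_failure_threshold(fails, start_length=1024, max_length=1_048_576):
    """Return the smallest length for which fails(length) is True, or None.

    Assumption: failure is monotonic, i.e. if a length fails, every longer
    length fails as well.
    """
    low = 0              # longest length known (or assumed) to pass
    high = start_length  # current candidate length
    # Phase 1: keep doubling until a failure is observed.
    while not fails(high):
        low = high
        high *= 2
        if high > max_length:
            return None  # no failure found within the allowed lengths
    # Phase 2: bisect between the last pass and the first failure.
    while high - low > 1:
        mid = (low + high) // 2
        if fails(mid):
            high = mid
        else:
            low = mid
    return high


# Stand-in for the real program: pretend switches of 260+ chars crash it.
print(find_failure_threshold(lambda length: length >= 260))
```

The length found this way is a candidate boundary between two classes ("long but handled" and "too long"), which the specification may never have mentioned.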

Rahul Verma

Anonymous said...

I think this question is an excellent example of how a seemingly simple question can have a complex and involved answer, and of how someone intent on digging into a thing can find all sorts of questions about every single word in someone's answer. Which sounds like a description of testing to me!

ECP is a technique for reducing the input range for a test value to a manageable size. Most parameters have so many possible values that testing each one would take a long time. Many of these values will be treated the same by the application under test, so rather than testing each value in that set we can select one value to represent all the others.

Partitioning all possible values into equivalency classes requires some knowledge of the thing you are testing, in order to determine what values are likely to be equivalent. If you don't have this knowledge you are guessing.

As with any testing aid, ECP is fallible. You may mis-partition some or all of your data. Most of the time, though, it is a useful way to reduce your search set.

Rajesh Kazhankodath said...

My favorite example for ECP is the analysis of a date input field. An example I often use is to find test cases using ECP for an application that finds the difference (in days) between two input dates. In the case of dates, the equivalence classes can be considered as days, months, years and centuries. Any day before today's date and any day after today's date form another set of equivalence classes. The equivalence classes need not be mutually exclusive; it's perfectly fine to have overlapping equivalence classes. For test design, these classes will translate into something like 1/1/2007 and 2/1/2007 as a set of values (1/2/2007 and 2/2/2007 taking months into consideration; 1/2/2006 and 1/2/2007 for years). Again, months with 30 and 31 days form their own equivalence classes. Considering today's date as 19 June 2007, any day less than today and any day greater than today is also an equivalence class.
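
A minimal sketch of how such date classes might turn into test data. `days_between` is a hypothetical stand-in for the application under test, and the class names and the leap-year case are assumptions added for illustration, not part of the list above.

```python
from datetime import date


def days_between(start: date, end: date) -> int:
    """Hypothetical function under test: signed difference in days."""
    return (end - start).days


# class name -> (start date, end date, expected difference in days)
CASES = {
    "same day": (date(2007, 6, 19), date(2007, 6, 19), 0),
    "adjacent days in one month": (date(2007, 1, 1), date(2007, 1, 2), 1),
    "across end of 30-day month": (date(2007, 4, 30), date(2007, 5, 1), 1),
    "across end of 31-day month": (date(2007, 1, 31), date(2007, 2, 1), 1),
    "across year end": (date(2006, 12, 31), date(2007, 1, 1), 1),
    "across February, non-leap year": (date(2007, 2, 28), date(2007, 3, 1), 1),
    "across February, leap year": (date(2008, 2, 28), date(2008, 3, 1), 2),
    "end before start": (date(2007, 6, 19), date(2007, 6, 18), -1),
}


def failing_cases():
    """Return the names of classes whose representative gave a wrong result."""
    return [
        name
        for name, (start, end, expected) in CASES.items()
        if days_between(start, end) != expected
    ]


print(failing_cases())
```

Several of these classes overlap (a pair of dates can cross a month end and a year end at once), which matches the point that equivalence classes need not be mutually exclusive.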

From a developer's point of view, equivalence classes can be considered as distinct sets of values (input as well as output) that an application processes in a "similar" way.
From a tester's (or black box) point of view, an equivalence class is a set of values that makes up a logically distinguishable set of inputs and outputs.

Often the examples quoted for ECP are numbers, but in practical application the classes are mostly non-numeric. For example: upper-case letters, lower-case letters, mouse clicks within and outside a graphical object, zoom in and zoom out of graphical components.

Unknown said...

Equivalence partitioning is a method for deriving test cases. In this method, classes of input conditions called equivalence classes are identified such that each member of the class causes the same kind of processing and output to occur.
In this method, the tester identifies various equivalence classes for partitioning. A class is a set of input conditions that is likely to be handled the same way by the system. If the system were to handle one case in the class erroneously, it would handle all cases erroneously.

Equivalence partitioning drastically cuts down the number of test cases required to test a system reasonably. It is an attempt to get a good 'hit rate', to find the most errors with the smallest number of test cases.

Anonymous said...

Waiting for your views/comments on this Shrini.

Anonymous said...

Questioning others' comments or explanations is easier than anything else. Dear Shrini, give your opinions about the explanations given by these legends.

mansi said...

Equivalence partitioning is a test design technique in which the set of input values is divided into valid and invalid partitions, and representative values are selected from the partitions as test data.
This is done so as to minimize the number of test cases.
For example: an input box accepts numbers from 1 to 1000. It is impractical to write 1000 test cases, so to minimize the number of test cases we use equivalence partitioning.
In the above example there will be one valid partition and two invalid partitions:
valid partition: 1 - 1000
invalid partition: < 1
invalid partition: > 1000

mansi said...

Equivalence partitioning is a test design technique that divides input values into valid and invalid partitions and selects representative values as test data.
It is used to minimize the number of test cases.
For example: an input box accepts numbers from 1 to 1000. It is not practical to write 1000 test cases, one for each number, so EQP is used.
The above example will have one valid partition and two invalid partitions.
valid partition: 1 - 1000
invalid partition: < 1
invalid partition: > 1000