
Blog

June 10

Info buying instead of bug buying; Simple MBT; conferences

Random musings:

Testing is getting paid by the bug?  http://www.utest.com/

“a unique pay-per-performance business model, to provide our customers with a cost effective solution to test any product,”
“pay you based on your approved bugs submitted.”

Attended the Risk Based Testing talk by Paul Gerrard as part of the MS internal Engineering Excellence and Trustworthy Computing Forum (EE&TwC). As the invited speaker Scott Berkun said, this contorted name shows a turf fight between groups that compromised on “&” rather than a useful, catchy phrase or acronym for people to remember. (The EE&TwC theme was design!)

The most interesting thing to me was in a list of Risk Reduction Strategies: Information Buying.
What is “Information Buying”? It’s testing!
What a great way to drive home what testing is really about.
We test to gain information about a system to reduce risk.  You pay for the testing to get that information.

Note that paying only for bugs means you have no idea which parts of your system are good.  Does an absence of bug reports mean a part is good, or just that nobody tried it?

The other nugget from Gerrard was that we should measure progress not by deliverables (was the code checked in?) but by test evidence: what was the pass rate, coverage, etc. of the checked-in code?  If you have test evidence, you can infer the item being tested has been delivered.


I just gave a basic Model Based Testing (MBT) introductory talk to the VSTS (Visual Studio Team System) Test SIG (Special Interest Group).  I am always amazed at how few people have heard of MBT before.  I was asked how big a model you need to discover bugs.  The questioner had seen his own simple example of just 2 states.  I described a 2-state model that I saw find bugs.  It was used by a tester on Indigo to whom we had just taught MBT.  They were testing the .NET asynchronous calling pattern for several APIs.  Their simple model was:

[Figure: TwoStateAsyncInvokeModel]

This has just 2 states (invoked or not) and 2 actions (Begin or End).   Each action can conceptually occur from each state (4 transitions).   The ones in red should result in an error without changing state.    The tester then ran this simple model over about 50 APIs and found about 1/3 of them failed doing NotInvoked – EndInvoke ->

You can also see my article on Testing For Exceptions for why a sequence of more than one transition might be useful.
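The two-state model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post confirms that End from NotInvoked must be an error, while treating Begin from Invoked as the other error transition is my assumption about the red arrows in the figure. `GoodApi`, `BuggyApi`, and `AsyncError` are hypothetical stand-ins for the real APIs, not anything from .NET.

```python
from itertools import product

NOT_INVOKED, INVOKED = "NotInvoked", "Invoked"
ACTIONS = ("Begin", "End")

def model_step(state, action):
    """Return (next_state, expect_error) according to the two-state model."""
    if state == NOT_INVOKED and action == "Begin":
        return INVOKED, False
    if state == INVOKED and action == "End":
        return NOT_INVOKED, False
    # Assumed error transitions: the state must not change.
    return state, True

class AsyncError(Exception):
    """Hypothetical error raised by the system under test on an illegal call."""

class GoodApi:
    def __init__(self):
        self.pending = False
    def begin(self):
        if self.pending:
            raise AsyncError("already invoked")
        self.pending = True
    def end(self):
        if not self.pending:
            raise AsyncError("nothing to end")
        self.pending = False

class BuggyApi(GoodApi):
    def end(self):
        # Seeded bug: silently accepts End when nothing was begun.
        self.pending = False

def run_sequence(api_factory, actions):
    """Drive a fresh API object along `actions`; return (state, action) mismatches."""
    api, state, failures = api_factory(), NOT_INVOKED, []
    for action in actions:
        next_state, expect_error = model_step(state, action)
        try:
            (api.begin if action == "Begin" else api.end)()
            got_error = False
        except AsyncError:
            got_error = True
        if got_error != expect_error:
            failures.append((state, action))
        state = next_state
    return failures

def check(api_factory, length=3):
    """Run every action sequence up to `length` and collect all mismatches."""
    found = set()
    for n in range(1, length + 1):
        for actions in product(ACTIONS, repeat=n):
            found.update(run_sequence(api_factory, actions))
    return found

print(check(GoodApi))   # no mismatches expected
print(check(BuggyApi))  # reports the NotInvoked / End transition
```

Running the same `check` over a list of API factories mirrors the idea of reusing one small model across many APIs.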


I’ll be moderating a panel about Collaborative Quality at PNSQC and presenting Thursday, 9:45 “Patterns and Practices for Model-Based Testing” at StarWest.   Maybe I’ll see you at ISSTA in Seattle?

More conferences past and present about the MBT work of my Protocols Team.

Wolfgang Grieskamp : ETAPS MBT 2008 - Using Model-Based Testing for Quality Assurance of Protocol Documentation
Wolfgang Grieskamp : ICST 2008 Model-Based Quality Assurance of Windows Protocol Documentation
Nico Kicillof : QSIC 2008 - Model-Based Quality Assurance of the SMB2 Protocol Documentation

May 11

Surprising Fit for Software Testing?

From http://hbswk.hbs.edu/item/5869.html  (Harvard Business school - Published: April 14, 2008, Author: Martha Lagace)

Software analysts and programmers live to innovate—but hate to run tests. Yet top-notch testing saves many a company money when bugs are caught early.

• The majority of a Danish consultancy's testers have Asperger syndrome or a form of autism spectrum disorder.
• Software testing requires superb powers of concentration combined with tolerance (even preference) for routine tasks.

I hear complaints all the time from testers who have to run routine tests over and over again.  If that is the task a company gives testers then the testers need to change something.  

 

If the tests are repetitive and needed -- automate them.  Humans shouldn't repeat what computers can do.

If the tests are repetitive and not needed -- convince management that they are not needed and stop running them!

If management can't be convinced to avoid repetitive, unnecessary, and especially difficult to automate tests -- suggest they use the Danish consultancy in the article.

 

Testers should be creative - either in their automation or in their exploration of the system under test.  Repeatedly run tests typically have low bug finding ability.  Management likes the idea of avoiding regressions by repeatedly running the same tests.  Regression tests should be automated.  

 

Many products get released with few regressions (from running the same tests), but still with high bug rates, because the teams spend too much time avoiding regressions and not enough time finding bugs.

 

 

 

March 06

Microsoft Geek Secrets Finally Revealed

Click here for the best of 30,000 pages of never-before-seen insight into the Redmond lifestyle.

February 28

ICST 2008 - Model-based quality assurance of windows protocol documentation

At  ICST 2008 : First International Conference On Software Testing, Verification And Validation
Lillehammer, Norway April 9-11, 2008

My colleague, Wolfgang will be presenting our paper:

259: Model-based quality assurance of windows protocol documentation
Wolfgang Grieskamp, Microsoft Research; Dave MacDonald, Nico Kicillof, Alok Nandan, Keith Stobie, and Fred Wurden, Microsoft, USA

ICST includes four sessions on Model-based testing each with 4 or more talks among many other things.  Book now!

Preceding the conference is the 4th Workshop on Advances in Model Based Testing (A-MOST 2008) which includes Robert Binder of MVerify on the Programme Committee.  I've been working with Bob (whom I first met while we were on the board of Quality Week) recently on projects related to our protocol testing.

Testing Service Compatibility - updating services in a set of connected services.

Last summer in my Conference of the Association for Software Testing (CAST) keynote "Testing Web Services" and again in my Pacific Northwest Software Quality Conference (PNSQC) presentation (70-Stobie-esting Web Services.doc inside the proceedings.zip) you will see a reference:

[6]     Keith Stobie, Service Compatibility
<to appear on msdn Tester Center site>
Well it took longer than I thought, but it is finally there in the TesterCenter Library:

Service Compatibility    This article describes what a tester should consider when updating services in a set of connected services.

The new paper just adds a little more detail and reference to that section of the conference papers about rolling out new services.

February 21

Open Protocol Specifications

News Flash -- my semi-secret project has just been outed!    You can read the press announcement etc.
I work at verifying the usability and accuracy of protocol documents which are now all publicly available on MSDN.
Some of the things my team facilitates:

It's an exciting time.  Interoperability and protocols are really catching fire and my team's Model-Based Testing (MBT) approach based on Spec Explorer is going well also.  Finally, the long awaited "Model-Based Software Testing and Analysis with C#" is out in stores!   Pick up your copy today (mine is autographed by the authors).  You can get a feel for my brand of MBT by reading the book and using the free, open source NModel toolkit.   Spec Explorer is essentially just a more advanced version of NModel.

We've built numerous model-based test suites for verifying the published protocol documents.  

By the way, if all this sounds exciting to you, my team is looking to hire my peer.  That is, we have an opening for a Software Test Architect.
http://members.microsoft.com/careers/search/details.aspx?JobID=E925F2F5-2AD5-46BA-886F-E391BBC662BC
If you're curious about Test Architects at Microsoft you can read about another great peer of mine, David.  We worked on refining each other's notion of Testability.

Finally, check out the ever popular "Book Bytes" on the Tester Center site.

PNSQC and Call For Papers

Sorry, my blog entries have gotten behind.   Last November I was re-elected to the PNSQC board (by the board, since they forgot to ask me to be on the ballot) and then they elected me Vice President!   That got me heading a Sarbanes-Oxley derived committee on document retention.
The PNSQC board, chairs, and volunteers did have an interesting and productive strategic offsite at the Heathman Lodge in late January.  You should see an updated mission statement soon.
 
Finally, in case you missed it, the Call for Papers for this October's Pacific Northwest Software Quality Conference (PNSQC) is open.  The theme is Collaborative Quality.

November 09

PNSQC Web Services & Microsoft TesterCenter & Protocol Model Based Testing

The PNSQC 2007 conference has the 2007 proceedings [10.5 MB] posted now.
You can also see my slide deck about Model Based Testing For Protocols (PDF), using Spec Explorer, that I presented.  The talk got very mixed reviews.   A few astute people noted that my Web Services testing paper (the talk got good reviews) had the following reference:

[6] Keith Stobie, Service Compatibility

<to appear on msdn Tester Center site>

 

What is Tester Center?

The Microsoft Tester Center showcases the test discipline as an integral part of the application lifecycle, describes test roles and responsibilities, and promotes the test investments required to deliver high quality software.  With the new Microsoft Tester Center, Microsoft can not only share its own experiences and lessons learned, but also provide the testing community a new way to engage with one another.

This is in addition to the MSDN forums set up for discussing software testing issues.

 

I recently updated the Service Compatibility paper in response to a reviewer's (AlanPa) comments and it should be posted soon under Technical Articles or in the Library.

 

After PNSQC and QSIC, I went to Hyderabad, India for 2 weeks.  One of the days there I visited the Visual Studio Test Team working on VSTS2008 and beyond.  I gave them a demo of Spec Explorer for Visual Studio and they liked it.  We discussed making it a power tool or distributing it some other way.  The wheels are in motion to get the licensing worked out and the tool publicly released.

October 21

PNSQC & QSIC 07 notes

I was in Beijing for 3 weeks and negligent in updating my blog.
 
I presented at PNSQC.  My testing web services talk was well received.  The paper should appear soon in the proceedings.  I also stepped in to give a talk in place of a speaker who withdrew.  The talk on Model Based Testing of Protocol Specifications was less well received.  The slides should be on the PNSQC site shortly.

I attended the SPIN meeting with an excellent talk by Niels Malotaux about estimation with a simple, but great exercise. 
At the lunch panel, Dick Hamlet advised us to be on the lookout for two phrases:  "Adaptive Random Testing" (ART) and "Partial Oracles".  I pursued them a bit at QSIC07 and don't think they are ready for prime time.  ART has mostly been researched only for numerical domains and the major question (which also perplexed the recent WHET 4 workshop on boundary testing) is how to measure distance or closeness of inputs in a non-numerical domain.
Partial Oracles are related to metamorphic testing, which is perhaps more problematic than I realized.  Since there are so many possible partial oracles, how do you choose which ones to use?
 
I found QSIC07 far more useful to me than I expected.  I particularly liked Gordon Fraser's talk (Improving Model-Checkers for Software Testing) on generating tests from models.  He introduced several concepts I would like to follow up on.  Mainly he took ideas I've seen applied to implementations and applied them to models.  One example, as covered in Jean Hartman's talk at PNSQC, is prioritizing tests.  Code coverage is a traditional way to do this for implementations.  We can do the same for tests of models:  which tests cover the most states, transitions, etc.?
Even more exciting was the thought of determining the most powerful tests.  Kaner defines test B as more powerful than test A if test B can detect all the defects that test A detects and other defects besides.  Using the old implementation concept of mutation testing, we can mutate our models and see which tests detect the mutations.   Mutation of implementations suffers from the possibility that a mutant is just another legal implementation of the same behavior expressed differently.  It is an undecidable problem in general to determine whether two programs have the same behavior.  However, with model mutation, Gordon claims it is decidable, and thus an even more powerful technique.
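A minimal sketch of model mutation, assuming a tiny finite model written as a transition table. The table, the two mutation operators (flip the error flag, redirect the target state), and the example tests are all illustrative choices of mine, not Gordon Fraser's method or tooling. A test "kills" a mutant when the mutant's observable trace differs from the original model's trace, so comparing killed sets gives Kaner's "more powerful" relation with respect to these mutants only.

```python
# Transition table: (state, action) -> (next_state, error_expected)
ORIGINAL = {
    ("NotInvoked", "Begin"): ("Invoked", False),
    ("NotInvoked", "End"): ("NotInvoked", True),
    ("Invoked", "Begin"): ("Invoked", True),
    ("Invoked", "End"): ("NotInvoked", False),
}

def trace(model, actions, start="NotInvoked"):
    """Observable behaviour of a test: the error flag seen at each step."""
    state, observed = start, []
    for action in actions:
        state, error = model[(state, action)]
        observed.append(error)
    return observed

def mutants(model):
    """Yield copies of the model with exactly one transition altered."""
    states = sorted({state for state, _ in model})
    for key, (target, error) in model.items():
        yield {**model, key: (target, not error)}      # flip the error flag
        for other in states:
            if other != target:
                yield {**model, key: (other, error)}   # redirect the target

def killed(model, test):
    """Indices of the mutants whose trace differs from the original's."""
    expected = trace(model, test)
    return {i for i, mutant in enumerate(mutants(model))
            if trace(mutant, test) != expected}

test_a = ["Begin"]
test_b = ["Begin", "End", "End"]
print(sorted(killed(ORIGINAL, test_a)))
print(sorted(killed(ORIGINAL, test_b)))
# A strict-subset result means test_b is more powerful for this mutant set.
print(killed(ORIGINAL, test_a) < killed(ORIGINAL, test_b))
```

Mutants that no test kills point either at missing tests or at mutations that are not observable through the error flag alone, which is a limitation of this simplified trace.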
 
August 10

Pacific Northwest Software Quality Conference

The Pacific Northwest Software Quality Conference (PNSQC) is one of the most established conferences on software quality.  It is celebrating its 25th conference this year in early October.   Because it covers both software testing and software process as two major aspects of software quality, I have always found PNSQC the best value for the money.   A very experienced and dedicated group of volunteers with a passion for software quality put this two-day conference on Oct. 9-10.   The topics go beyond the basics, so both new and experienced software quality professionals will find value.   I just registered.

 As a non-profit, it is less expensive than many other conferences and Portland, Oregon is a nice, but less expensive venue.  I know and recommend both of the keynote speakers:

·       Johanna Rothman

·       Herbert (Hugh) Thompson

PNSQC also has a selection of workshops on Oct. 8, the day before the conference.  

By coincidence, you can consider staying in Portland the whole week and attending the Quality Software International Conference (QSIC) on Oct. 11-12.  QSIC originated as the Asia-Pacific Conference on Quality Software and has been an international conference since 2003.  Co-located with QSIC is the First International Workshop on Software Test Evaluation (STEV 2007).

July 18

Boundary Value testing at WHET4

I just attended the fourth Workshop on Heuristic and Exploratory Testing (WHET4) the weekend of July 7-8, just before CAST.  It focused on the question:  What is Boundary Testing?
It was quite intriguing, as it quickly became apparent that boundary testing is not as well understood as we might think.  It has not been covered in much depth because it is considered so simple.  Most people tie the concept of boundary testing to equivalence classes.  Equivalence classes are equally simple on the surface, but have subtle complexity.  When you talk about the boundary of an equivalence class, things become even murkier.
 
A History of some things I already understood:
The simple, most classical example is easy:  We have a legal range of integer values, e.g. 0-100.   Here the boundaries are pre-defined.  We have at least 2 equivalence classes,  legal values (0-100) and other integer values that are not legal.  Most people sub-divide illegal values into 2 classes, the lower illegal values (in this case all negative integers) and the upper illegal values (in this case integers > 100).
So far the equivalence classes are based on legal values or their relationship (lower or higher) to the legal values.  Studies by Hamlet and others have shown that off-by-1 errors are unfortunately not uncommon.  Off-by-1 errors frequently occur due to using < when <= was meant, or vice versa, and similar mistakes.   So some people, based on the likelihood of errors, treat the boundary values where an off-by-1 error could occur as their own special equivalence classes.  The 100 limit could be encoded as <=100, but might be mistakenly coded as <100.  Then 100 would be treated as an illegal value: it would have been mis-classified, which is an error.  Similarly, the 100 limit could be coded as <101, but mistakenly coded as <=101.  Then 101 could mistakenly be treated as a legal value.  This is why, for this example, the limit and 1 greater than the limit are so frequently quoted as the boundary values.  Similarly for 0 and -1 as the lower legal value and 1 less than the lower value.  (There are other reasons 0 and negative numbers could be interesting, making them even better candidates for finding a bug.)
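The off-by-1 argument above can be made concrete with a small sketch. The function names are illustrative, and `is_legal_off_by_one` is a deliberately seeded bug (it uses < where <= was meant at the upper limit), not code from any real product.

```python
def is_legal(value, low=0, high=100):
    """Correct check for the legal range 0-100 used in the example."""
    return low <= value <= high

def is_legal_off_by_one(value, low=0, high=100):
    """Seeded bug: uses < where <= was meant at the upper limit."""
    return low <= value < high

def boundary_values(low, high):
    """The classic four boundary values: each limit and its illegal neighbour."""
    return [low - 1, low, high, high + 1]

def misclassified(check, candidates, low=0, high=100):
    """Candidate inputs where `check` disagrees with the specification."""
    return [v for v in candidates if check(v) != (low <= v <= high)]

print(boundary_values(0, 100))
print(misclassified(is_legal, boundary_values(0, 100)))
print(misclassified(is_legal_off_by_one, boundary_values(0, 100)))
# Mid-range and far-out values do not expose the seeded bug.
print(misclassified(is_legal_off_by_one, [-50, 50, 500]))
```

Only the value sitting exactly on the limit separates the correct check from the buggy one, which is the reason boundary values earn their own test cases.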
 
A linearizable variable's values can be mapped to a number line. From the above, everyone knows how to apply classical boundary testing to linearizable input variables where the analysis is so simple. 
 
Now how to go beyond linearizable input variables? 
That was a major topic of this WHET.
A simple extension:  go beyond just input variables.  Obviously output variables, but also other influencers such as time and outcomes such as internal state change.  Also, of course, consider more than one variable at a time in the analysis.
But what about linearizable?  Some characteristics of a number line:
1) it is ordered (has precedence)
2) it is continuous (discretely, as integers, sometimes, but without missing integers)
 
Many WHET attendees and I didn't feel these were essential characteristics.  A major disagreement arose about the partitioning into equivalence classes.  I'm in the camp that partitions the domain into disjoint equivalence classes, but equally many felt you could have boundaries between overlapping sets.
Another area, of no surprise to me, was the stress (or performance) notion of boundary testing.  Scott Barber, from the performance perspective, discussed testing for the knee of a curve.   Here we know there are boundaries, but we don't know what they are until we test. This is different from hunting for unknown boundaries.  Another characteristic, not unique to stress testing, is what I will call physical boundaries.  The example I focus on is 100% CPU utilization.  For a single CPU (clocked at the same speed, and with many other qualifications), you can't go above 100%.  It isn't that 101% is an illegal value; it is an impossible value.
 
The most interesting new concept I learned was Cem Kaner's way of expressing boundaries as mis-classification or chance of confusion. 
I have often tried to instruct my students in this.  Below is one of my classic non-linear boundary examples, which also relies on a theory of error.
A program command sets some feature ON or OFF and is specified as
 SET ON | OFF
What tests would you try?
All students give
 SET ON
 SET OFF
Now what would you give for an illegal value?  A typical new student answer is:
 SET X
to which I push back:  Yes this would be an invalid value, but how likely is it to find any other bug, or as Kaner says, can you have a more powerful testcase?  (Case A is more powerful than Case B, if Case A finds all the defects of Case B and more).
Better are cases like:
 1. SET
 2. SET ONN
 3. SET OF
Case 1 is used by those choosing to see if there is an undocumented default.
Cases 2 and 3 I consider boundary cases, that is they could be misclassified.
 Case 3 theory of error is that the programmer may have compared only the first 2 characters, as that is sufficient to distinguish the two legal values (or, even more lazily, compared only the second character).
 Case 2 theory of error is similar to 3, except it also incorporates the idea of only comparing up to the length of the expected value.
The beauty of cases 2 and 3 is that, besides being boundary cases, they are very conceivable typographical errors (making failures more justifiable).  In fact, for text strings, typographical errors are typically boundary errors.  If you consider spell correcting programs, they have algorithms for determining the "nearest" correct spelling.  I consider mis-spellings that a spell correcting program would suggest or change to the correct spelling as being boundaries.
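The theories of error for cases 2 and 3 can be illustrated with two deliberately buggy parsers. This is a sketch only: `parse_two_chars` and `parse_prefix` are hypothetical implementations invented to match the two theories, and `parse_strict` is one possible reading of the SET ON | OFF specification.

```python
def parse_strict(command):
    """Reference parser: accepts exactly 'SET ON' or 'SET OFF'."""
    parts = command.split()
    if len(parts) == 2 and parts[0] == "SET" and parts[1] in ("ON", "OFF"):
        return parts[1]
    raise ValueError(command)

def parse_two_chars(command):
    """Seeded bug (case 3 theory): compares only the first two characters."""
    parts = command.split()
    if len(parts) == 2 and parts[0] == "SET":
        if parts[1][:2] == "ON":
            return "ON"
        if parts[1][:2] == "OF":
            return "OFF"
    raise ValueError(command)

def parse_prefix(command):
    """Seeded bug (case 2 theory): compares only up to the expected length."""
    parts = command.split()
    if len(parts) == 2 and parts[0] == "SET":
        for legal in ("ON", "OFF"):
            if parts[1][:len(legal)] == legal:
                return legal
    raise ValueError(command)

def wrongly_accepted(parser, commands):
    """Return the illegal commands that the parser fails to reject."""
    accepted = []
    for command in commands:
        try:
            parser(command)
            accepted.append(command)
        except ValueError:
            pass
    return accepted

ILLEGAL = ["SET X", "SET", "SET ONN", "SET OF"]
for parser in (parse_strict, parse_two_chars, parse_prefix):
    print(parser.__name__, wrongly_accepted(parser, ILLEGAL))
```

With these seeded bugs, SET X exposes neither faulty parser, while SET ONN and SET OF each expose at least one, which is the sense in which they are the more powerful test cases.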
 

WHET4 attendees:
Rob Sabourin, Karen Johnson, David Gilbert, Michael Bolton, Cem Kaner, Ross Collard, Doug Hoffman, Keith Stobie, Mike Kelley, Tim Coulter, Henrik Andersson, Scott Barber, Dawn Haynes, James Bach, Jon Bach, Paul Holland
 
May 17

Microsoft Testing forum

There is a new forum on software testing at

 MSDN Forums » Software Engineering Discussion » Software Testing Discussion

You can gain some insight into some of the posters by reading the thread Introduce Yourself

I think there will be more public information about Microsoft testing soon.

April 17

Model Based Testing of Interoperable Protocols

I’ve joined a new team within Microsoft called Protocol Tools and Test Team (or PT3 for short).

This group is a derivative of the netmon product group and is helping with tools and testing for protocols.

In particular, interoperable protocols as required by the Technical Committee [US anti-trust case settlement] and  European Commission anti-trust case.

The protocols are documented via “Technical Documents”.

 

I joined the team because of my belief that Model Based Testing (MBT) is a great answer for many problems, especially protocol testing.

Most Protocols are naturally described by state machines and frequently are a very natural fit for MBT.

Building on Microsoft Research’s work with ASML (and ASML/t) and Spec#/Spec Explorer, a new even easier to use version of Spec Explorer is being created (in a product group, not MSR!).  Prior developers of Microsoft MBT tools Wolfgang Grieskamp and Darren Fisher are also part of the team.  I’ve talked about ASML outside of Microsoft to various groups, e.g.

"Advanced Modeling, Model Based Test Generation and Abstract State Machine Language"  
Download Powerpoint presentation (1.01 MB)

 ASML was used in the Indigo team, e.g. http://blogs.msdn.com/kavitak/archive/2004/01/25/62761.aspx

Another good intro to MBT via SpecExplorer is http://research.microsoft.com/projects/T5/ppt/Grieskamp.ppt

February 20

Agile TDD -- unit or acceptance tests?

I went to a panel discussion on Implementing Agile Methods last night at Seattle SPIN that included as panelists:

·         Rod Claar, a Certified ScrumMaster Practitioner (CSM-P)

·         Steve Tockey, the Principal Consultant at Construx Software

·         Shane Currier, a software professional who has led two development teams through the transition to agile software development

 

Steve brought up a house construction analogy that, in one form or another, all the panel’s members picked up on and discussed.  Steve, in building his house, had only a 2.5% cost overrun with 2 change orders.   The industry typically sees 30-50% cost overruns (or, in the case of Bill Gates's house, a $40M overrun on a $14M budget).   Steve contended throughout that for known, steady requirements you can often save money by correctly surfacing them up front, rather than discovering them over time and doing rework.  Steve agrees that for research or exploratory projects with unknown, unsteady requirements, agile is probably better.

 

I wonder if Steve's conviction about requirements is because he excels at them (his custom house, for example), while others struggle (the typical overruns).  It is like Michael Jordan saying you just need to jump and shoot better to play basketball like him.

 

 

I found Shane's input fascinating from the real world experience he had with applying agile to legacy systems.  Rod pointed me to Michael Feathers's book and article Working Effectively with Legacy Code and a great quote:

    "The main thing that distinguishes legacy code from non-legacy code is tests, or rather a lack of tests."

It is legacy because you can't be agile with it.  You don't know if things break when you change it.  The tests are not unit tests, but rather "cover some small area of a system just well enough to provide some ''invariant''".    Further "In a design driven from the beginning using tests, the tests ... seed the design, they record the intentions of the designers"

 

Another audience member, in discussions after the panel, made a distinction between two different definitions of TDD that he uses (searching the web, there appears to be no agreement on their meaning or distinction):

    Test Driven Design :  designing based on acceptance tests

    Test Driven Development : coding based on unit tests.

Interestingly, the audience member claims that in developing 2M LOC they didn't need unit tests, but only acceptance tests, and thus had virtually no test maintenance due to code refactoring (the design continued to meet the acceptance tests).

 

I'm not going to settle the debate here either.  But already we see 3 types of tests:  Unit tests, Invariant tests, and Acceptance tests.  If I knew exactly which ones you needed when, I'd be rich!
 

January 25

Click 4 the cause -- helping by searching

Please try to spread awareness of this program and blog about it. You can actually get a good search result and help out an underprivileged kid too.

Searching for a way to help? You’ve found it.

Support ninemillion.org by using Live Search. Each time you search here until March 31, 2007, Microsoft will make a contribution to ninemillion.org, a UN Refugee Agency-led campaign providing educational resources for the nine million refugee youth around the world.

http://click4thecause.live.com/

 

Make it your home page, change your search bar default, etc.

 

January 24

CAST & LTAC & which tests to automate

It has been a while since I posted anything, so here are a couple of quickies.

 I'll be giving a Keynote at CAST 2007 on Testing Software Services.  Originally I thought of doing a double entendre about Testing of Web services and Services for Testing software like SDT's Unified TestPro® but I found there is so much more to say about testing of web services, that I will focus only on that.

 Google sponsored the London Test Automation Conference (LTAC) in Sept. 2006, where they videoed lots of non-Google people talking about automated testing:     http://video.google.com/videosearch?q=LTAC&hl=en

 

I found this paper on “A way of Improving Test Automation Cost-Effectiveness”  in the free CAST 2006 proceedings, page 24, and thought it might be of interest (I also like their list of references).  It provides something I hear frequently asked for.
             
It offers a viability analysis method, to help testers decide which tests can be automated cost-effectively.
In particular they reduce it down to a decision tree (see below).

I also liked “EPDAV – A Model for Test Case Definition” on page 50 for its clarity. 

Perform a verification,                             Vv
which may be preceded by a sequence of actions,     Aa
which may require a set of data,                    Dd
which may require preconditions,                    Pp
all of which runs in environment,                   Ee

Turning this notation around so that it can be read from left to right, we get:

A Test Case, Tt = Ee Pp Dd Aa Vv

The above is a different take on Expect This from Test Drivers (p5 on Test Impl Notes)

S - Setup
E - Execution
A - Analysis
R - Reporting
C - Cleanup
H - Help

An automated test case has SEARCH.  (From my 1992 paper How to Automate Testing – The Big Picture.)
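As a rough illustration of the SEARCH phases, here is a minimal sketch in which each phase is one method. The class name, method names, and the trivial `sorted()` check are all assumptions made for the example; they are not taken from the 1992 paper.

```python
class SearchTestCase:
    """Skeleton with one member per SEARCH phase; all names are illustrative."""

    # H - Help: a short description of what the test does.
    help_text = "Checks that sorted() orders a small list."

    def setup(self):
        # S - Setup: establish the data the test needs.
        self.data = [3, 1, 2]

    def execute(self):
        # E - Execution: exercise the thing under test.
        self.result = sorted(self.data)

    def analyze(self):
        # A - Analysis: decide pass or fail from the outcome.
        return self.result == [1, 2, 3]

    def report(self, passed):
        # R - Reporting: produce a result line.
        return f"{type(self).__name__}: {'PASS' if passed else 'FAIL'}"

    def cleanup(self):
        # C - Cleanup: release whatever setup created.
        self.data = self.result = None

    def run(self):
        self.setup()
        try:
            self.execute()
            return self.report(self.analyze())
        finally:
            self.cleanup()  # runs even if execution or analysis raises

print(SearchTestCase.help_text)
print(SearchTestCase().run())
```

Keeping cleanup in a `finally` clause is a design choice of this sketch so that a failing test does not leave state behind for the next one.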

 And finally, page 133, “The Personal Test Maturity Matrix” (another flavor of Microsoft’s Career Stage Profiles or even the Testers Body of Knowledge, originally from James Bach, and adapted to BEA.)

               

Table 1. Questions for each point

Identifier

Topics