Test Driven Reverse Engineering

What is the right level of coverage?  100% Logic Paths?  100% Code?

One of the problems people have with TDD (or test-driven reverse engineering) is conflating testing and debugging.  In the olden days, they were the same thing – you debugged by running tests or you debugged during the testing phase of the project.

They’re separate things.

For many business applications you’ll have piles of examples.  Once I had four spreadsheets, eight to a dozen tabs per sheet, two to a dozen examples per tab.  This was 90+ individual examples, each showing some unique combination of obscure special cases.

In data warehousing, the ETL processing will include lots of examples of transformations from the source application to the final warehouse dimension and fact definitions.  In health insurance processing, similarly, there will be a pile of exceptions, extensions, amendments and special cases.

When confronted with lots of test cases, I can’t be trusted to write unittest TestCases by hand.  Python to the rescue.
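As a minimal sketch of what "Python to the rescue" can look like, the block below builds a unittest TestCase class from a table of examples instead of hand-writing one method per case.  Everything in it is an assumption for illustration: the `premium` function, its rules, and the `EXAMPLES` rows are hypothetical stand-ins for the real application logic and for rows read from the customer's spreadsheets.

```python
import unittest


def premium(age, smoker):
    """Hypothetical function under test; stands in for the real business rule."""
    base = 100
    if age >= 65:
        base += 50
    if smoker:
        base *= 2
    return base


# Hypothetical examples: (case name, age, smoker, expected result).
# In practice these rows would be read from the spreadsheet tabs.
EXAMPLES = [
    ("young_nonsmoker", 30, False, 100),
    ("young_smoker", 30, True, 200),
    ("senior_nonsmoker", 70, False, 150),
    ("senior_smoker", 70, True, 300),
]


def make_test(age, smoker, expected):
    """Return a test method bound to one example row."""
    def test(self):
        self.assertEqual(premium(age, smoker), expected)
    return test


def build_test_case(examples):
    """Create a TestCase subclass with one test method per example."""
    methods = {
        f"test_{name}": make_test(age, smoker, expected)
        for name, age, smoker, expected in examples
    }
    return type("GeneratedExamples", (unittest.TestCase,), methods)


GeneratedExamples = build_test_case(EXAMPLES)


def run_examples():
    """Run the generated tests and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GeneratedExamples)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    unittest.main()
```

Each row becomes its own named test method, so a failure report points at the specific example rather than at a loop index.  Adding a new case from the customer is then a one-line change to the table.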

The customer says “This case doesn’t work.”  I say, “Really?” And add it to the unit tests.  Guess what I discovered?

I don’t like the word “risk”; it makes things sound random.  Risk-Based Testing is really a strategy for setting priorities based on “importance”, usually the costs associated with the problems that are found.  There’s no “risk” – as in random event – involved.  It’s all about priorities.

Indeed, the priorities are just the same priorities that drive the Scrum process backlog.  I don’t think Risk-Based Testing offers us anything above and beyond basic Agile practices.

If you need to have your icky old software reverse engineered, we’re going to find “things” and “situations” you can’t explain.  The code is clearly bad, but is in production.  What do we do?  Fix it?  Preserve it?  Try to figure out if it doesn’t really matter?

In talking about TDD, I’ve been making little tweaks and adjustments.  Recently, I read an article in DDJ with some additional suggestions for improvement.

Sticky Minds sends me their Better Software Magazine.  Good stuff on how to do Test Driven Development well.  Stuff I was doing unconsciously.  The article helped me see what I was doing and why.

A procedural program tied to a database presents some reverse engineering challenges.  I’m growing more reluctant to use the neutral-sounding “procedural programming” and think I’ll switch to “procedure-only” programming.

When embarking on reverse engineering something big (over 10,000 lines of code), can you really call it test driven?  Or is it just test aware?  Or test friendly?

Reverse engineering C is bad enough even when the C is gloriously well-written.  When obscurity prevails, however...

As long as it’s reasonably well-written C, it’s not so bad.  You can rewrite it as Java or Python.  Sometimes you can almost find the object classes.  Other times, you really have to dig.  Python gives you a lot of intellectual leverage, even if the final product is Java.