
The Flow of Refactoring and Refactoring Automation

The Flow of Refactoring

In a Test-First context, refactoring has the same flow as any other code change. You have your automated tests. You begin the refactoring by making the smallest discrete change you can that will compile, run, and function. Wherever possible, you make such changes by adding to the existing code, in parallel with it. You run the tests. You then make the next small discrete change, and run the tests again. When the refactoring is in place and the tests all run clean, you go back and remove the old smelly parallel code. Once the tests run clean after that, you are done.
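As a minimal sketch of that flow, here is a refactoring caught midway, written in Python with the standard `unittest` module standing in for whichever xUnit framework you use. The function names and the pricing example are hypothetical, invented only for illustration. The new implementation has been added alongside the old one, and a temporary test confirms that the two agree.

```python
import unittest


def total_price_old(items):
    """Original, smelly implementation: manual index loop and accumulator."""
    total = 0
    i = 0
    while i < len(items):
        total = total + items[i][0] * items[i][1]
        i = i + 1
    return total


def total_price_new(items):
    """Replacement added in parallel; the old function is left untouched for now."""
    return sum(price * quantity for price, quantity in items)


# Callers use this name. It is repointed to the new version only once
# the tests show that both versions agree.
total_price = total_price_new


class TotalPriceTest(unittest.TestCase):
    CASES = [[], [(5, 1)], [(5, 2), (3, 4)]]

    def test_known_totals(self):
        self.assertEqual(total_price([]), 0)
        self.assertEqual(total_price([(5, 2), (3, 4)]), 22)

    def test_new_matches_old(self):
        # Temporary test: delete it together with total_price_old.
        for items in self.CASES:
            self.assertEqual(total_price_new(items), total_price_old(items))
```

Once both tests run clean, the last step is to delete `total_price_old` and the temporary comparison test, then run the remaining tests one more time.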

Refactoring Automation in IDEs

Refactoring is much, much easier to do automatically than it is to do by hand. Fortunately, more and more Integrated Development Environments (IDEs) are building in automated refactoring support. For example, one popular IDE for Java is Eclipse, which adds more automated refactorings all the time. Another favorite is IntelliJ IDEA, which has historically included even more refactorings. In the .NET world, there are at least two refactoring tool plugins for Visual Studio 2003, and we are told that future versions of Visual Studio will have built-in refactoring support.

To refactor code in Eclipse or IDEA, you select the code you want to refactor, pull down the specific refactoring you need from a menu, and the IDE does the rest of the hard work. Dialog boxes prompt you for new names for things that need naming, and for similar input. You can then immediately rerun your tests to make sure that the change didn’t break anything. If anything was broken, you can easily undo the refactoring and investigate.


Specific “Refactorings” and Refactoring to Patterns

Specific “Refactorings”

Refactorings are the opposite of fiddling endlessly with code; they are precise and finite. Martin Fowler’s definitive book on the subject describes 72 specific “refactorings” by name (e.g., “Extract Method,” which extracts a block of code from one method and creates a new method for it). Each refactoring converts a section of code (a block, a method, a class) from one of 22 well-understood “smelly” states to a cleaner state. It takes a while to learn to recognize refactoring opportunities, and to implement refactorings properly.
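To make “Extract Method” concrete, here is a small before-and-after sketch in Python. The invoice example and all of its names are hypothetical, chosen only to show the shape of the change; this is not an example taken from Fowler’s book.

```python
def invoice_line_before(customer, amounts):
    """Before: one function both computes the total and formats the line."""
    total = 0
    for amount in amounts:
        total += amount
    return f"Invoice for {customer}: {total}"


def outstanding_total(amounts):
    """Extracted method: the summing block now has a name of its own."""
    total = 0
    for amount in amounts:
        total += amount
    return total


def invoice_line_after(customer, amounts):
    """After: same behavior, but the body reads as a sequence of named steps."""
    return f"Invoice for {customer}: {outstanding_total(amounts)}"
```

The behavior is unchanged, which is exactly what the unit tests should confirm after the extraction. The gain is that the summing logic can now be read, tested, and reused by name.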

Refactoring to Patterns

Refactoring does not only occur at low code levels. In his recent book, Refactoring to Patterns, Joshua Kerievsky skillfully makes the case that refactoring is the technique we should use to introduce Gang of Four design patterns into our code. He argues that patterns are often over-used, and often introduced too early into systems. He follows Fowler’s original format of showing and naming specific “refactorings,” recipes for getting your code from point A to point B. Kerievsky’s refactorings are generally higher level than Fowler’s, and often use Fowler’s refactorings as building blocks. Kerievsky also introduces the concept of refactoring “toward” a pattern, describing how many design patterns have several different implementations, or depths of implementation. Sometimes you need more of a pattern than you do at other times, and this book shows you exactly how to get part of the way there, or all of the way there.
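As a rough illustration of refactoring “toward” a pattern, the sketch below moves conditional logic part of the way toward Strategy. It stops at plain functions in a lookup table rather than a full class hierarchy, on the assumption that this is all the code currently needs. The shipping example and its names are hypothetical and are not taken from Kerievsky’s book.

```python
def shipping_cost_before(weight_kg, method):
    """Before: conditional logic selects the calculation inline."""
    if method == "ground":
        return 5 + 1 * weight_kg
    elif method == "air":
        return 10 + 3 * weight_kg
    raise ValueError(f"unknown shipping method: {method}")


def ground_cost(weight_kg):
    return 5 + 1 * weight_kg


def air_cost(weight_kg):
    return 10 + 3 * weight_kg


# A lightweight step toward Strategy: each algorithm is a separate callable,
# selected by name instead of by an if/elif chain.
SHIPPING_STRATEGIES = {"ground": ground_cost, "air": air_cost}


def shipping_cost_after(weight_kg, method):
    """After: look up the strategy, then delegate to it."""
    try:
        strategy = SHIPPING_STRATEGIES[method]
    except KeyError:
        raise ValueError(f"unknown shipping method: {method}") from None
    return strategy(weight_kg)
```

If the strategies later need their own state or configuration, the functions can be promoted to classes with a shared interface, which takes the code the rest of the way to the pattern.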


Code hygiene

A popular metaphor for refactoring is cleaning the kitchen as you cook. In any kitchen in which several complex meals are prepared per day for more than a handful of people, you will typically find that cleaning and reorganizing occur continuously. Someone is responsible for keeping the dishes, the pots, the kitchen itself, the food, and the refrigerator all clean and organized from moment to moment. Without this, continuous cooking would soon collapse. In your own household, you can see non-trivial effects from postponing even small amounts of dish refactoring: did you ever try to scrape the muck formed by dried Cocoa Crispies out of a bowl? A missed opportunity for 2 seconds’ worth of rinsing can become 10 minutes of aggressive scraping.


Code refactoring

Code refactoring is the process of clarifying and simplifying the design of existing code, without changing its behavior. Agile teams maintain and extend their code continually from iteration to iteration, and without continuous refactoring, this is hard to do. This is because un-refactored code tends to rot. Rot takes several forms: unhealthy dependencies between classes or packages, bad allocation of class responsibilities, far too many responsibilities per method or class, duplicate code, and many other varieties of confusion and clutter.

Every time we change code without refactoring it, rot worsens and spreads. Code rot frustrates us, costs us time, and unduly shortens the lifespan of useful systems. In an agile context, it can mean the difference between meeting or not meeting an iteration deadline.

Refactoring code ruthlessly prevents rot, keeping the code easy to maintain and extend. This extensibility is the reason to refactor and the measure of its success. But note that it is only “safe” to refactor the code this extensively if we have extensive unit test suites of the kind we get if we work Test-First. Without being able to run those tests after each little step in a refactoring, we run the risk of introducing bugs. If you are doing true Test-Driven Development (TDD), in which the design evolves continuously, then you have no choice about regular refactoring, since that’s how you evolve the design.


Test-First technique and tools

It is not always trivial to write a unit test for every aspect of a system’s behavior. What about GUIs? What about EJBs and other creatures whose lives are managed by container-based frameworks? What about databases and persistence in general? How do you test that an exception gets properly thrown? How do you test for performance levels? How do you measure test coverage, test granularity, and test quality? These questions are being answered by the Test-First community with an ever-evolving set of tools and techniques. Tremendous ingenuity continues to pour into making it possible to cover every aspect of a system’s behavior with unit tests. For example, it often makes sense to test-drive a component of a system in isolation from its collaborators and external resources, using fakes and Mock Objects. Without those mocks or fakes, your unit tests might not be able to instantiate the object under test. Or in the case of external resources like network connections, databases, or GUIs, the use of the real thing in a test might slow it down enormously, while the use of a fake or mock version keeps everything running quickly in memory. And while some aspects of functionality may always require manual testing, the percentage for which that is indisputably true continues to shrink.
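Two of those questions, isolating a component from a database and checking that an exception gets thrown, can be sketched with Python’s standard `unittest` and `unittest.mock` modules. The `AccountService` class, its gateway methods, and the `InsufficientFunds` exception are hypothetical names invented for this example; the mock replaces a collaborator that would otherwise talk to a real database.

```python
import unittest
from unittest.mock import Mock


class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the current balance."""


class AccountService:
    """Object under test; its gateway would normally hit a database."""

    def __init__(self, gateway):
        self._gateway = gateway

    def withdraw(self, account_id, amount):
        balance = self._gateway.load_balance(account_id)
        if amount > balance:
            raise InsufficientFunds(account_id)
        self._gateway.save_balance(account_id, balance - amount)


class AccountServiceTest(unittest.TestCase):
    def setUp(self):
        # The mock gateway lives entirely in memory and reports a balance of 100.
        self.gateway = Mock()
        self.gateway.load_balance.return_value = 100
        self.service = AccountService(self.gateway)

    def test_withdraw_saves_reduced_balance(self):
        self.service.withdraw("acct-1", 30)
        self.gateway.save_balance.assert_called_once_with("acct-1", 70)

    def test_overdraft_raises_and_saves_nothing(self):
        with self.assertRaises(InsufficientFunds):
            self.service.withdraw("acct-1", 150)
        self.gateway.save_balance.assert_not_called()
```

Because nothing here touches a real database, both tests run in memory and finish quickly, which is what makes it practical to run them after every small change.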


Test-First versus Debugging

It’s useful to compare the effort spent writing tests up front to time spent debugging. Debugging often involves looking through large amounts of code. Test-First work lets you concentrate on a bite-size chunk, in which fewer things can go wrong. It’s difficult for managers to predict how long debugging will actually take. And in one sense, much of the debugging effort is wasted. Debugging involves time investment, scaffolding, and infrastructure (breakpoints, temporary variable watches, print statements) that are all essentially disposable. Once you find and fix the bug, all of that analysis is essentially lost. And if not lost entirely to you, it is certainly lost to other programmers who maintain or extend that code. With Test-First work, the tests are there for everybody to use, forever. If a bug reappears somehow, the same test that caught it once can catch it again. If a bug pops up because there is no matching test, you can write a test that captures it from then on. This is why many Test-First practitioners call it the epitome of working smarter instead of harder.
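Here is a small sketch of writing a test that captures a bug from then on. The scenario is hypothetical: assume an `average` function originally crashed with a `ZeroDivisionError` on empty input, and that returning 0.0 is the behavior the team decided on. The first test pins down the fix, so a later change that reintroduces the bug fails immediately.

```python
import unittest


def average(values):
    """Return the mean of values, or 0.0 for an empty sequence.

    The empty-sequence guard is the (assumed) bug fix; without it this
    function raises ZeroDivisionError on empty input.
    """
    if not values:
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_returns_zero(self):
        # Captures the reported bug so it cannot come back unnoticed.
        self.assertEqual(average([]), 0.0)

    def test_typical_input(self):
        self.assertEqual(average([2, 4, 6]), 4.0)
```

Unlike a breakpoint or a print statement, this test stays in the suite and is available to every programmer who touches the code later.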


Test-Driven Development: Taking it Further

Test-Driven Development (TDD) is a special case of test-first programming that adds the element of continuous design. With TDD, the system design is not constrained by a paper design document. Instead you allow the process of writing tests and production code to steer the design as you go. Every few minutes, you refactor to simplify and clarify. If you can easily imagine a clearer, cleaner method, class, or entire object model, you refactor in that direction, protected the entire time by a solid suite of unit tests. The presumption behind TDD is that you cannot really tell what design will serve you best until you have your arms elbow-deep in the code. As you learn about what actually works and what does not, you are in the best possible position to apply those insights, while they are still fresh in your mind. And all of this activity is protected by your suites of automated unit tests.

You might begin with a fair amount of up-front design, though it is more typical to start with a fairly simple code design; some whiteboard UML sketches often suffice in the Extreme Programming world. But how much design you start with matters less, with TDD, than how much you allow that design to diverge from its starting point as you go. You might not make sweeping architectural changes, but you might refactor the object model to a large extent, if that seems like the wisest thing to do. Some shops have more political latitude to implement true TDD than others.


Benefits of Test-First work

Thorough sets of automated unit tests serve as a kind of net for detecting bugs. They nail down, precisely and deterministically, the current behavior of the system. Good Test-First teams find that they get substantially fewer defects throughout the system life cycle, and spend much less time debugging. Well-written unit tests also serve as excellent design documentation that is always, by definition, in sync with the production code. A somewhat unexpected benefit: many programmers report that “the little green bar” that shows that all tests are running clean becomes addictive. Once you are accustomed to these small, frequent hits of positive feedback about your code’s health, it’s really hard to give that up. Finally, if your code’s behavior is nailed down with lots of good unit tests, it’s much safer for you to refactor the code. If a refactoring (or a performance tweak, or any other change) introduces a bug, your tests alert you quickly.


Test-First programming

Agile teams often find that the closer the unit test coverage of their code is to some optimal number (somewhere between 75% and 85%, many teams find), the more agile their code is. That is, it is easier for them to keep the defects in the code to very low levels, and therefore easier for them to add features, make changes, and still deliver very low-defect code every iteration.

After experimenting with different ways to keep test coverage up at those optimal levels, agile teams hit upon the practice of Test-First programming. Test-First programming involves producing automated unit tests for production code, before you write that production code. Instead of writing tests afterward (or, more typically, not ever writing those tests), you always begin with a unit test. For every small chunk of functionality in production code, you first build and run a small (ideally very small), focused test that specifies and validates what the code will do. This test might not even compile, at first, because not all of the classes and methods it requires may exist. Nevertheless, it functions as a kind of executable specification. You then get it to compile with minimal production code, so that you can run it and watch it fail. (Sometimes you expect it to fail, and it passes, which is useful information.) You then produce exactly as much code as will enable that test to pass.
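The cycle can be sketched in Python with the standard `unittest` module. The leap-year example and the name `is_leap_year` are hypothetical. The test class comes first in the file because it was written first. Python has no separate compile step, so the "might not even compile" stage shows up instead as a `NameError` when the tests are run before `is_leap_year` exists.

```python
import unittest


class LeapYearTest(unittest.TestCase):
    """Written first: these tests specify behavior before is_leap_year exists."""

    def test_year_divisible_by_four_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))

    def test_year_divisible_by_400_is_leap(self):
        self.assertTrue(is_leap_year(2000))


# Written second: just enough production code to make the tests above pass.
def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

In practice you would add one test at a time, watch it fail, and grow `is_leap_year` only as far as the newest test demands; the finished state is shown here for brevity. Saved as a module, the tests can be run with `python -m unittest`.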

This technique feels odd, at first, to quite a few programmers who try it. It’s a bit like rock climbers inching up a rock wall, placing anchors in the wall as they go. Why go to all this trouble? Surely it slows you down considerably? The answer is that it only makes sense if you end up relying heavily and repeatedly on those unit tests later. Those who practice Test-First regularly claim that those unit tests more than pay back the effort required to write them.

For Test-First work, you will typically use one of the xUnit family of automated unit test frameworks (JUnit for Java, NUnit for C#, etc.). These frameworks make it quite straightforward to create, run, organize, and manage large suites of unit tests. (In the Java world, at least, they are increasingly well integrated into the best IDEs.) This is good, because as you work test-first, you accumulate many, many unit tests.
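As a small sketch of what "organize and manage" can look like, here is Python’s `unittest`, another member of the xUnit family, grouping two test cases into one suite. The stack example and the helper names `build_suite` and `run_suite` are hypothetical; JUnit and NUnit offer comparable fixtures and suites, though their APIs differ.

```python
import unittest


class StackPushTest(unittest.TestCase):
    def setUp(self):
        # Fresh fixture for every test method.
        self.stack = []

    def test_push_adds_item(self):
        self.stack.append(1)
        self.assertEqual(self.stack, [1])


class StackPopTest(unittest.TestCase):
    def setUp(self):
        self.stack = [1, 2]

    def test_pop_returns_last_item(self):
        self.assertEqual(self.stack.pop(), 2)


def build_suite():
    """Collect several test cases into one suite that can be run as a unit."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(StackPushTest))
    suite.addTests(loader.loadTestsFromTestCase(StackPopTest))
    return suite


def run_suite():
    """Run the combined suite and return the framework's result object."""
    return unittest.TextTestRunner(verbosity=0).run(build_suite())
```

Calling `run_suite()` returns a result object whose `wasSuccessful()` method reports whether every test passed. As the number of tests grows, automatic discovery with `python -m unittest discover` usually replaces hand-built suites like this one.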


Best practices become Agile software programming

Long before we thought about agile software, programming teams were finding which patterns correlated with greater success. These patterns and practices have been proven over many decades at organizations writing some of the industry’s most complex software. First catalogued as Extreme Programming (XP), these practices have also come to be referred to as Agile Engineering Practices, Scrum Developer Practices, or simply Agile Programming. XP goes into the most depth concerning how programmers can keep themselves and their code agile. The XP practices have been embraced as enablers for all of the popular agile practices and lean approaches, including Scrum, SAFe, and Lean Startup. The community of developers passionate about these practices lives on in the Software Craftsmanship movement.

The core agile software programming practices are the following:

  • Test-first programming (or perhaps Test-Driven Development),

  • Rigorous, regular refactoring,

  • Continuous integration,

  • Simple design,

  • Pair programming,

  • Sharing the codebase between all or most programmers,

  • A single coding standard to which all programmers adhere,

  • A common “war-room” style work area.

Such practices provide the team with a kind of Tai Chi flexibility: a new feature, enhancement, or bug can come at the team from any angle, at any time, without destroying the project, the system, or production rates.