I recently finished reading the classic programming book Refactoring, by Martin Fowler. The first edition was published back in 1999 and a second edition followed in 2018, yet the book still provides timeless advice for improving codebases through the disciplined process of refactoring.
The book is split into two sections. The first section provides an introduction to refactoring – the purpose and principles of refactoring, how to spot ‘code smells’ and a methodology for effective refactoring. The second section contains a catalogue of refactoring techniques with examples and detailed descriptions of their implementation and purpose.
Who should read it?
Refactoring is an excellent book for anyone looking to build their craft as an engineer and construct better software. If you are responsible for maintaining an evolving codebase or inherit a project from someone else, this book will provide a lot of actionable advice for making improvements and adding new features to the code in a controlled and logical manner.
“Refactoring is the process of changing a software system in a way that does not alter the external behavior of the code yet improves its internal structure.”
“The key to effective refactoring is recognizing that you go faster when you take tiny steps, the code is never broken, and you can compose those small steps into substantial changes.”
“…the true test of good code is how easy it is to change it. Code should be obvious: When someone needs to make a change, they should be able to find the code to be changed easily and to make the change quickly without introducing any errors.”
“If someone says their code was broken for a couple of days while they are refactoring, you can be pretty sure they were not refactoring.”
“compile-test-commit” (Refactoring workflow)
Rather than listing out the refactoring techniques covered in the book (of which there are many!), I will focus on my main takeaways from the first section of the book covering the principles and purpose of refactoring.
A full catalogue of refactoring techniques is available for free on the book’s accompanying website, www.refactoring.com. I highly recommend checking it out for some great resources.
Principles of Refactoring
Refactoring (noun): a change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behaviour
Refactoring (verb): to restructure software by applying a series of refactorings without changing its observable behaviour
Refactoring is risky
Refactoring makes changes to working code. Adapting the original code runs the risk of introducing new bugs to the system. When refactoring is performed without purpose or strategy you will play ‘whack-a-mole’ and end up with a new set of problems.
Be systematic and purposeful about refactoring. Understand which refactoring techniques pose the most risk of introducing bugs, and avoid large changes to the code base.
Tests are your friend
Before attempting refactoring you should ensure a solid set of tests for each section of the code. It can take time to build the tests (if they are not already in place) but it is an investment which will pay dividends later on.
Failing tests are the best protection against introducing subtle bugs during refactoring.
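As a minimal sketch of what this safety net can look like, the snippet below pins down the current behaviour of a piece of legacy code before any refactoring starts. The function `order_total`, its pricing rule and the test values are all hypothetical, invented purely for illustration.

```python
def order_total(prices, discount_rate):
    # Hypothetical legacy function that we intend to refactor later.
    t = 0
    for p in prices:
        t = t + p
    return t - t * discount_rate


def test_order_total_characterisation():
    # Record what the code does today. These assertions must keep
    # passing after every refactoring step.
    assert order_total([], 0.1) == 0
    assert order_total([10, 20], 0) == 30
    assert order_total([100], 0.25) == 75.0


test_order_total_characterisation()
```

If a later change makes one of these assertions fail, the observable behaviour has changed and the step was not a pure refactoring.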
The key to effective refactoring is taking small steps
“If someone says their code was broken for a couple of days while they are refactoring, you can be pretty sure they were not refactoring.”
Refactoring is an incremental process. I have been guilty in the past of ‘refactoring’ a project by literally starting with a blank file and recoding from scratch using the old code as a reference - I kid you not. This generally took a long time and until I was finished with the new code, the application would not be working. Additionally, I risked introducing new bugs and commonly found myself with a different set of problems which needed further refactoring.
Changing the program in small incremental steps means the program will never be broken. You can easily stop and pick up where you left off at a different time. Incremental steps also help you focus on the parts of the code which need changing and avoid the unnecessary work of re-coding the aspects of the program which work well.
Small changes will compound into substantial changes.
The true test of good code is how easy it is to change
Code should be obvious. When someone needs to make a change they should be able to find the code and adapt it quickly without introducing any errors.
Start with the simple things – understand your codebase
Before changing code you need to understand what it does. There are many simple refactorings which simultaneously improve your understanding of the code base and make the code better, such as renaming variables or functions to be more descriptive. Martin Fowler describes this as ‘Comprehension Refactoring’.
Start with these simple refactoring techniques. This process will also give you ideas on how to improve the overall design of the application as you become more familiar with it.
‘Litter-pickup Refactoring’ is another variation. It covers the case where you understand what the code is doing, but realise it is doing it badly. Clean up the trash. Always leave the code base in a better state than when you arrived.
“Best Practice” is not a reason to refactor
Refactoring should be carried out purely for economic reasons. Refactoring is used to accelerate the process of adding features and finding bugs in the future. ‘Good practice’ in itself is not a good enough reason to refactor a code base.
Periodically assigning time to refactor is important for maintaining a healthy code base. However, product owners and managers may be hesitant to let developers spend time refactoring instead of implementing ‘high priority’ new features. When trying to persuade your PM to allow you to refactor, argue in economic terms they will relate to (e.g. shorter dev cycles in the future, less time spent finding bugs) rather than ‘this code is messy, can I clean it up?’.
Favour Clarity over Optimisation
You should refactor in order to make the code more readable and maintainable for humans. The computer couldn’t care less whether your system is well designed – only whether it compiles successfully.
Focus on code clarity rather than the most optimal way to run the code. Most bottlenecks are found in a very small part of the overall code. You can optimise these small areas if required later.
Bottlenecks are often caused by workarounds for poorly designed code. Favouring clean and self-documenting code will reduce the likelihood of poorly performing code in the first place. If some code still performs poorly after refactoring, it should be much easier to change.
Bad Smells in Code
“When and what code should you look to refactor?”
So you have decided you need to refactor – what should you be looking for as obvious places to start?
The following section titles describe some of the key things to look for when refactoring a code base.
Mysterious Name
“People are often afraid to rename things, thinking it’s not worth the trouble, but a good name can save hours of puzzled incomprehension in the future.”
A good variable or function name can save a new developer on the project (and your future self) hours of confusion.
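To make this concrete, here is a small sketch of a rename. Both function names and the parameters are hypothetical. The point is only that the behaviour stays identical while the intent becomes readable.

```python
# Before: the names say nothing about intent.
def calc(d, r):
    return d * r


# After: same behaviour, but the call site now explains itself.
def distance_travelled(duration_hours, speed_kmh):
    return duration_hours * speed_kmh


# The rename does not change observable behaviour.
assert calc(2, 50) == distance_travelled(2, 50)
```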
Duplicated Code
Whenever you spot code in multiple places carrying out the same task, extract it and create a single function.
Arrange your code so code with similar functionality is located in the same place. This will help organise code and understand which functions have been implemented. This reduces the chances of duplicating code and helps formulate new ideas for combining or refactoring code with similar functionality.
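A rough sketch of removing duplication by extracting a function is shown below. The invoice and receipt helpers are made-up examples, not code from the book.

```python
# Before: the same clean-up logic is repeated in two places.
def invoice_header_before(customer):
    return "Invoice for " + customer.strip().title()


def receipt_header_before(customer):
    return "Receipt for " + customer.strip().title()


# After: the shared task lives in a single function.
def display_name(customer):
    return customer.strip().title()


def invoice_header(customer):
    return f"Invoice for {display_name(customer)}"


def receipt_header(customer):
    return f"Receipt for {display_name(customer)}"
```

If the formatting rule changes later, only `display_name` needs to be touched.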
Long Function
Programs with short functions are easier to understand and maintain over the long term.
A function should do one thing and do it well. If a function is responsible for many processes, they become coupled, which makes the function harder to reuse elsewhere and to test reliably.
Note, however, the real key to making functions easily understandable is good naming.
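As an illustration, the sketch below splits a hypothetical score-summary routine into short, well-named functions. The names and the 0 to 100 validity rule are assumptions made for the example.

```python
def keep_valid(scores):
    # Assumed rule for this example: a valid score lies between 0 and 100.
    return [s for s in scores if 0 <= s <= 100]


def mean(values):
    # Return 0.0 for an empty list rather than dividing by zero.
    return sum(values) / len(values) if values else 0.0


def summarise_scores(scores):
    # The top-level function now reads as a list of named steps.
    valid = keep_valid(scores)
    return {"count": len(valid), "mean": mean(valid)}
```

Each helper can be understood from its name alone, which is what makes the top-level function readable.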
Long Parameter List
Long lists of parameters generally indicate the function is responsible for too many things or that it is strongly coupled to another function or class.
Using classes is a great way to reduce parameter list sizes.
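One possible way to do this in Python is to group related parameters into a small data class. The `Address` class and the label functions below are hypothetical examples.

```python
from dataclasses import dataclass


# Before: five loosely related parameters.
def shipping_label_before(name, street, city, postcode, country):
    return f"{name}\n{street}\n{city} {postcode}\n{country}"


@dataclass(frozen=True)
class Address:
    name: str
    street: str
    city: str
    postcode: str
    country: str


# After: one parameter that carries the related values together.
def shipping_label(address: Address) -> str:
    return (
        f"{address.name}\n{address.street}\n"
        f"{address.city} {address.postcode}\n{address.country}"
    )
```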
Global Data
Global data objects can be modified from anywhere in the codebase and there is no mechanism to discover which bit of code touched it. It can be a nightmare to find bugs and maintain code with lots of global data objects. Avoid at all costs.
Global data objects can be removed by encapsulating the object as a variable which is passed to functions as an input argument, rather than accessed from the global environment.
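A minimal sketch of this idea, using a hypothetical tax rate, might look like the following.

```python
# Before: any code in the program can read or reassign this value.
TAX_RATE = 0.2


def price_with_tax_before(net):
    return net * (1 + TAX_RATE)


# After: the dependency is visible in the signature and easy to vary in tests.
def price_with_tax(net, tax_rate):
    return net * (1 + tax_rate)
```

The second version can be tested with any rate without touching shared state.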
Mutable Data
Data should never change. Updating a data structure should always return a new copy of the structure.
For example, avoid updating values in a list. Create a new list with the updated values.
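The list example can be sketched as below. The helper name `with_value_replaced` is made up for illustration.

```python
def with_value_replaced(values, index, new_value):
    # Build a new list instead of assigning to values[index],
    # so the caller's list is left untouched.
    return values[:index] + [new_value] + values[index + 1:]


original = [1, 2, 3]
updated = with_value_replaced(original, 1, 20)
```

After this runs, `original` still holds its initial values and `updated` is a separate list containing the change.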
Shotgun Surgery/Divergent Change
If you have to change the code in many different places to implement a new feature, you should be suspicious. This is normally a signal there is a better way to structure things.
Feature Envy
Feature Envy occurs when a function in one module spends more time communicating with functions or data inside another module than it does in its own module.
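A common remedy is to move the function next to the data it uses. The sketch below uses a hypothetical `Account` class to show the before and after.

```python
class Account:
    def __init__(self, balance, overdraft_limit):
        self.balance = balance
        self.overdraft_limit = overdraft_limit

    # After: the calculation lives with the data it depends on.
    def available_funds(self):
        return self.balance + self.overdraft_limit


# Before: an outside function that only ever reaches into Account's data.
def available_funds_before(account):
    return account.balance + account.overdraft_limit
```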
Comments
Comments should explain ‘why’ the code is doing something rather than ‘what’ it is doing. What the code is doing should be self-explanatory. If there are comments explaining what the code is doing (unless it is for a tutorial) it is symptomatic of code that is too complicated.
When you feel the need to write a comment, first try to refactor the code so that any comment becomes superfluous.
Comments are still useful for explaining why the code is doing what it is doing. Your future self will be grateful for explanations on why you decided to implement the code as you did rather than another way.
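Here is a small sketch of replacing a ‘what’ comment with well-named functions, while keeping a ‘why’ comment. The ordering rule, the age threshold and the country list are all invented for the example.

```python
# Before: the comment has to explain what the condition means.
def can_order_before(age, country):
    # check the customer is an adult in a supported country
    return age >= 18 and country in ("UK", "IE")


# Why: we only ship where we hold a licence (hypothetical business rule).
SUPPORTED_COUNTRIES = ("UK", "IE")
ADULT_AGE = 18


def is_adult(age):
    return age >= ADULT_AGE


def is_supported_country(country):
    return country in SUPPORTED_COUNTRIES


# After: the function names carry the 'what', so no comment is needed here.
def can_order(age, country):
    return is_adult(age) and is_supported_country(country)
```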
“Tests are so important for speeding up development in the long run and helping collaboration with multiple developers, reduces the risk of someone accidentally changing code they don’t understand”
Fixing the bug is usually pretty quick - finding it is a nightmare
Most of a developer’s time is spent reading code and finding bugs, rather than actually writing code. This is especially true when refactoring.
Refactoring is risky. A failing test can be the canary in the coal mine, catching bugs and alerting you to a change in behaviour before it is too late.
Writing a suite of good tests for your code is essential for efficient and effective refactoring.
Test-Driven Development to the rescue
Whenever you want to add new functionality, begin by writing a test.
Test-driven development involves writing a (failing) test, writing code to make the test work, and refactoring to ensure the result is as clean as possible.
Writing tests can take time; however, the upfront investment will pay dividends in the future. Full test coverage of the codebase is ideal, but it is better to write and run incomplete tests than not to run any.
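The test-first cycle described above can be sketched roughly as follows. The `slugify` function is a hypothetical feature chosen for the example, and plain assertions stand in for a real test framework.

```python
# Step 1 (red): write the test first. Run on its own at this point,
# it would fail because slugify does not exist yet.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Tiny   Steps ") == "tiny-steps"


# Step 2 (green): write just enough code to make the test pass.
# Step 3 (refactor): tidy the implementation, re-running the test each time.
def slugify(title):
    words = title.lower().split()
    return "-".join(words)


test_slugify()
```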
If it is not easy to write tests for your code, be suspicious
If it is not easy to test your function with a set of different inputs and test against a set of expected outputs it suggests you should look to refactor.
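When a function is easy to test, a table of inputs and expected outputs is often all that is needed. The sketch below uses a leap-year check as a stand-in example.

```python
def is_leap_year(year):
    # Gregorian rule: divisible by 4, except century years
    # that are not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Each pair is (input, expected output).
CASES = [(2000, True), (1900, False), (2024, True), (2023, False)]

for year, expected in CASES:
    assert is_leap_year(year) == expected, year
```

If building a table like this feels awkward for a function, for instance because it needs a lot of set-up or hidden state, that is a hint the function may be worth refactoring.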
Make your life easier with smaller functions
It is easier to write tests for small functions which have one responsibility. Spend some effort looking at a fragment of code to figure out what it is doing, then extract it into a function and name the function after the ‘what’.