Posts tagged 'legacy'

Unit tests in legacy code

published on November 22, 2019.

Legacy code has many definitions. It is code without tests. It is code written by someone else. It is code written X time ago. It is obsolete code that is difficult to replace with newer code. It is code that has no documentation. It is all of these things.

Besides that, I like to think of legacy code as code that is actively used by clients and brings in value. If it’s not used by others then it can’t bring in value, and if it does not bring in value then working on that code is just a waste of time and money.

Whatever the definition of legacy code is, we need to be careful when we start working on it. We can’t really know under what kind of circumstances did the original authors write it. Were they under pressure to deliver on time to beat the competition to market? Were they tired, sick? Maybe they didn’t know better at the time? Maybe when they wrote it 10 years ago, they did use the best tool for the job, but now, 10 years later, there are better tools for that?

Unit tests first

When I start working on a piece of legacy code I try to cover it with unit tests first, if there aren’t any for the code I have to change. I find it a good way to start documenting the behaviour of the code, to understand what it does, and gain confidence in the changes I’m about to make. To do this, I have to make an assumption, and a pretty big one at that.

I have to assume that the legacy code I’m writing the unit tests for is bug free. After all, it is in production and it brings in value, right? I also have to assume that the programmers before me knew what they were doing and why.

I try and run the code to see how it behaves under normal circumstances. Try out what I believe would be regular input and see what is the output based on that. I will also step-debug every line to see how data changes over time, how the unit of code I’m looking at communicates with its dependencies. I’ll write down in a file or on a piece of paper all that I see. Focusing on unit tests means I focus on understanding the smallest possible unit of the code, instead of an entire code path from the moment the user interacts with the software until they get an answer back. That’s just too much stuff to try and keep track of.

In these unit tests I’m using the input I used when I ran the code and assert against the output I got. I use a lot of test doubles for the dependencies the unit has and set up expectations against the data I saw while step debugging.

Method by method, class by class, I grow this suite of unit tests. They are not the best unit tests in the world, they are not easy to write, but they give me confidence in my understanding of the code and in any future changes I might make. The code is not in great shape, then the tests can’t be either. Working with unit tests like this requires a high level of discipline as the test double dependencies need to be kept in sync with the changes in the code.

Not everyone likes to work with tests like these. They fail often, and they fail easily. But every fail for me means challenging the initial assumptions I had about the code. It’s a step forward for better code and better tests. Over time these tests get replaced with better tests, the same way the code gets replaced with better code.

Happy hackin’!

Legacy code is 3rd party code

published on July 19, 2018.

Within the TDD community there’s an advice saying that we shouldn’t mock types we don’t own. I believe it is good advice and do my best to follow it. Of course, there are people who say that we shouldn’t mock in the first place. Whichever TDD camp you’re in I think this “don’t mock what you don’t own” advice has an even better advice hidden in it. An advice that people often overlook because they see the word “mock” in it and go full berserk.

This hidden advice is that we should create interfaces, clients, bridges, adapters between our application and the 3rd party code we use. Will we create mocks from those interfaces in our tests doesn’t even matter that much. What matters is that we create and use these interfaces so that our application is better decoupled from the 3rd party code. A classic example of this in the PHP world would be to create and use an HTTP client within the application that uses the Guzzle HTTP client, instead of using the Guzzle client directly in the application code.

Why? Well, for one, Guzzle has a much bigger public API than what your application (in most cases) needs. Creating a custom HTTP client that exposes only the required subset of Guzzle’s API will limit what the application developers can do with it. If Guzzle’s API changes in the future, we’ll have to change how we call it in only one place, instead of trying to make the required changes in the entire application and hope that we broke nothing. Two very good reasons and I haven’t even mentioned mocks! gasp

I don’t think this is that hard to achieve. 3rd party code lives in a separate folder from our application code, usually in vendor/ or library/. It also has a different namespace and naming convention than our application code. It is fairly easy to spot 3rd party code and with a bit of a discipline we can make our application code less dependant on 3rd parties.

What if we apply the same rule to legacy code?

What if we start looking at our legacy code the same way we look at the 3rd party code? This might be difficult to do, or even counterproductive, if the legacy code is in a maintenance-only mode, where we only fix bugs and tweak bits and pieces of it. But if we are writing new code that is (re)using legacy code, I believe we should look at legacy code the same way we look at 3rd party code, at least from the perspective of the new code.

If at all possible legacy and new code should live in different folders and use different namespaces. It’s been a long time since I last saw a system without autoloading so this is doable. But instead of just blindly using legacy code within the new code, what if we create interfaces for the legacy code and use those in the new code?

Legacy code is all too often full of “god” objects that do way too many things. They reach out to global state, have public properties or magic methods that expose privates as if they were public, have static methods that are just so convenient to call from anywhere and everywhere. Well, guess what? That convenience got us in this situation in the first place.

Another, maybe an even bigger issue with legacy code is that we are so ready to change it, fix it, hack it, because we don’t see it as a 3rd party code. What do we do when we see a bug or when we want to add a new feature to 3rd party code? We open up an issue and/or create a pull request. What we don’t do is go inside the vendor/ folder and make our changes there. Why would we do that to legacy code? And then we cross our fingers and hope we didn’t break anything.

Instead of blindly using legacy code within new code, let’s try writing interfaces that will expose only the required subset of the legacy “god” object’s API. Say we have a User object in the legacy code that knows everything about everyone. It knows how to change emails and passwords, how to promote forum members to moderators, how to update a user’s public profile, set notification settings, how to save itself, and so much more.

src/Legacy/User.php
<?php
namespace Legacy;
class User
{
    public $email;
    public $password;
    public $role;
    public $name;

    public function promote($newRole)
    {
        $this->role = $newRole;
    }

    public function save()
    {
        db_layer::save($this);
    }
}

It’s a crude example, but shows the problems: every property is public and can be easily changed to whatever value, we have to remember to explicitly call the save method after any change for these changes to persist, etc.

Let’s limit and prohibit ourselves from reaching out to those public properties and having to guess how does the legacy system work any time we want to promote a user:

src/LegacyBridge/Promoter.php
<?php
namespace LegacyBridge;
interface Promoter
{
    public function promoteTo(Role $role);
}
src/LegacyBridge/LegacyUserPromoter.php
<?php
namespace LegacyBridge;
class LegacyUserPromoter implements Promoter
{
    private $legacyUser;
    public function __construct(Legacy\User $user)
    {
        $this->legacyUser = $user;
    }

    public function promoteTo(Role $newRole)
    {
        $newRole = (string) $newRole;
        // I guess you thought $role in legacy is a string? Guess again!
        $legacyRoles = [
            Role::MODERATOR => 1,
            Role::MEMBER => 2,
        ];
        $newLegacyRole = $legacyRoles[$newRole];
        $this->legacyUser->promote($newLegacyRole);
        $this->legacyUser->save();
    }
}

Now when we want to promote a User from the new code we use this LegacyBridge\Promoter interface that deals with all the details of promoting a user within the legacy system.

Change the language of the legacy

An interface for the legacy code gives us an opportunity to improve the design of the system. An interface can free us from any potential naming mistakes we did in the legacy. The process of changing a user’s role from a moderator to a member is not a “promotion”, but rather a “demotion”. Nothing stops us from creating two interfaces for these two different things, even though the legacy code sees it the same:

src/LegacyBridge/Promoter.php
<?php
namespace LegacyBridge;
interface Promoter
{
    public function promoteTo(Role $role);
}
src/LegacyBridge/LegacyUserPromoter.php
<?php
namespace LegacyBridge;
class LegacyUserPromoter implements Promoter
{
    private $legacyUser;
    public function __construct(Legacy\User $user)
    {
        $this->legacyUser = $user;
    }

    public function promoteTo(Role $newRole)
    {
        if ($newRole->isMember()) {
            throw new \Exception("Can't promote to a member.");
        }
        $legacyMemberRole = 2;
        $this->legacyUser->promote($legacyMemberRole);
        $this->legacyUser->save();
    }
}
src/LegacyBridge/Demoter.php
<?php
namespace LegacyBridge;
interface Demoter
{
    public function demoteTo(Role $role);
}
src/LegacyBridge/LegacyUserDemoter.php
<?php
namespace LegacyBridge;
class LegacyUserDemoter implements Demoter
{
    private $legacyUser;
    public function __construct(Legacy\User $user)
    {
        $this->legacyUser = $user;
    }

    public function demoteTo(Role $newRole)
    {
        if ($newRole->isModerator()) {
            throw new \Exception("Can't demote to a moderator.");
        }
        $legacyModeratorRole = 1;
        $this->legacyUser->promote($legacyModeratorRole);
        $this->legacyUser->save();
    }
}

Not that big of a change, yet the intent of the code is much clearer.

Now the next time you want to reach out to that legacy code and call some methods on it, try and make an interface for it. It might not be feasible, it might be too expensive to do. I know that static method on that god object is really easy to use and will get the job done much quicker, but at least consider this option. You might just improve the design of the new system you’re building a tiny little bit.

Happy hackin’!

Open source taught me how to work with legacy code

published on April 28, 2017.

Contributing to open source projects has many benefits — you learn and you teach, you can make friends or find business partners, you might get a chance to travel. Even have a keynote at a conference, like Gary did.

Contributing to open source projects was the best decision I made in my professional career. Just because I contributed to, and blogged about Zend Framework, I ended up working and consulting for a company for four and a half years. I learned a lot during that time.

What I realized just recently is that open source also taught me how to work with legacy code. It taught me how to find my way around an unknown codebase faster, where to look and what to look for when investigating an issue. Most importantly, it taught me how to react to legacy code.

Usually when people hear “legacy code”, they think code that was written by a bunch of code monkeys who know nothing about writing good software. The past was stupid, the present is smart and wise, and will make everything better for the future. A long time ago, I was the same.

Today, my thinking and my approach is completely different.

I have the utmost respect for the programmer and their code that is before me. Rarely do I have the privilege knowing the circumstances under which a piece of legacy code was written.

In many cases the original author of the code is not on the team any more, or they just don’t remember why was some decision made and a piece of code written in a certain way. It might be a hack workaround for a code that was written by someone even before their time on the project. Maybe they didn’t know better at the time, or maybe they indeed made an error and now it’s my bug to fix.

Whatever the reason is, the code is written, used, and it delivers business value. It requires maintenance, fixes, and improvements and I welcome the challenges it brings.

Happy hackin’!

Robert Basic

Robert Basic

Software developer making web applications better.

Let's work together!

I would like to help you make your web application better.

Robert Basic © 2008 — 2020
Get the feed