Conf42 Python 2022 - Online

How to TDD in legacy code

Abstract

“TDD is great, but it won’t work on our legacy code”. I hear that a lot. That’s why people don’t even give TDD a try. Their code is killing their hope.

TDD’s basic examples are, well, basic, and have no relationship to real-world code. But it can work on legacy code, and everyone’s got that. You just need to remember a few techniques, stick to the principles, and you can start doing TDD in your application code tomorrow.

In this session I’ll show how to do it, the techniques and principles involved. And I’ll show how to add TDD code inside an ugly application.

No more excuses then. It’s possible to do TDD right there in your own legacy code. Let’s do it.

Key points

  • TDD may not seem applicable in “real code”, but the principles apply
  • Use characterization tests as a safety net
  • Use test-first principles to add new features and fix bugs
  • Refactor before and after adding code

Summary

  • Gil: We're going to talk about TDD, test-driven development, in legacy code. Basically we have three steps: red, green, refactor. The talk is mostly in Python, with a couple of slides in Java. The main thing to remember from this talk is that good software processes, TDD included, are about making better use of your time.
  • Two examples of how you can use TDD. The first one is a bug fix and the second one is adding a feature. The application that I'm going to show you is something that I've taken from my workshops. It's a messy calculator.
  • A characterization test is not a unit test or integration test. It's something that tells you how the system is working. The first thing I'm going to do is a bug fix. And now I know what works and what doesn't. What works still needs to work at the end.
  • So my solution only works here, and I probably need to repeat it in the other places. But I'm not just going to copy it, because we're doing the bug fix properly: one test for each case. Now that I've done this, I can refactor. It's not a major refactoring, but a lot of the time a bit of refactoring is okay as well.
  • We currently have a behavior where, if you press a number and then an operation, you see only the number. We have that for plus and div, but we don't have minus yet. So I'm going to take you through a couple of refactoring patterns that make things simpler. How do you make an arrow less of an arrow?
  • So I want to summarize what we did and the principles behind it. First, you don't want to break whatever worked until now. Second, you focus on what you want to do and where you want it. Finally, the refactoring. Write a test, fix a bug, do it over and over.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi everyone, thank you for coming to my talk today. We're going to talk about TDD, test-driven development, in legacy code. I'm Gil, I'm the one on the left, not the one reading the book, and I'll be with you for the next few minutes talking about TDD. I'm coming to you from Israel. I'm a consultant and trainer; I do all kinds of workshops and teach people (developers, testers, automation people, agile people, whatever) to write better software. I'm also the author of two books, Everyday Unit Testing and Everyday Spring Testing. The Spring one is the Java thing, and we're mostly in Python today, although there are going to be a couple of slides that talk about Java. You can get me at testingil.com or everydayunittesting.com, which is the book site, and I live on Twitter, so you can meet me there if you have any questions afterwards. Test-driven development, TDD, is a methodology for writing software, although it has "test" in its name. Basically we have three steps: red, green, refactor, we call them. Red is when we write the test and the test fails, for any reason. The second one is green, when we write just enough code to make it pass. And then we refactor, because once we have tests and everything is passing, we can actually move stuff around and refactor it. How does it look? Like I said, there's going to be a bit of Java here before we jump into the code. So we have a test. You see the squiggly lines under the Calculator class, because I haven't created it yet, so the IDE will shout, "What are you doing with that?" And that's okay, because the test can fail because it doesn't compile. Then we add the code. You can see there are no more squiggly lines. I have the calculator there, I've added the method, and it has two parameters, i and j, and this works. I can make it a bit more readable by renaming i and j to something clearer, which is first number and second number.
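The slides for this part are in Java and are not reproduced in the transcript. As a rough Python equivalent of the red, green, refactor cycle described above, here is a minimal sketch. The `Calculator.add` method and the test name are assumptions made for illustration, not the code from the slides.

```python
import unittest


class Calculator:
    """Green step: just enough code to make the test below pass."""

    def add(self, first_number, second_number):
        # Refactor step: the parameters started out as i and j and were
        # renamed for readability once the test was passing.
        return first_number + second_number


class TestCalculator(unittest.TestCase):
    def test_add_returns_sum_of_two_numbers(self):
        # Red step: this test is written first. Until Calculator and add()
        # exist, it fails (in Python with a NameError or AttributeError
        # rather than a compile error).
        self.assertEqual(Calculator().add(2, 3), 5)
```

Saved to a file, this can be run with `python -m unittest`. Each pass through the cycle adds one small test and the smallest change that makes it pass.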
And this is the refactoring bit. And I can do this over and over and over again. Could it be that easy? It could be. It usually is, unless we're talking about legacy code. Because legacy code is not like the calculator examples, right? TDD works very well on new stuff, but the examples are usually too simplistic when you look them up, and you probably look at them and say, well, yes, but it probably won't work on my code. We have a lot of dependencies in our code. That means we usually focus on the code that we're writing and mock everything else if needed, depending on what we're doing. If we have a lot of dependencies, we might need to mock a lot. Tests become crappy and coupled to the code, which is not what TDD aims for. So we say, well, we need to change the code, but we haven't written tests for the code. That means the code resists change, and we don't want to break stuff in the process. You've heard about a new development process, TDD included, and you want to introduce it because you've heard that it's good, but you don't want to break anything that you've done until now, because you put a lot of effort into that. So we don't want to change the code. We could probably do it fairly easily if we had tests to tell us when we're breaking stuff, but we don't. Still, it is a good thing to do. And this is the main thing I want you to remember from this talk. It's not the techniques, it's how this affects your time and somebody else's time. Because with everything we talk about in software processes, good software processes, whether it's TDD or test automation or design reviews or code reviews, it really doesn't matter which one: whatever we do, and other people do as well, is really about making sure that we consistently deliver code and don't waste time on other stuff.
And everything that we do when we are afraid to move code around because we might break something, that extends the time to delivery. When we read logs and read code, when we're trying to debug features and understand what the code does because it's not readable, that's about time. If the code were more readable, if we had tests around it, it would be easier to change it, fix it quicker, and get it to something that's working so we can deliver it. Also, new features: we love working on new features. But again, if this requires us not just to write new code, but to write code inside other code that works, it's going to take time. So every process that we're talking about, clean code, unit testing, integration tests (we're going to talk today about TDD), code reviews, design reviews, every process that you or other people are doing to make code better: don't think about it as making code better, think about it as making better use of your time. Because if things are working, you'll waste less time on the debugging, on reading logs, on staring at code trying to understand what it does. You'll be quicker to get over these hurdles and get to what we like: getting working code out there very quickly. So TDD is included in that; it's about respecting your time. Now, I'm going to take the principles of TDD and look at them not just as the whole cycle thing, but as principles. The first thing I want when I start with legacy code is that I don't want to break anything. With TDD, you write a test before you've written the code, because that's TDD. But before you do that, what about the code that you already have? How do you know that you haven't broken anything that worked?
Now, if you're looking at a UI or something, you probably know, but if you're looking at the code of a component and you're going to add stuff to it, how do you know that the remaining code in that component still works? Think about that. Try to answer that. Then we're going to talk about how to add tests. I'm not going to teach you unit testing in this session; I'm assuming that it's easy to pick up. In my courses I mostly spend very little time on tools and more on how you use them. Then we're going to add tests according to what we want to do, and then we're going to add code. Right? And after we TDD the code, we're going to do some refactoring. Refactoring, for the benefit of our session, is changing the way the code looks without changing its functionality. It could be changing it for the worse, but we don't do that here. That's basically refactoring. It could be renaming; it could be adding white space or blank lines to make the code readable. That's refactoring too. And it could be something a lot more complex. I'm going to show you something that can be done manually; sometimes it can be done by the IDE. For the complex ones you usually need an IDE. And that's basically it. Once you get the idea of what you're doing, you can do these steps. In regular, proper TDD, every 10 or 15 minutes you'll have something new working, because you're working in small steps. You may not have exactly this kind of cycle, but you'll have a cycle, and this cycle works, and you'll be able to trim it to focus on the small things. Getting small things working is the most important thing. With small things working all the time, you'll see that you get quicker to deliver. And this is where my personal advice comes in. TDD is the best way, and I've experienced a few, to deliver working code consistently.
Sure, you can fire off something without tests that works fairly well once. But if you're building something for a few weeks, even a few weeks, not to mention months, TDD is the best way to go, because every time you pass a test you'll be able to show something and say: I know that at this point everything I've written is working. I'm going to show you two examples of how you can use TDD. The first one is a bug fix and the second one is adding a feature. And as you probably know, this is most of our lives. The application that I'm going to show you is something that I've taken from my workshops. It's a messy calculator, so we're going to start with something that works. I know that I showed you a calculator example before. It's not going to be something like that. Our calculator has two functions, two APIs. One is pressing the keys: you pass a string in. It's not going to have a UI as messy as that, but it could. The other thing that we have is a get display, which shows what's on the screen. And that's basically it. It's not complete, and it does all kinds of things in a very messy way. So now I'm going to jump into the code, which is Python code, and see what's up. So let's end the slide show. And we're in PyCharm. Okay, so we're at the beginning, and the first thing I'm going to show you is what we have, and then I'm going to do the bug fix. Like I said, this is our calculator class; it has a press method. You see this thing here? It's called arrow code: if, else, if, something like that. It's something you've probably seen in the wild, and you know that it's very hard to add something to it or fix something in it, because you don't know what the effect would be on the other branches. So this is what we have. This is Python, so it actually fits on one screen. We also have the get display method, which returns self.display, or zero if it's empty.
And you can see that it works with plus and div, but it doesn't handle minus and multiply. It's not complete. You have a sort of calculation here, and we have something that's weird here. And the thing is, this is a good example of when somebody tells you, yeah, go fix a bug, and you're not sure how to deal with that because you don't have tests yet. So this is where we start. I'm starting with very unit-testy generated code; when I create a unit test, the IDE just generates this. Obviously, it fails, and that's it. I have code and I have a not very useful test. So what am I going to do first? We talked about a bug fix, but I'm going to do something before that. That something is called a characterization test. A characterization test is not a unit test or an integration test. It's not a type of test; it's something that tells you how the system is working. You usually do this by operating the system and seeing what comes out. And we also have all kinds of expectations of how it should work. So I'm going to show you something that I've written; the calculator stays the same to begin with. I wrote this test that creates a Calculator, and my expectation is that at the beginning it should show zero, and when I press three, it should show three. And this test works. It works according to my expectations. So I take this and multiply it, because I know that one test will not be enough here. I've written more and refactored them in this way: I'm creating the calculator once, and I've accumulated all the things that are working under the test successes. I'll show the code in a minute. At start it should show zero, pressing C should show zero, pressing three should show three. You know this from the beginning. And then pressing three, plus should show three, because this is an old calculator: when you press an operation, the display doesn't change. And when you enter number, operation, number, it shows only the last number.
And I've added these; all these things are working, which is cool. But then I found out some things are not working. Pressing plus at the beginning, I expect it to show zero, but it doesn't. And if I press zero, three, it doesn't show three, which is my expectation from all calculators. How do I know? I try it. I write these things, and you can see the code at the bottom. I also have something that I thought was weird, where I was expecting something else, and it fails. So I ran all kinds of cases through the system, and now I know what works and what doesn't, and I know that what works still needs to work at the end. How does it look? Basically these are wrapper methods using a fluent interface. I create the calculator at the start; pressing first calls reset and then calls press for every character in the string, so I can write this as a shorthand, like pressing three, plus, one, plus, four. And finally, should show does the assert. So now I have characterization tests. I know these still need to work at the end. I could add more here, but I decided to stop. Okay, so now I know how the system behaves. What are we going to do? We're going to fix a bug. The bug that we're going to fix is, I think, one of the first ones. I think it's this one, but I'll show you in a minute. All the test successes I moved into test characterization. I haven't touched them; basically that's it. I just removed the failures, because I want to see the thing that's failing in front of my eyes rather than have it obscured by all the other stuff. So this is the bug fix I'm doing. First of all, let's define it. If I'm pressing plus, oh, that's the first one. If I'm pressing plus and then zero, I have a failure. When I run it at this point, you'll see that I have a failure: invalid literal for int() with base 10. I'll show you what the problem is in a minute.
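The on-screen code is not part of the transcript, so the following is only a sketch of what characterization tests with the `pressing` and `should_show` wrappers could look like. `LegacyCalculator` is a hypothetical, heavily reduced stand-in for the messy calculator (it only knows `C`, `+` and digits), and all test names are assumptions.

```python
import unittest


class LegacyCalculator:
    """Hypothetical stand-in for the messy calculator; NOT the talk's code."""

    def __init__(self):
        self.display = ""
        self.last_argument = 0
        self.new_argument = False

    def press(self, key):
        # Deliberately nested "arrow code", as described in the talk.
        if key == "C":
            self.display = ""
            self.new_argument = False
        else:
            if key == "+":
                # Raises ValueError when the display is still empty.
                self.last_argument = int(self.display)
                self.new_argument = True
            else:
                if self.new_argument:
                    self.display = ""
                    self.new_argument = False
                self.display += key

    def get_display(self):
        return self.display if self.display else "0"


class TestCharacterization(unittest.TestCase):
    """Locks in behavior that works today, before any change is made."""

    def setUp(self):
        self.calculator = LegacyCalculator()

    def pressing(self, keys):
        # Reset first, then press every character; returning self makes
        # the call chainable (the fluent style mentioned in the talk).
        self.calculator.press("C")
        for key in keys:
            self.calculator.press(key)
        return self

    def should_show(self, expected):
        self.assertEqual(self.calculator.get_display(), expected)

    def test_at_start_should_show_0(self):
        self.should_show("0")

    def test_pressing_c_should_show_0(self):
        self.pressing("C").should_show("0")

    def test_pressing_3_should_show_3(self):
        self.pressing("3").should_show("3")

    def test_pressing_3_plus_should_show_3(self):
        self.pressing("3+").should_show("3")

    def test_pressing_3_plus_1_should_show_1(self):
        self.pressing("3+1").should_show("1")
```

Expectations that fail, such as pressing plus at start, are kept out of this class so that it stays green and can act as the safety net for the changes that follow.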
I've copied here all the pressing and should show helpers, because I want something that works first; I refactor the code later. The separation between writing code for functionality and changing the code for quality is a whole separate thing. So that's the first thing: I have a failing test. Now let's look at what's causing it. You'll see that in this line I'm trying to cast something that at the beginning is empty to an int, and that's the problem. Cool. How do I fix that? Well, now I have a failing test. What's the simplest way to pass the test? Apart from the if that was here, I added another condition: if the display is not empty, then do this. Then no casting occurs. So now my test is passing. Excellent? Not completely. Why? Because you can see that this pattern repeats, and my solution only works here. So I probably need to duplicate it in the other places. But I'm not just going to copy it. No, because we're doing the bug fix properly, which means we write three tests, one for the plus, one for the div, one for the equals, and then we can do this thing. Now you're probably asking, why is he not using an and or something like that? Because this is the functionally working part. Now that I've done this, I can refactor. If you look at the pattern here, the check and the cast are tied together. So once I had the tests working, I created a method called parse key number, which probably needs a better name, and put the check there. If the display is empty, just return zero, because this is what we want to display; otherwise we do the casting. And I put it there because there's no point in duplicating the code. It's not a major refactoring. I could do a bit more than that, but I wanted to show you that a lot of the time a bit of refactoring is okay as well. You don't need to do a major refactoring, but we'll get to that in a minute.
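To make the bug fix concrete, here is a small sketch under stated assumptions. The `Calculator` below is a hypothetical, trimmed-down stand-in in which every operation key shares one branch, and `_parse_display_number` is an illustrative name for the helper the talk calls "parse key number". The three tests are written first and fail with `ValueError` until the helper is in place.

```python
import unittest


class Calculator:
    """Hypothetical stand-in; operations only record the current number."""

    def __init__(self):
        self.display = ""
        self.last_argument = 0

    def _parse_display_number(self):
        # The fix: an empty display counts as 0 instead of reaching int(""),
        # which raises "invalid literal for int() with base 10: ''".
        if self.display == "":
            return 0
        return int(self.display)

    def press(self, key):
        if key in ("+", "/", "="):
            # Before the fix this was int(self.display), repeated per branch.
            self.last_argument = self._parse_display_number()
        else:
            self.display += key

    def get_display(self):
        return self.display if self.display else "0"


class TestOperationAtStart(unittest.TestCase):
    """One test per key, each written before the corresponding fix."""

    def setUp(self):
        self.calculator = Calculator()

    def test_pressing_plus_at_start_should_show_0(self):
        self.calculator.press("+")
        self.assertEqual(self.calculator.get_display(), "0")

    def test_pressing_div_at_start_should_show_0(self):
        self.calculator.press("/")
        self.assertEqual(self.calculator.get_display(), "0")

    def test_pressing_equals_at_start_should_show_0(self):
        self.calculator.press("=")
        self.assertEqual(self.calculator.get_display(), "0")
```

Keeping the empty-display check in one helper means a later change to the parsing rule touches a single place.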
So are we done here? Almost. Remember, I copied things from the characterization tests: the setup, pressing and should show. I copied them into the bug fix tests as well, because I wanted things to work. But do I need this duplication? No, I can refactor that duplication out. So I'm going to refactor the tests as well. I've created a base class called base calculator tests, which has pressing, should show and the setup, and then both the characterization tests and the bug fix tests inherit these things and become a lot shorter. And I've also renamed the bug fix tests to test calculator at start. Before that, if you look at the bug fix test names, I called them pressing operation at start should show zero, pressing div at start should show zero, and pressing equals at start should show zero. I can also refactor the tests' names. So I put them in a class called test calculator at start, and then I don't need at start in every one. This is also refactoring, in strings, which is very cool. It makes things a lot easier to read. We've done a bug fix. Well, that's very cool. Now let's talk about adding a feature, new functionality. We're going to start at the same point where we are. But before we do, I want to show you what we are going to add. We currently have a behavior where, if you press a number and then an operation (forgive the semicolon, it's a copy thing), you see the number only. And we have that for plus and div, but we don't have minus yet. And I want not just to add minus as a key operation; I want to make sure that the behavior of pressing an operation after a number still stays the same. So this is the case that we're trying to do. But this time I'm going to take a slightly different approach, and that is refactoring the code first, based on what we have right now.
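The test refactoring described earlier in this passage (a shared base class, and test names shortened by moving "at start" into the class name) might be sketched as follows. The class names follow the talk; the stub `Calculator` and the exact method names are assumptions for illustration only.

```python
import unittest


class Calculator:
    """Hypothetical stub: operations never change the display here."""

    def __init__(self):
        self.display = ""

    def press(self, key):
        if key == "C":
            self.display = ""
        elif key in "+/=":
            pass  # the real calculator would store the operand here
        else:
            self.display += key

    def get_display(self):
        return self.display or "0"


class BaseCalculatorTests(unittest.TestCase):
    """Shared setup and fluent helpers; contains no tests of its own."""

    def setUp(self):
        self.calculator = Calculator()

    def pressing(self, keys):
        self.calculator.press("C")
        for key in keys:
            self.calculator.press(key)
        return self

    def should_show(self, expected):
        self.assertEqual(self.calculator.get_display(), expected)


class TestCharacterization(BaseCalculatorTests):
    def test_pressing_3_should_show_3(self):
        self.pressing("3").should_show("3")


class TestCalculatorAtStart(BaseCalculatorTests):
    # "at start" now lives in the class name, so each test name is shorter.
    def test_pressing_plus_should_show_0(self):
        self.pressing("+").should_show("0")

    def test_pressing_div_should_show_0(self):
        self.pressing("/").should_show("0")

    def test_pressing_equals_should_show_0(self):
        self.pressing("=").should_show("0")
```

Because `BaseCalculatorTests` defines no `test_` methods, the unittest runner collects nothing from it directly; it only lends its helpers to the subclasses.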
With all the tests, the characterization tests and the bug fix tests that we have right now, we try to see if we can make the code better, so the risk of adding this new functionality becomes smaller. I'm going to take you through a couple of refactoring patterns that make things simpler. So we are here. We have the tests as before, and we have the calculator as before, with the parse key number method. That's basically it. First thing I'm going to do is look here. Remember I told you about the arrow code? Arrow code is when you have nested ifs and it looks like an arrow, and you don't know what's working and what is going to be broken. We don't want that. How do you make an arrow less of an arrow? One of the techniques is called adding a guard clause. You see that if I have an if, do something, else, and this is the last else in the function, I can replace it with what we call a guard clause. It looks like this: if the key matches, do something and return, because everything else is the else. I can just move everything out of the else and shift it to the left. And we already have something that's less of an arrow. You can do a lot of things with guard clauses like that. You usually do it with validation and stuff like that; I showed you an example. The next thing I'm going to refactor is, well, you look at this, the if, this else, this else and so on, and it kind of screams that you have what we call a switch case, or a match case. I'm going to use the match case from Python 3.10 to replace these ifs. Now this has to be manual. No tool will do this very smartly for you, and very smartly means guaranteeing that it works. So you have to be a bit careful. But you have the tests that you already collected, and if you're not really satisfied with the tests that you have, add more characterization tests so you can feel covered. So how does this look in 3.10? You have a match on the key and cases, like in other languages; you don't need to go through a dictionary of things. 3.10 came out in September.
I think it's very good. And the default case is here. We've done the guard clause, we've done the match case. What else? We see, at least here, a repetition; it's duplication. The only difference is the operation type, so we can extract that into a method. Let's see how it looks. I called it handle op key, and I pass in the enum of the operation that I'm doing. It's basically the same code: extract method. And look at the code that I have right now. First of all, it's less of an arrow. A lot of things are separated by functionality. And now I come to the point where, you know what, if I now TDD minus, it will just work. Am I adding the minus? No, I'm adding a test first. So that's the refactoring bit. Let me close that up; I'll close everything here so it won't confuse us. I'm going to TDD this. This one is the test I'm adding, and it obviously fails on the current calculator. So I'm adding the code: I added minus to the enum, and that's it. And because all the behavior regarding operations is already there and extracted, you just need to call it with the right parameter, and that makes the test simply work. It's like magic, really. We worked hard at this magic, but it just works. And do we refactor after that? Yes. What kind of refactoring? Well, if you look at the code of the calculator, first of all you see, well, this is already refactored, and this is not refactored yet. There's an opportunity to do that, and I'm going to do that. But before that, look at the code that we have. By extracting this code to a method, we have something that's at a higher level than this part, which is implementation details, as it was before. So our press method really talks in two different languages, and we don't want that. We want the same level of language throughout.
So extracting this and extracting this will give the method a higher, and uniform, level of readability. I'm going to extract those into functions, and that's what's left. I called one handle equals, and the rest is what handle number does. It's basically just extracting the code, nothing else. And look how simple and readable it looks right now. You know where to look for things, where to add stuff and so on. And it reads in one language, the user's language rather than the implementation language, because you don't see all the other variables staring you in the face. So those are things to remember when you're doing refactoring. That's basically all the code I wanted to show you today. I want to summarize what we did and the principles behind it, because that's the more important stuff. First of all, and this is the most important thing: you're introducing something new, and you don't want to break whatever worked until now. You know that from code; I'm talking at the process level too. So whatever you're trying to do, including TDD, you're trying to minimize the risks. In order to do that, I've introduced you to characterization tests. You lock in the behavior that works. Now, sometimes it's not that easy to write them, but sometimes it is. Sometimes it's not easy to figure out what's expected. But that's okay: run it, see what's happening, then add an assert for that. And if you feel you need more tests, add more tests; that's okay. The second thing is that you focus on what you want to do and where you want it. We talked about adding functionality. In the bug fix, we defined what we wanted. We answered that question, and only that question, at the beginning. Then we saw an opportunity to fix two more bugs; we went back, added tests and added the code. Don't jump ahead, don't storm the code. Final thing, the refactoring. I refactored the code. I've refactored the tests.
I came back and looked at it not in terms of, do I understand what it does? Because you're programmers, I'm a programmer, and eventually we will understand what it does. But does it take me a long time to understand what it does, or a short time? Do I know where to look for stuff, and do I know where to add stuff? This is the thing that's really important. Refactoring really gets you to the point where the next time I'm here, it's going to take less time than it otherwise would. And that's basically it. That's how we use TDD for legacy code. Write a test, fix a bug, do it over and over and over again. That's all I wanted to show you today. So, final words: if you have questions, I'll be on the Discord channel, and you can get back to me at gil@testingil.com or on Twitter. I have an Instagram channel, where you have all these memes going. There are the books, and testingil.com. I want to thank you for having me here today. Thank you, and see you somewhere else.

Gil Zilberfeld

Wizard of Testing @ TestinGil
