Unit Testing in Software Development: why is it unpopular among many Developers?

Introduction

This post is motivated by my previous post titled: Car Repair vs Software Update: Regression Errors and Lack of Testing. As promised, I will now talk about Unit Testing, with particular reference to Python.

By the end of this post, you should have a better idea of the answers to the following questions:

What is Unit Testing?
Why do we do Unit Testing?
When do we do Unit Testing?
How do we implement Unit Testing?
A case study of Python Unit Testing
Why is Unit Testing unpopular among many Developers?

Although the case study is done using Python, the principles that we shall cover apply to any other programming language.

What is Unit Testing?

Unit Testing is a kind of software testing that is done on the smallest individual and independent unit of a system, i.e. the testing is done on a class, method or function in isolation.

A Unit Test is used for testing that the component is working as expected. It reduces the cost of bug-fixes since bugs are uncovered early enough in the development process. Its use can also be extended later for regression testing, in which case Unit Tests are all re-run to see whether the integrity of any unit has been broken after making some system updates.

Unit Tests are a common requirement for most projects, especially larger-scale ones. It is, therefore, important to get accustomed to using them for testing relevant system components.

An example of a unit is the following Python function which validates whether a provided string is a valid SQL statement (select or update) according to the custom rules as described in the docstring:

import re

def validate_sql(sql_text, select_or_update):
    """
    Check if a given SQL statement is valid - to avoid issues like SQL injection.
    We allow only selects and updates with a where clause and without a semicolon to pass.
    Return False if sql_text does not contain the word select or update, or the word where;
    or if it contains a semicolon, '-', drop, delete, truncate or alter.
    """
    sql = sql_text.lower()

    # the statement must contain the keyword that the caller asked for
    if select_or_update == 'select' and not re.search('select', sql):
        return False
    if select_or_update == 'update' and not re.search('update', sql):
        return False

    # a where clause is required; semicolons, hyphens and destructive keywords are rejected
    if not re.search('where', sql) or re.search(r';|-|\b(drop|delete|truncate|alter)\b', sql):
        return False

    return True

Why do we do Unit Testing?

Unit Testing is done to ensure that a software unit is executing as expected. Given a set of inputs for the unit, does the unit produce the expected output? Note that Unit Testing is not concerned with the internal implementation algorithm of the unit. It is just about the interface of the unit (i.e. the input and output).
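To make the point about interface versus implementation concrete, here is a minimal sketch. The two functions are hypothetical (has_semicolon_in and has_semicolon_regex are names made up for this illustration, not part of the code in this post). They do the same job in different ways, and one test checks both purely through inputs and outputs.

```python
import re
import unittest


def has_semicolon_in(sql_text):
    # hypothetical implementation 1: plain substring check
    return ';' in sql_text


def has_semicolon_regex(sql_text):
    # hypothetical implementation 2: regular expression search
    return re.search(';', sql_text) is not None


class InterfaceTests(unittest.TestCase):
    def test_same_inputs_same_outputs(self):
        # the test only knows inputs and expected outputs,
        # so it works unchanged for either implementation
        for func in (has_semicolon_in, has_semicolon_regex):
            with self.subTest(func=func.__name__):
                self.assertTrue(func('select 1;'))
                self.assertFalse(func('select 1'))
```

If the implementation is later swapped or refactored, the test stays as it is; only a change in behaviour makes it fail.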

When do we do Unit Testing?

We do Unit Testing during development – i.e. when we are coding the particular unit.

Traditionally, one would write the unit test after writing the function (unit), but with the advent of Test-Driven Development (TDD), we write the test first, before writing the unit. This means that the test initially fails, and keeps failing until we have written the unit correctly.
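A minimal sketch of the test-first flow, using a made-up function is_select (hypothetical, not part of the code in this post). The test is written first and fails against an empty stub, then passes once the unit is written.

```python
import io
import unittest


def is_select(sql_text):
    # step 1: only a stub exists, so the test below cannot pass yet
    raise NotImplementedError


class IsSelectTests(unittest.TestCase):
    def test_is_select(self):
        self.assertTrue(is_select('SELECT * FROM district'))
        self.assertFalse(is_select('update district set id = 1'))


def run_tests():
    # run the test case quietly and report whether everything passed
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsSelectTests)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite).wasSuccessful()


failed_first = not run_tests()  # the test does not pass against the stub


def is_select(sql_text):
    # step 2: write the unit until the test passes
    return sql_text.strip().lower().startswith('select')


passed_after = run_tests()  # the same, unchanged test now passes
```

In real work the stub and the finished function would be the same definition edited over time; they appear twice here only to show both stages in one file.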

How do we implement Unit Testing?

We shall illustrate using the validate_sql function shown earlier. We implement Unit Testing by asking ourselves the following questions:

Given a set of inputs, e.g.

sql_text = 'update district set notes="no comment" where id > 1', and
select_or_update = 'update'

Does the function give us the expected output of True, since sql_text is a valid SQL statement?

Given another set of inputs, e.g.

sql_text = 'update district set notes="no comment" where id > 1;', and
select_or_update = 'update'

Does the function give us the expected output of False, since sql_text is an invalid SQL statement because it has a semicolon?

Other examples:

sql_text = 'drop table district', and 
select_or_update = 'select'

When passed to the function, the above input returns False.

sql_text = 'select * from district where 1', and 
select_or_update = 'select'

When passed to the function, the above input returns True.

In practice, we should throw in a wide range of inputs, especially those that we think may break the unit, such as empty strings, unexpected letter case or attempts at SQL injection. This makes for a strong test.
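One way to throw in many inputs is to keep them in a table and loop over it. The sketch below assumes a condensed stand-in for validate_sql (only the select/update, where, semicolon and hyphen checks) so that the block runs on its own; the extra inputs are illustrative additions, not cases from the sections above.

```python
import re
import unittest


def validate_sql(sql_text, select_or_update):
    # condensed stand-in for the function shown earlier, for illustration only
    sql = sql_text.lower()
    if select_or_update in ('select', 'update') and select_or_update not in sql:
        return False
    return 'where' in sql and not re.search(';|-', sql)


CASES = [
    # (sql_text, select_or_update, expected)
    ('select * from district where id = 1', 'select', True),
    ('SELECT * FROM district WHERE id = 1', 'select', True),   # upper case
    ('select * from district', 'select', False),               # no where clause
    ('select * from district where id = 1; drop table district', 'select', False),
    ('select * from district where id = 1 -- comment', 'select', False),
    ('', 'select', False),                                     # empty string
    ('update district set notes = "x" where id = 1', 'select', False),  # wrong kind
]


class ValidateSqlCases(unittest.TestCase):
    def test_cases(self):
        for sql_text, kind, expected in CASES:
            # subTest reports each failing input separately
            with self.subTest(sql_text=sql_text, kind=kind):
                self.assertEqual(validate_sql(sql_text, kind), expected)
```

Adding a new problematic input then means adding one line to the table rather than writing another test method.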

A case study of Python Unit Testing

For ease of continuity, we shall stick to our example of validate_sql function above. I have also noticed that many examples of Python Unit Tests use very simple cases of addition, subtraction, multiplication, etc. These are very simple to understand, but because of that simplicity, it took me time to extend their use to more complicated cases. I believe my example is somewhere in mid-ground and offers better ability to adapt.

Suppose we want to write a Unit Test for the function validate_sql. We have already done much of the work by thinking through the test cases listed in the earlier sections, and we shall use them as they are in our tests.

Python has two main libraries for unit testing: unittest and pytest. unittest is a built-in testing framework, whereas pytest is third-party. However, pytest can also discover and run tests written with unittest, so existing unittest tests do not have to be rewritten.

We shall use unittest in this post. If opportunity allows, I may share the equivalent pytest version later.

First, we create our test file, named in the format test_*.py. The file naming convention is important for test discovery. The file has the following content, which is largely self-explanatory. Please pay most attention to the test_validate_sql method.

import unittest #import the test framework - library module named unittest
from etoolbox.common import secret_code, validate_sql #import the code to be tested
#from etoolbox import app

#define class EtoolboxTests as a subclass of unittest.TestCase
class EtoolboxTests(unittest.TestCase): #subclassing TestCase

    # mock dependencies: if a unit has an external dependency (API call, db connection, etc), you can use mock to assign some value to the dependency

    # Fixtures - setUp() and tearDown() run before and after every test method
    # to run code only once for the whole class, define setUpClass(cls) and tearDownClass(cls), each decorated with @classmethod
    def setUp(self): #run at the start of every test method
        pass
        #print('setUp')

    def tearDown(self): #run at the end of every test method
        pass
        #print('tearDown')

    #just a dummy test
    def test_sum_list(self): #a test method
        self.assertEqual(sum([1, 2, 3]), 6, "Should be 6") # test case

    #just a dummy test
    def test_sum_tuple(self): #a test method
        self.assertEqual(sum((1, 3, 2)), 6, "Should be 6") # test case

    #a real test for my function
    def test_secret_code(self): #a test method
        self.assertIsInstance(secret_code(), str, "Should be string") # test case

    #a real test for my function
    def test_validate_sql(self): #a test method
        self.assertEqual(validate_sql('update district set notes="no comment" where id > 1', 'update'), True, "Should be True") # test case
        self.assertTrue(validate_sql('update district set notes="no comment" where id > 1', 'update'), "Should be True") # test case
        self.assertFalse(validate_sql('update district set notes="no comment" where id > 1;', 'update'), "Should be False") # test case
        self.assertFalse(validate_sql('drop table district', 'select'), "Should be False") # test case
        self.assertTrue(validate_sql('select * from district where 1', 'select'), "Should be True") # test case


if __name__ == '__main__':
    unittest.main()

With the above test, your favorite IDE should be able to work magic for you by automatically running the Unit Tests every time you change or refactor code. You may consult your IDE's manual for more information on how to set up tests. I use VS Code and it is straightforward.
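If you prefer not to depend on an IDE, the same tests can be run from a terminal with python -m unittest, or from Python code. Below is a minimal sketch of the latter. SmokeTests is a made-up placeholder standing in for EtoolboxTests so that the block runs on its own, and run_suite is a helper name invented for this example.

```python
import io
import unittest


class SmokeTests(unittest.TestCase):
    # placeholder test case, standing in for EtoolboxTests
    def test_sum_list(self):
        self.assertEqual(sum([1, 2, 3]), 6)


def run_suite(test_case_class):
    # load every test_* method of the class and run it with the text runner
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    stream = io.StringIO()
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return result, stream.getvalue()


result, report = run_suite(SmokeTests)
print(report)                   # the usual unittest report, one line per test
print(result.wasSuccessful())   # True only if no test failed or errored
```

For a whole folder of tests, python -m unittest discover looks for files matching test*.py by default, which is why the file naming convention matters.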

In summary, this is how we create the test cases, although I have condensed my code above into one line for each test case.

Define the input
For each input, define the function's expected return value (or values)
Write the tests (assertions) by asserting the function call with the defined input, against the expected return value
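Written out in full rather than on one line, the three steps might look like the sketch below. It assumes a condensed stand-in for validate_sql (only the select/update, where, semicolon and hyphen checks) so that the block runs on its own, and ThreeStepTest is a name made up for this example.

```python
import re
import unittest


def validate_sql(sql_text, select_or_update):
    # condensed stand-in for the function shown earlier, for illustration only
    sql = sql_text.lower()
    if select_or_update in ('select', 'update') and select_or_update not in sql:
        return False
    return 'where' in sql and not re.search(';|-', sql)


class ThreeStepTest(unittest.TestCase):
    def test_semicolon_is_rejected(self):
        # 1. define the input
        sql_text = 'update district set notes="no comment" where id > 1;'
        select_or_update = 'update'

        # 2. define the expected return value for that input
        expected = False

        # 3. assert the function call against the expected return value
        actual = validate_sql(sql_text, select_or_update)
        self.assertEqual(actual, expected, "Should be False")
```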

Easy!

Please find time to learn more about mocking and fixtures. They are important in optimizing test implementation, and they are simple to understand. Mocking lets you replace external dependencies with stand-ins that you control, while fixtures hold the set-up and clean-up code that would otherwise be repeated in every test.
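As a starting point, here is a minimal sketch that combines the two. The function count_districts and the shape of its connection argument are assumptions made up for this illustration (they are not part of etoolbox); the mock comes from the standard unittest.mock module.

```python
import unittest
from unittest import mock


def count_districts(connection):
    # hypothetical unit with an external dependency (a database connection)
    cursor = connection.cursor()
    cursor.execute('select count(*) from district')
    return cursor.fetchone()[0]


class CountDistrictsTests(unittest.TestCase):
    def setUp(self):
        # fixture: runs before every test method, so the fake
        # connection is built in one place instead of in each test
        self.connection = mock.Mock()
        self.cursor = self.connection.cursor.return_value

    def test_returns_first_column(self):
        # mocking: the fake cursor returns a canned row, no database needed
        self.cursor.fetchone.return_value = (42,)
        self.assertEqual(count_districts(self.connection), 42)

    def test_runs_expected_query(self):
        self.cursor.fetchone.return_value = (0,)
        count_districts(self.connection)
        # the mock records how it was called, so the query can be checked
        self.cursor.execute.assert_called_once_with('select count(*) from district')
```

Because setUp runs before each test method, every test gets a fresh mock and the tests cannot influence one another.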

Why is Unit Testing unpopular among many Developers?

In my assessment and experience, many developers shun Unit Testing for various reasons. I am a practical convert who now believes that in spite of all the negative arguments about Unit Testing, one would find it really hard to cope while working with a team developing a large-scale system ... and after all, it is standard practice.

Among other benefits, testing supports a better Continuous Integration / Continuous Delivery workflow.

The key reasons for the negativity, in my view, are:

It is hard to understand and appreciate the concept of Unit Testing. This is true, until you get your hands dirty and try it out. First practice Unit Testing at home until you can take it to work. You will never understand it without the persistence to do so. As always: the proof of the pudding is in the eating.
Unit Testing wastes a lot of time, yet there are deadlines for the project. This seems true to an extent, and it is a state of mind. However, when you consider the benefits and the safety net that it provides, the time spent on writing Unit Tests is surely worth it.

As a person who is coming to terms with Unit Testing, I would recommend that a Developer seriously considers using unit testing. I know many out there who are candidates for immediate conversion. There are many hidden advantages of Unit Testing, which may only be seen when things have gone wrong – and too late at that.

Orama's Data Blog

Monday, 16 May 2022