I have mentioned in previous posts that I've been testing a file system. The metadata used to access the files is stored in a non-relational database. As I described in an earlier post, non-relational databases store their data in document form rather than in the table form found in SQL databases.
Several months ago, my team made a change to the metadata for our files. After deploying the change, we discovered that older files could no longer be downloaded. It turned out that the new code didn't recognize the older files, because their metadata had a different structure. The bug was fixed so that the change was backwards-compatible with the older files.
I added a new test to our smoke test suite that would request a file with the old metadata. Now, I thought, if a change was ever made that would affect that area, the test would fail and the problem would be detected.
A few weeks ago, my team made another change to the metadata. The code was deployed to the test environment, and shortly afterwards, someone discovered that there were files that couldn't be downloaded anymore.
I was perplexed! Didn't we already have a test for this? When I met with the developer who investigated the bug, I found out that there was an even older version of the metadata that we hadn't accounted for.
Talking this over with the developers on my team, I learned about a big difference between relational (SQL) databases and non-relational databases. When a schema change is made to a relational database, it applies to every record in the table. For example, if you had a table with first names and last names, and someone updated the table to also contain middle names, every existing record would then have a null value for the middle name.
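Here's a minimal sketch of that relational behavior, using Python's built-in `sqlite3` module with an in-memory database. The table name, the column names, and the sample row are made up for illustration. Whether a database physically rewrites each row differs from engine to engine, but the effect for anyone querying the table is the same: the old row comes back with the new column set to null.

```python
import sqlite3

# In-memory database; the table, columns, and row are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO people VALUES ('Prunella', 'Smith')")

# The schema change: add a middle_name column to the existing table.
conn.execute("ALTER TABLE people ADD COLUMN middle_name TEXT")

# The row that existed before the change now has the column, holding NULL.
rows = conn.execute(
    "SELECT first_name, middle_name, last_name FROM people"
).fetchall()
print(rows)  # [('Prunella', None, 'Smith')]
```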
With non-relational databases, this is different. Because each entry is its own document, and a change to the structure doesn't touch the documents that already exist, it's possible to create situations where a name-value pair simply doesn't exist at all. To use the same example, in a non-relational database, a record for someone named Prunella that was created before the change wouldn't have a "MiddleName" name-value pair.
If the code relies on retrieving the value for MiddleName, that code would throw an exception, because there'd literally be nothing to retrieve.
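To make this concrete, here's a small sketch that uses plain Python dictionaries as stand-ins for documents. The sample data and both function names are hypothetical, and a real document database would be accessed through its own client library, but the missing-key behavior is the same idea.

```python
# Two documents as they might come back from a document store (made-up data).
old_doc = {"FirstName": "Prunella", "LastName": "Smith"}
new_doc = {"FirstName": "Amy", "MiddleName": "Jo", "LastName": "Jones"}


def middle_name_strict(doc):
    # Assumes the pair always exists; raises KeyError on older documents.
    return doc["MiddleName"]


def middle_name_tolerant(doc):
    # Returns None when the pair is absent, so older documents still work.
    return doc.get("MiddleName")


try:
    middle_name_strict(old_doc)
    strict_result = "ok"
except KeyError:
    strict_result = "KeyError"

print(strict_result)                  # KeyError
print(middle_name_tolerant(old_doc))  # None
print(middle_name_tolerant(new_doc))  # Jo
```

The tolerant version is only one way to handle it. Depending on the application, it might be better to backfill the older documents or to fail with a clearer error message.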
The lesson I learned from this situation is that when we are using non-relational databases, it's important to keep a record of every data structure that has been used over time. This way, whenever a change is made, we can test with data that uses each of the old structures as well as the new one.
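One way this record could look is a small catalog with one sample document per structure, plus a check that runs the code under test against every sample. This is only a sketch: the version labels, the field names, and the `resolve_path` function are all invented for illustration and aren't my team's actual metadata or code.

```python
# Hypothetical catalog: one sample document for each metadata structure
# that has ever been deployed. Field names are invented for illustration.
METADATA_SAMPLES = {
    "v1": {"id": "file-1", "path": "/files/a.txt"},
    "v2": {"id": "file-2", "path": "/files/b.txt", "owner": "qa"},
    "v3": {"id": "file-3", "location": {"path": "/files/c.txt"}, "owner": "qa"},
}


def resolve_path(metadata):
    """Stand-in for the code under test: find the file path in any version."""
    if "location" in metadata:
        return metadata["location"]["path"]
    return metadata["path"]


def check_all_versions():
    """Return the versions whose sample document can't be resolved."""
    failures = []
    for version, sample in METADATA_SAMPLES.items():
        try:
            resolve_path(sample)
        except (KeyError, TypeError):
            failures.append(version)
    return failures


print(check_all_versions())  # []
```

Whenever the structure changes, a sample of the new structure gets added to the catalog and the old samples stay, so a version that everyone has forgotten about still gets exercised.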
And this lesson is applicable to situations other than non-relational databases! There may be other times where an expected result changes after the application changes. Here are some examples:
- A customer listing for an e-commerce site used to display phone numbers; now it's been decided that phone numbers won't be displayed on the page
- A patient portal for a doctor's office used to display social security numbers in plain text; now the digits are masked
- A job application workflow used to take the applicant to a popup window to add a cover letter; now the cover letter is added directly on the page and the popup window has been eliminated
In all these situations, it may be useful to remember how the application used to behave in case you have users who are using an old version, or in case there's an unknown dependency on the old behavior that now results in a bug, or in case a new product owner asks why a feature is behaving in the new way.
So moving forward, I plan to document the old behavior of my applications. I think my future self will be appreciative!