Relational databases, such as MySQL and Microsoft SQL Server, are based on tables. Each table relies on a schema, which defines what columns will be in the table, what data types they will have, and which columns will accept null values. Here's an example of a typical SQL table:
contactId
|
firstName
|
lastName
|
email
|
phone
|
city
|
state
|
10000
|
Prunella
|
Prunewhip
|
pprunewhip@fake.com
|
8005551000
|
Phoenix
|
AZ
|
10001
|
Joe
|
Schmoe
|
jschmoe@alsofake.com
|
NULL
|
NULL
|
RI
|
Note that there are seven different columns in the table. The first column in the table, contactId, is the primary key for the table. This will be a unique value; there will never be two contactIds with the same value.
With a relational database, the schema remains unchangeable, so when Joe Schmoe is added to the database without a phone or city, those places in the table need to be filled with NULL.
Tables in a relational database can connect to each other. Here is a table in the same database that shows the contact's favorite foods:
foodId
|
contactId
|
food
|
1
|
10000
|
Pizza
|
2
|
10000
|
Ice cream
|
3
|
10001
|
Sushi
|
In this table the primary key is the foodId. But notice that the contactId is present in this table. The contactId here is the same as the contactId in the first table. So we can see in this table that Prunella has two different favorite foods, pizza and ice cream, and Joe's favorite food is sushi.
When testing a relational database, you can use SQL query language to verify that the values you are looking for are present in the database. For example, if you had just added a new contact with the name of Amy Smith to the Contacts table, you could query the database to see if it had been added, like this:
select * from Contacts where lastName = 'Smith' and firstName = 'Amy'
and the query would return a table row in response:
contactId
|
firstName
|
lastName
|
email
|
phone
|
city
|
state
|
10003
|
Amy
|
Smith
|
amysmith@faketoo.com
|
8885551001
|
Boise
|
ID
|
In the above query, the asterisk * tells SQL that we want all of the columns for the record returned.
Because this is a relational database, you could also do a query with a join. A SQL join combines the data from two tables, joining on a column that they have in common.
In the example above, both columns have the contactId column. Let's say that you have given your new contact Amy a favorite food (chocolate), and you want to verify that it saved to the database correctly, but you don't know what Amy's contactId is. You can't just query the Food table for "Amy Smith" because her first and last names aren't in there. And you can't query the Contacts table for the food, because it's not in that table. But you could query the Contacts table with that information, get the contactId from that, and then use the contactId to query the Food table for the favorite food.
This is what such a query would look like:
select food from Foods
inner join on Contacts
where Foods.contactId = Contacts.contactId
and Contacts.firstName = 'Amy'
and Contacts.lastName = 'Smith'
and the query will return this response:
food
|
Chocolate
|
Let's walk through what happens in the query.
select food from Foods - this tells SQL to return just the food column from the Foods table
inner join on Contacts - this tells SQL that the query will be joining information from the Foods table with information from the Contacts table
where Foods.contactId = Contacts.contactId - this is instructing SQL to find the contactIds in the Foods table and match them up with the contactIds from the Contacts table
and Contacts.firstName = 'Amy' and Contacts.lastName = 'Smith' - these last two lines are telling SQL that we are only interested in the record with the first name Amy and the last name Smith
There are many more complicated ways to query a relational database, but with these two query types you will be able to do most of your data validation.
Be sure to check out next week's post, where I'll talk about how to test with non-relational databases!