Sunday, July 14, 2019

The Easiest MongoDB Tutorial on the Web

MongoDB is one of the most popular non-relational databases in use today.  Its versatility, speed, and scalability make it popular with applications that need to store data in a JSON-like format.  It's very easy to install MongoDB and create a database, but the query language it uses is quite different from the SQL query language.  When I first started using MongoDB, I was frustrated by the documentation I found on queries; either they tried to explain too much or the examples were too complicated.  More than once I said things in frustration like "How can I simply ask for 'select lastName from contacts where id = 3?!!'".

It is because of this frustration that I have created this tutorial.  In the tutorial, I'll be including several really simple examples of queries that you are probably used to running in SQL. 


Installing Mongo:
Because this post is really about writing queries, I'm going to skip the installation instructions, and instead refer you to MongoDB's official installation guides for macOS and Windows.  Once you have finished the installation, you should have a command window open that is running the mongo shell.

Creating a Database:
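Creating a new database is unbelievably easy.  We're going to name our new database tutorial.  To create it, simply type use tutorial.  You'll get the response switched to db tutorial.  Tada!  Your database is ready to use.  (MongoDB doesn't actually store the new database until you add the first document to it, which is the next step.)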

Adding a Document:
Of course, right now your database is completely empty.  Let's change that by typing
db.tutorial.insertOne( { _id: 1, firstName: "Prunella", lastName: "Prunewhip" } ).  
You will get a response of 
{ "acknowledged" : true, "insertedId" : 1 }
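You have just added your first document!  Note that a "document" is the equivalent of a "record" in a SQL database, and that the tutorial in db.tutorial names a collection (roughly the equivalent of a table), which here happens to share its name with the database.  Your document has an id, a first name, and a last name.  The id is stored in the _id field, which MongoDB requires on every document and uses as the unique key; if you leave it out, MongoDB generates one for you.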

Retrieving All Documents:
Let's make sure that your document was really added by asking for it.  Type 
db.tutorial.find() 
and you should get this as a result: 
{ "_id" : 1, "firstName" : "Prunella", "lastName" : "Prunewhip" }
The empty find() command will find all of the documents in the collection.  At the moment, we only have one document, so that's all that was returned.

Adding Multiple Documents:
To add several documents at the same time, use the insertMany command, like this:

db.tutorial.insertMany([ { _id: 2, firstName: "Joe", lastName: "Schmoe" }, { _id: 3, firstName: "Amy", lastName: "Smith" }, { _id: 4, firstName: "John", lastName: "Smith" }, { _id: 5, firstName: "Joe", lastName: "Bagadonuts" }, { _id: 6, firstName: "Carol", lastName: "Jones" }, { _id: 7, firstName: "Robert", lastName: "Johnson" } ])

Note that each document is wrapped in curly braces and separated from the next by a comma, and that the whole list is wrapped in square brackets.  You'll get a response like this: 
{ "acknowledged" : true, "insertedIds" : [ 2, 3, 4, 5, 6, 7 ] }
Now you have seven documents in your collection.

If you retrieve all documents at this point using db.tutorial.find(), you'll get a result like this:
{ "_id" : 1, "firstName" : "Prunella", "lastName" : "Prunewhip" }
{ "_id" : 2, "firstName" : "Joe", "lastName" : "Schmoe" }
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Smith" }
{ "_id" : 4, "firstName" : "John", "lastName" : "Smith" }
{ "_id" : 5, "firstName" : "Joe", "lastName" : "Bagadonuts" }
{ "_id" : 6, "firstName" : "Carol", "lastName" : "Jones" }
{ "_id" : 7, "firstName" : "Robert", "lastName" : "Johnson" }

Retrieving a Single Document:
To retrieve a single document, use this syntax:
db.tutorial.find( { _id: 1 } ).  
This will return the document with the id of 1: 
{ "_id" : 1, "firstName" : "Prunella", "lastName" : "Prunewhip" }

Search for All Documents With a Single Value:
The previous search on id will always return just one document, because the id is unique.  If you want to search for all documents that have a particular value, you can use 
db.tutorial.find({ lastName: "Smith"}).  
This will return all documents that have the last name Smith:
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Smith" }
{ "_id" : 4, "firstName" : "John", "lastName" : "Smith" }

Search for One Value in One Document:
Let's say you want to find the last name of the document with the id of 3.  To do this, type:
db.tutorial.find({ _id: 3}, {lastName:1, _id:0}).  
You will get this result:  
{ "lastName" : "Smith" }
The _id:0 is there to specify that you don't want the id returned in the response; returning the id in the response is a default behavior in MongoDB.

Return All the Values for a Specific Field:
If you wanted to get a list of all the last names in your database, you would use
db.tutorial.find({}, {lastName:1, _id:0}).  
This would return
{ "lastName" : "Prunewhip" }
{ "lastName" : "Schmoe" }
{ "lastName" : "Smith" }
{ "lastName" : "Smith" }
{ "lastName" : "Bagadonuts" }
{ "lastName" : "Jones" }
{ "lastName" : "Johnson" }
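If you are coming from SQL, it can help to see the same filter-then-pick-fields idea applied to a plain JavaScript array.  The sketch below is only an analogy and not how MongoDB works internally; the contacts array and the find helper are hypothetical names made up for illustration, and the helper supports only exact-match filters and a list of fields to include.

```javascript
// Hypothetical in-memory stand-in for part of the tutorial collection.
const contacts = [
  { _id: 1, firstName: "Prunella", lastName: "Prunewhip" },
  { _id: 2, firstName: "Joe", lastName: "Schmoe" },
  { _id: 3, firstName: "Amy", lastName: "Smith" },
  { _id: 4, firstName: "John", lastName: "Smith" },
];

// Illustrative helper (not a MongoDB API): keep the documents whose fields
// equal the filter values, then copy only the requested fields.
function find(docs, filter = {}, fields = null) {
  const matched = docs.filter((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
  if (fields === null) return matched;
  return matched.map((doc) =>
    Object.fromEntries(fields.map((name) => [name, doc[name]]))
  );
}

// Like: select lastName from contacts where id = 3
const oneLastName = find(contacts, { _id: 3 }, ["lastName"]);

// Like: select lastName from contacts
const allLastNames = find(contacts, {}, ["lastName"]);

// Like: select distinct lastName from contacts
const distinctLastNames = [...new Set(allLastNames.map((d) => d.lastName))];

console.log(oneLastName);
console.log(distinctLastNames);
```

In the shell, db.tutorial.distinct("lastName") is the counterpart of that last step: it returns each last name only once.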

Search with "Starts With":
MongoDB supports regular expressions (regex) for pattern matching on field values.  To search for all the documents that have last names that begin with S, you'd do this search:
db.tutorial.find({ lastName: /^S/}).  
This will return 
{ "_id" : 2, "firstName" : "Joe", "lastName" : "Schmoe" }
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Smith" }
{ "_id" : 4, "firstName" : "John", "lastName" : "Smith" }
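The /^S/ pattern is written the same way as a JavaScript regular expression literal, so you can experiment with it away from the database.  The sketch below uses a hypothetical lastNames array; MongoDB's regex engine is not identical to JavaScript's, but for simple anchors like these I would expect the same matches.

```javascript
// Hypothetical list holding the last names from the tutorial documents.
const lastNames = [
  "Prunewhip", "Schmoe", "Smith", "Smith", "Bagadonuts", "Jones", "Johnson",
];

// ^ anchors the pattern to the start of the string.
const startsWithS = lastNames.filter((name) => /^S/.test(name));

// The pattern is case-sensitive, so a lowercase "s" finds nothing here...
const startsWithLowerS = lastNames.filter((name) => /^s/.test(name));

// ...unless the i (ignore case) flag is added.
const startsWithSAnyCase = lastNames.filter((name) => /^s/i.test(name));

// $ anchors the pattern to the end of the string ("ends with").
const endsWithSon = lastNames.filter((name) => /son$/.test(name));

console.log(startsWithS);
console.log(endsWithSon);
```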

Search with "And":
If you wanted to search for a document that had a specific first name AND a specific last name, you'd search like this:
db.tutorial.find( {$and: [{ lastName: "Smith" },{ firstName: "Amy"}]} )
which would return 
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Smith" }.

Search with "In":
To search for all the documents that have either the last name Smith or the last name Jones, you'd use:
db.tutorial.find({lastName :{$in :["Smith","Jones"]}}).  
This will return
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Smith" }
{ "_id" : 4, "firstName" : "John", "lastName" : "Smith" }
{ "_id" : 6, "firstName" : "Carol", "lastName" : "Jones" }
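To make the $and and $in logic concrete, here is a minimal matcher written in plain JavaScript.  The matches function is hypothetical and handles only exact values, $in, and $and; it is meant to illustrate the logic, not to reproduce MongoDB's query engine.

```javascript
// Hypothetical in-memory copy of the tutorial documents.
const contacts = [
  { _id: 1, firstName: "Prunella", lastName: "Prunewhip" },
  { _id: 2, firstName: "Joe", lastName: "Schmoe" },
  { _id: 3, firstName: "Amy", lastName: "Smith" },
  { _id: 4, firstName: "John", lastName: "Smith" },
  { _id: 5, firstName: "Joe", lastName: "Bagadonuts" },
  { _id: 6, firstName: "Carol", lastName: "Jones" },
  { _id: 7, firstName: "Robert", lastName: "Johnson" },
];

// Illustrative matcher: every entry in the filter must hold for the document.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$and") {
      // Every sub-filter in the array must match.
      return condition.every((sub) => matches(doc, sub));
    }
    if (condition !== null && typeof condition === "object" && "$in" in condition) {
      // The field may equal any one of the listed values.
      return condition.$in.includes(doc[key]);
    }
    return doc[key] === condition;
  });
}

const idsFor = (filter) =>
  contacts.filter((doc) => matches(doc, filter)).map((doc) => doc._id);

const explicitAnd = idsFor({ $and: [{ lastName: "Smith" }, { firstName: "Amy" }] });
const implicitAnd = idsFor({ lastName: "Smith", firstName: "Amy" });
const smithOrJones = idsFor({ lastName: { $in: ["Smith", "Jones"] } });

console.log(explicitAnd, implicitAnd, smithOrJones);
```

MongoDB itself also treats a filter that lists several fields, such as { lastName: "Smith", firstName: "Amy" }, as an implicit AND, so the explicit $and is optional in a simple case like this one.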

Update a Document:
If you'd like to change an existing document, you can use the updateOne command.  For example, to change the last name of the third document from Smith to Jones, you would type:
db.tutorial.updateOne({_id: 3 }, {$set: {lastName: "Jones"}}).  
You'll get this response: 
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }.

To verify that the record was updated correctly, you can use db.tutorial.find( { _id: 3 } ), which will return { "_id" : 3, "firstName" : "Amy", "lastName" : "Jones" }.
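The matchedCount and modifiedCount values in the response mean different things, which matters if you assert on them in a test.  The sketch below imitates that bookkeeping in plain JavaScript; the updateOne function here is a hypothetical local helper, not the MongoDB method, and it supports only exact-match filters and an object of new field values.

```javascript
// Hypothetical in-memory documents.
const contacts = [
  { _id: 3, firstName: "Amy", lastName: "Smith" },
  { _id: 4, firstName: "John", lastName: "Smith" },
];

// Illustrative helper: update the first document that matches the filter.
function updateOne(docs, filter, newValues) {
  const doc = docs.find((d) =>
    Object.entries(filter).every(([key, value]) => d[key] === value)
  );
  if (!doc) return { matchedCount: 0, modifiedCount: 0 };

  // Work out whether anything will actually change before applying the values.
  const changed = Object.entries(newValues).some(([k, v]) => doc[k] !== v);

  // Only the named fields are touched; firstName and _id are left alone.
  Object.assign(doc, newValues);
  return { matchedCount: 1, modifiedCount: changed ? 1 : 0 };
}

const firstRun = updateOne(contacts, { _id: 3 }, { lastName: "Jones" });
const secondRun = updateOne(contacts, { _id: 3 }, { lastName: "Jones" });
const noMatch = updateOne(contacts, { _id: 99 }, { lastName: "Jones" });

console.log(firstRun, secondRun, noMatch);
```

MongoDB reports its counts in the same spirit: repeating an update that changes nothing gives a matchedCount of 1 and a modifiedCount of 0, and a filter that finds no document gives 0 for both.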

Delete a Document:
Finally, there may be times when you want to delete a document.  This can be done with
db.tutorial.deleteOne({_id: 4 })
which will return a response of 
{ "acknowledged" : true, "deletedCount" : 1 }.

To verify that the document has been deleted, you can run db.tutorial.find() and get this response:
{ "_id" : 1, "firstName" : "Prunella", "lastName" : "Prunewhip" }
{ "_id" : 2, "firstName" : "Joe", "lastName" : "Schmoe" }
{ "_id" : 3, "firstName" : "Amy", "lastName" : "Jones" }
{ "_id" : 5, "firstName" : "Joe", "lastName" : "Bagadonuts" }
{ "_id" : 6, "firstName" : "Carol", "lastName" : "Jones" }
{ "_id" : 7, "firstName" : "Robert", "lastName" : "Johnson" }
and you can see that the document with the id of 4 is no longer in the database.

This is by no means a complete record of everything that you can do with MongoDB, but it should be enough to get you started.  You can also refer to last week's post to get a few examples of interacting with nested values in MongoDB.  I hope that you will find today's post helpful in understanding how Mongo works, and that you will use it as a reference whenever you need it.  Happy querying!

Saturday, July 6, 2019

Testing With Non-Relational Databases

Last week, I took a look at ways to query relational databases for testing.  This week I'm going to look at non-relational databases, describe how they are different from relational databases, and discuss how to query them in your testing.  Non-relational databases, such as MongoDB and DynamoDB, are sometimes called "NoSQL" databases, and are becoming increasingly popular in software applications.



The main difference between relational and non-relational databases is that relational databases use tables to store their data, whereas non-relational databases use documents.  The documents are often in JSON format.  Let's take a look at what the records in the Contacts table from last week's post would look like if they were in a non-relational database:


{
              contactId: "10000",
              firstName: "Prunella",
              lastName: "Prunewhip",
              email: "pprunewhip@fake.com",
              phone: "8005551000",
              city: "Phoenix",
              state: "AZ"
}
{
              contactId: "10001",
              firstName: "Joe",
              lastName: "Schmoe",
              email: "jschmoe@alsofake.com",
              state: "RI"
}

Note that Joe does not have a value for phone or city entered, so they are not included in his document.  This is different from relational databases, which are required to include a value for every field. Instead of having a NULL value for phone and city as Joe's record did in the SQL table, those fields are simply not listed.

Another key difference between relational and non-relational databases is that it's possible to add a new field to one document without adding it to every other document in the collection.  Let's imagine that we are adding a new document to the collection, and we want that document to include a spouse's name.  When that document is added, it will look like this:

{
              contactId: "10002",
              firstName: "Amy",
              lastName: "Smith",
              email: "amysmith@faketoo.com",
              phone: "8885551001",
              city: "Boise",
              state: "ID",
              spouse: "John"
}

The original documents, 10000 and 10001, don't need to have this spouse value.  In a relational database if a new field is added, the entire schema of the table needs to be altered, and Prunella and Joe will need to have spouse values or NULL entered in for those fields.
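For testing, this flexibility means a field may be missing altogether rather than present with a NULL value, so your checks need to allow for that.  Here is a minimal JavaScript sketch that uses the three documents above as a plain array; the variable and function names are illustrative only.

```javascript
// The three example documents, held in a hypothetical in-memory array.
const contacts = [
  { contactId: "10000", firstName: "Prunella", lastName: "Prunewhip",
    email: "pprunewhip@fake.com", phone: "8005551000", city: "Phoenix", state: "AZ" },
  { contactId: "10001", firstName: "Joe", lastName: "Schmoe",
    email: "jschmoe@alsofake.com", state: "RI" },
  { contactId: "10002", firstName: "Amy", lastName: "Smith",
    email: "amysmith@faketoo.com", phone: "8885551001", city: "Boise", state: "ID",
    spouse: "John" },
];

// True when the document actually carries the field, whatever its value.
const hasField = (doc, name) => name in doc;

const withSpouse = contacts
  .filter((c) => hasField(c, "spouse"))
  .map((c) => c.firstName);

const withoutPhone = contacts
  .filter((c) => !hasField(c, "phone"))
  .map((c) => c.firstName);

console.log(withSpouse);
console.log(withoutPhone);
```

In the MongoDB shell, the $exists operator plays the same role, for example db.contacts.find( { spouse: { $exists: true } } ).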

With a non-relational database, you generally don't do joins on table data as you saw in last week's post (MongoDB does offer a $lookup aggregation stage for join-like queries, but that is beyond the scope of this post).  Each record should be treated as its own separate document, and you can do queries to retrieve the documents you want.  What that query language looks like depends on the type of the database used.  The examples below are using MongoDB's query language, which is JavaScript-based, and are querying on the documents listed above:

db.contacts.find() - this will return all the contacts in the collection
db.contacts.find( { contactId: "10001" } ) - this will return the document for Joe Schmoe

To make the responses easier to read, you can append .pretty() to the command, which will spread each returned document across several indented lines rather than printing it on a single line.

You can also run a query to return a single field for each document:

db.contacts.find({}, {firstName:1, _id:0}) - this will return just the first name for each contact

Because the documents in a non-relational database have a JSON-like structure, it's possible to have documents with arrays.  For example, our Contacts collection could have a document that lists the contact's favorite foods:

{
              contactId: "10000",
              firstName: "Prunella",
              lastName: "Prunewhip",
              email: "pprunewhip@fake.com",
              phone: "8005551000",
              city: "Phoenix",
              state: "AZ",
              foods: [ "pizza", "ice cream" ]
}

It's even possible to have objects within arrays, as follows:

{
              contactId: "10001",
              firstName: "Joe",
              lastName: "Schmoe",
              email: "jschmoe@alsofake.com",
              state: "RI",
              pets: [ { type: "dog", name: "fido" }, { type: "cat", name: "fluffy" } ]
}

You can see how this type of data storage might be advantageous for your application's data.  Nesting data in this fashion makes it easier to read at a glance than it would be in a relational database, where the pets might be in their own separate table.

To run a query that will return all the contacts that have cats, you would simply request:

db.contacts.find( {"pets.type":"cat"} )

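To run a query that will return all the contacts that have a cat named fluffy, you would request:

db.contacts.find( { pets: { $elemMatch: { type: "cat", name: "fluffy" } } } )

The $elemMatch operator requires a single pet to match both conditions.  Combining "pets.type" and "pets.name" with $and would also return a contact who has a cat and a different pet named fluffy.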
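There is a subtlety with arrays of objects that is worth knowing when you test: conditions on dotted fields such as "pets.type" and "pets.name" are each checked against any element of the array, so two different pets can satisfy them between them, whereas MongoDB's $elemMatch operator requires one element to satisfy every condition.  The plain-JavaScript sketch below shows the difference; the second contact is a hypothetical addition made up to expose it.

```javascript
const contacts = [
  { firstName: "Joe",
    pets: [ { type: "dog", name: "fido" }, { type: "cat", name: "fluffy" } ] },
  // Hypothetical contact: the dog is named fluffy, the cat is not.
  { firstName: "Amy",
    pets: [ { type: "dog", name: "fluffy" }, { type: "cat", name: "tom" } ] },
];

// Dotted-field style: each condition may be met by a different pet.
const anyCatAndAnyFluffy = contacts
  .filter((c) =>
    c.pets.some((p) => p.type === "cat") &&
    c.pets.some((p) => p.name === "fluffy"))
  .map((c) => c.firstName);

// $elemMatch style: one and the same pet must meet both conditions.
const catNamedFluffy = contacts
  .filter((c) => c.pets.some((p) => p.type === "cat" && p.name === "fluffy"))
  .map((c) => c.firstName);

console.log(anyCatAndAnyFluffy);
console.log(catNamedFluffy);
```

With only the two documents from this post the two styles return the same contacts, which is exactly why this kind of bug tends to slip through testing until the data gets more varied.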

These are just a few simple examples of how to query data with a non-relational database, and they should be enough to get you started in your testing.  To learn more, be sure to read the documentation for the type of database you are using.  As non-relational databases become increasingly popular, this knowledge will be extremely useful.  


Saturday, June 29, 2019

Testing With Relational Databases

In last week's post, I discussed various ways to test your application's database.  In order to verify that your data has been saved correctly, you'll need to query the database, and the way to query the database will depend on what type of database you have.  In the past, most databases were relational, but in recent years there has been a trend towards using non-relational databases.  In this week's post, I'll address relational databases, and in next week's post, I'll talk about non-relational databases.



Relational databases, such as MySQL and Microsoft SQL Server, are based on tables.  Each table relies on a schema, which defines what columns will be in the table, what data types they will have, and which columns will accept null values.  Here's an example of a typical SQL table:


contactId  firstName  lastName   email                 phone       city     state
10000      Prunella   Prunewhip  pprunewhip@fake.com   8005551000  Phoenix  AZ
10001      Joe        Schmoe     jschmoe@alsofake.com  NULL        NULL     RI

Note that there are seven different columns in the table.  The first column in the table, contactId, is the primary key for the table. This will be a unique value; there will never be two contactIds with the same value. 

With a relational database, the schema applies to every row in the table, so when Joe Schmoe is added to the database without a phone or city, those places in the table need to be filled with NULL.

Tables in a relational database can connect to each other.  Here is a table in the same database that shows the contact's favorite foods:


foodId  contactId  food
1       10000      Pizza
2       10000      Ice cream
3       10001      Sushi


In this table the primary key is the foodId.  But notice that the contactId is present in this table.  The contactId here is the same as the contactId in the first table.  So we can see in this table that Prunella has two different favorite foods, pizza and ice cream, and Joe's favorite food is sushi.

When testing a relational database, you can use SQL query language to verify that the values you are looking for are present in the database.  For example, if you had just added a new contact with the name of Amy Smith to the Contacts table, you could query the database to see if it had been added, like this:

select * from Contacts where lastName = 'Smith' and firstName = 'Amy'

and the query would return a table row in response:


contactId  firstName  lastName  email                 phone       city   state
10003      Amy        Smith     amysmith@faketoo.com  8885551001  Boise  ID

In the above query, the asterisk * tells SQL that we want all of the columns for the record returned.

Because this is a relational database, you could also do a query with a join.  A SQL join combines the data from two tables, joining on a column that they have in common.  

In the example above, both tables have the contactId column.  Let's say that you have given your new contact Amy a favorite food (chocolate), and you want to verify that it saved to the database correctly, but you don't know what Amy's contactId is.  You can't just query the Foods table for "Amy Smith" because her first and last names aren't in there.  And you can't query the Contacts table for the food, because it's not in that table.  But you could query the Contacts table for Amy's name, get the contactId from that, and then use the contactId to query the Foods table for the favorite food.  A join does both steps in a single query.

This is what such a query would look like:

select food from Foods 
inner join Contacts 
on Foods.contactId = Contacts.contactId 
where Contacts.firstName = 'Amy'
and Contacts.lastName = 'Smith' 

and the query will return this response:

food
Chocolate

Let's walk through what happens in the query.
select food from Foods - this tells SQL to return just the food column from the Foods table
inner join Contacts - this tells SQL that the query will be joining information from the Foods table with information from the Contacts table
on Foods.contactId = Contacts.contactId - this is instructing SQL to find the contactIds in the Foods table and match them up with the contactIds from the Contacts table
where Contacts.firstName = 'Amy' and Contacts.lastName = 'Smith' - these last two lines are telling SQL that we are only interested in the record with the first name Amy and the last name Smith 
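If joins are new to you, it may help to see the same matching done by hand.  The JavaScript sketch below treats the two tables as arrays and pairs up rows that share a contactId.  The arrays, the innerJoin helper, and the foodId of 4 on Amy's row are assumptions made for illustration; a real database does this work far more efficiently.

```javascript
// Hypothetical in-memory copies of the two tables (only the columns we need).
const contactsTable = [
  { contactId: 10000, firstName: "Prunella", lastName: "Prunewhip" },
  { contactId: 10001, firstName: "Joe", lastName: "Schmoe" },
  { contactId: 10003, firstName: "Amy", lastName: "Smith" },
];

const foodsTable = [
  { foodId: 1, contactId: 10000, food: "Pizza" },
  { foodId: 2, contactId: 10000, food: "Ice cream" },
  { foodId: 3, contactId: 10001, food: "Sushi" },
  { foodId: 4, contactId: 10003, food: "Chocolate" }, // assumed row for Amy
];

// Inner join: emit one combined row for every pair of rows that share the key.
function innerJoin(left, right, key) {
  const rows = [];
  for (const l of left) {
    for (const r of right) {
      if (l[key] === r[key]) rows.push({ ...l, ...r });
    }
  }
  return rows;
}

const joined = innerJoin(foodsTable, contactsTable, "contactId");

// The where clause then narrows the joined rows down to Amy Smith.
const amyFoods = joined
  .filter((row) => row.firstName === "Amy" && row.lastName === "Smith")
  .map((row) => row.food);

console.log(amyFoods);
```

Because it is an inner join, a contact with no row in the Foods table would simply not appear in the joined result.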

There are many more complicated ways to query a relational database, but with these two query types you will be able to do most of your data validation.  

Be sure to check out next week's post, where I'll talk about how to test with non-relational databases!

