32

What’s a database? If you think of word processors as… well… processors of words, and spreadsheets as number processors, then you can think of databases as processors of unstructured information, aka “data”. Feed a database data in any sort of guise – as numbers, text, dates, images, web links – and it will digest all that data and help you turn it into usable information.

A multi-file database for managing fundraising campaigns for a non-profit organisation. It consists of eight tables (listed in the top of the pane on the left)

Another way to help distinguish a database from a spreadsheet (and there are, in fact, many times when you could just as easily use one as the other) is in terms of what sort of information you can get out of it. A spreadsheet lets you posit questions such as “Can I afford the monthly payments on that $390,000 house?” or “What’s the average daily rainfall for the first six months of this year?” or “If we increase our sales of doohickies by four percent while reducing the price by a dollar a piece, how will it affect profits?” Spreadsheets excel at answering financial, numerical and statistical questions such as these.

You can ask quite different questions of a database. For example, “What are the phone numbers and addresses of the five nearest post offices to the school?”, “Do we have any books in our library that deal with irradiated food? If so, on which shelves are they located?”, “Show me the personnel records and sales figures of our five best-performing sales people for the current quarter, but don’t bother me with their address details.”

As you can see, databases deal with a broader scope of information than spreadsheets. The chief job of a database is to provide answers to questions. In fact, one of the major components of most databases is a query tool – a special component which lets you formulate questions (‘queries’) that will elicit useful answers from all your stored data.

There are databases and then there are databases…

This series of tutorials is aimed at average computer users. If you’re looking for an academically rigourous approach to database theory and design, you won’t find it here. What you will find is practical guidance for the average computer user interested in databases in general or in designing a database for a small business, a community group or an everyday home application, such as managing a music collection. If that sounds like you, read on!

Database building blocks

Of course, before you can ask any questions you need to build a structure to hold your data, and you need a way of adding data to that structure. So, most databases come with multiple components: a table designer for structuring the data; a forms designer for building data entry forms; a query engine for asking questions of that data; and a report builder for printing the results of a query.

It doesn’t stop there. Many databases also provide a way to tie together and automate procedures so you can build entire applications that offer users push-button ease (well, that’s the goal, anyway). Some database programs, such as Microsoft Access, include fully fledged programming languages for building applications. Others, such as FileMaker Pro, have beefed up macros or scripting languages that are easier to use but not quite as flexible as a programming language.

Think ahead

The most important thing to do when you start creating a database is to think ahead. Before you even switch on the computer, think about the type of information you have to work with and the types of questions you’ll want your database to answer.

For instance, say you want to create a membership list for your organisation. You’ll want to store the name, address and phone number of each member, plus other details such as the year they first joined, type of membership, whether they’re up to date with their membership dues, and so on. That gives us the following structure:

  • Name
  • Address
  • Phone
  • Year Joined
  • Membership Type
  • Dues owed

That seems clear enough. But what happens if you decide to print a membership list sorted alphabetically by family name? After all, that’s a very common requirement. There’s no easy way to do it with our current structure because names are stored as “Jane Doe”, “Gary Noble”, “Linda K. Brackman” and so on and sorting them alphabetically will give you a list sorted by the first name. So it makes sense to break the Name into its component parts: Given Name, Middle Initial, Family Name; once you do that, it’s a simple matter to sort your list by Family Name.

The same goes for the address. It, too, can be usefully broken down into Street Address, Suburb, State and Postal Code. That will enable you to print off envelopes sorted by postcode/zip and help you qualify for pre-sorted, bulk mail rates.

Payoffs for planning

As you can see, even in a seemingly simple database such as this, a little forethought can make a huge difference. Imagine if you’d used the original database structure above, entered the details for all 1700 of your members, and only then realised you need an alphabetical list sorted by family name. You’d have to add the new fields and then either re-enter all the affected data or use some advanced techniques – which, as a beginner, you’re unlikely to know – to reorganise the existing data.

So before you dive into creating your first database, get out the old pencil and paper and do some hard thinking about what you want to achieve.

In later articles in this series, you’ll have a chance to build some databases from scratch, starting with simple, flat-file tables and working up to powerful relational databases. As you can tell from that last sentence, you’ll probably need the guide to database-speak below, so keep it handy. If you want a more thorough grounding in database terminology, you can always check out the more expansive Database Dictionary.

Database dialect

Here’s a quick guide to some of the more common database terms.

database: A collection of related information stored in a structured format. The term ‘database’ is sometimes used interchangeably with ‘table’, but technically they’re different: a table is a single store of related information; a database, on the other hand, may consist of one or more tables of information that are related in some way. For instance, you could track all the information about  students in a school in a students table. If you then created separate tables containing details about teachers, classes and classrooms, you could combine all four tables into a timetabling database. Such a multi-table database is called a relational database.

data entry: The process of getting information into a database. Data entry is usually done by people typing the data into the computer by way of data-entry forms designed to simplify the process.

dbms: Database management system. A program which lets you manage information in databases. Alpha 5, Microsoft Access and FileMaker Pro, for example, are all DBMSs, although the term is often shortened to ‘database’. So, the same term is used to apply to the program you use to organise your data and the actual data structure you create with that program.

field: Fields describe a single aspect of each member of a table. A student record, for instance, might contain a last name field, a first name field, a date of birth field and so on. Each record in a table has exactly the same structure, so each contains the same fields. While the field structure is the same, the values in each field vary from record to record, of course.

flat file: A database that consists of a single table. Lightweight database programs such as the database component in Microsoft Works are sometimes called ‘flat-file managers’ (or list managers) because they can only handle single-table databases. More powerful programs, such as Access, FileMaker Pro and Alpha 5, can handle multi-table databases, and are called relational database managers, or RDBMSs.

index: A summary table which lets you quickly locate a particular record or group of records in a table. Think of how you use an index in a book: as a quick jumping off point for finding full information about a subject. A database index works in a similar way. You can create an index on any field in a table. Say, for example, you have a customer table which contains customer numbers, names, addresses and other details. You can make indexes based on any information, such as the customer number,the  last name + first name (a composite index based on more than one field), or the postal code. Then, when you’re searching for a particular customer or group of customers, you can use the index to speed up the search.

key field: You can sort and quickly retrieve information from a database by choosing one or more fields to act as keys. For instance, in a students table you could use a combination of the last name and first name fields as a key field. The database program will create an index containing just the key field contents. Using the index, you can quickly find any record by typing in the student’s name. The database will locate the correct entry in the index and then display the full record containing all the student’s details.

primary key: A field that uniquely identifies a record in a table. In a students table, a key built from last name + first name might not give you a unique identifier (two or more Jane Does in the school, for example). To uniquely identify each student, you might add a special Student ID field to be used as the primary key.

record: A record contains all the information about a single ‘member’ of a table. In our students table, each student’s details (name, date of birth, contact details, and so on) will be contained in its own record.

relational database: A database consisting of more than one table. In a multi-table database, you not only need to define the structure of each table, you also need to define the relationships between each table in order to link those tables correctly.

table: A single store of related information. A table consists of records, and each record is made up of a number of fields. You can think of the phone book as a table: It contains a record for each telephone subscriber, and each subscriber’s details are contained in three fields – name, address and telephone.

Read the next article in this series: Databasics II: Creating your first database

  • Martin

    This is a nice, clear intro article. I particularly like the mini-glossary. But I have a simple question that I can’t seem find an answer to anywhere. Please forgive me – I’m a beginner.

    Everywhere says a primary key must be unique. That’s ok, but let’s say I have two tables – customers and employees – and each has a primary key, customer_id and employee_id.
    The values of each must be unique in their own tables, but could I have a cutomer_id value of 1000 and also an employee_id value of 1000 becuase they are different keys? or do the values of primary keys actually have to be truly unique – ie that number cannot appear in any other field in the database?? Or at least in any other primary key field in the database??

  • Martin

    This is a nice, clear intro article. I particularly like the mini-glossary. But I have a simple question that I can’t seem find an answer to anywhere. Please forgive me – I’m a beginner.

    Everywhere says a primary key must be unique. That’s ok, but let’s say I have two tables – customers and employees – and each has a primary key, customer_id and employee_id.
    The values of each must be unique in their own tables, but could I have a cutomer_id value of 1000 and also an employee_id value of 1000 becuase they are different keys? or do the values of primary keys actually have to be truly unique – ie that number cannot appear in any other field in the database?? Or at least in any other primary key field in the database??

  • Hi Martin,

    Yes, you can have a customer_id value of 1000 and an employee_id value of 1000. You simply need a unique primary key value for each record in a particular table. Even if you create a relationship between the Customers table and the Employees table, this is still the case.

    All you’re trying to do with the primary key is ensure that each record in a table is distinguishable from the others in that table and that each record is easily and unambiguously retrievable when you need it.

    As far as the database program is concerned, you could have two tables with identically named primary keys and they could still have the same value (although I don’t recommend you name your fields in this way). For example, say the primary key in your Customers table was called PrimeKey and your Employees table also had a primary key called PrimeKey. As far as the database program is concerned, these fields don’t have the same name; one is called:

    CustomerTable -> PrimeKey

    and the other is:

    EmployeeTable -> PrimeKey

    Just because our human brains look at the two fields and think of them as being named the same, that’s not what the computer program does. It “sees” them in terms of their fully identifying names, which includes tablename -> fieldname -> value.

    This is just like filenames on our computers. For example, you could have a folder in your Documents folder called Meetings with a bunch of sub-folders called January, February, March, April and so on. In each of those sub-folders you could have a Word document called “Meeting minutes.doc”. Even though those files are named identically, your operating system doesn’t get them confused, because the full name of the file is not “Meeting minutes.doc” but something like:

    C:DocumentsMeetingsJanuaryMeeting minutes.doc
    C:DocumentsMeetingsFebruaryMeeting minutes.doc

    and so on.

    Now, just as I don’t recommend you name all your primary key fields PrimeKey, I also don’t recommend you use this sort of naming convention for your files. For humans like us, it’s much better to use names like “Meeting minutes January 2010.doc”. But to the computer, the former naming convention makes files as uniquely identifiable as the latter.

    Cheers,

    Rose

  • rosevines

    Hi Martin,

    Yes, you can have a customer_id value of 1000 and an employee_id value of 1000. You simply need a unique primary key value for each record in a particular table. Even if you create a relationship between the Customers table and the Employees table, this is still the case.

    All you’re trying to do with the primary key is ensure that each record in a table is distinguishable from the others in that table and that each record is easily and unambiguously retrievable when you need it.

    As far as the database program is concerned, you could have two tables with identically named primary keys and they could still have the same value (although I don’t recommend you name your fields in this way). For example, say the primary key in your Customers table was called PrimeKey and your Employees table also had a primary key called PrimeKey. As far as the database program is concerned, these fields don’t have the same name; one is called:

    CustomerTable -> PrimeKey

    and the other is:

    EmployeeTable -> PrimeKey

    Just because our human brains look at the two fields and think of them as being named the same, that’s not what the computer program does. It “sees” them in terms of their fully identifying names, which includes tablename -> fieldname -> value.

    This is just like filenames on our computers. For example, you could have a folder in your Documents folder called Meetings with a bunch of sub-folders called January, February, March, April and so on. In each of those sub-folders you could have a Word document called “Meeting minutes.doc”. Even though those files are named identically, your operating system doesn’t get them confused, because the full name of the file is not “Meeting minutes.doc” but something like:

    C:\Documents\Meetings\January\Meeting minutes.doc
    C:\Documents\Meetings\February\Meeting minutes.doc

    and so on.

    Now, just as I don’t recommend you name all your primary key fields PrimeKey, I also don’t recommend you use this sort of naming convention for your files. For humans like us, it’s much better to use names like “Meeting minutes January 2010.doc”. But to the computer, the former naming convention makes files as uniquely identifiable as the latter.

    Cheers,

    Rose

  • Esmerai

    wow tht really helpd me alot on my work for class thnx guys 🙂

  • Esmerai

    wow tht really helpd me alot on my work for class thnx guys 🙂

  • vlad

    Excellent intro. Explains simple terminology which is often overlooked in the more “advance” materials.

    Thank you!

  • vlad

    Excellent intro. Explains simple terminology which is often overlooked in the more “advance” materials.

    Thank you!

  • vlad

    Excellent intro. Explains simple terminology which is often overlooked in the more “advance” materials.

    Thank you!

  • Rhonda

    Hi,

    I am new to access and am trying to create a form from a database in infopath that will let me bind a drop down box to a field in access and based on the selection in that box, update the control box below it to the correlating field in the database… how in the world do I do this???

  • Rhonda

    Hi,

    I am new to access and am trying to create a form from a database in infopath that will let me bind a drop down box to a field in access and based on the selection in that box, update the control box below it to the correlating field in the database… how in the world do I do this???

  • Alam

    Hi,

    I am new to access and am trying to become a database developer.
    How can I improve myself as a database designer ?please ,help me.

  • Alam

    Hi,

    I am new to access and am trying to become a database developer.
    How can I improve myself as a database designer ?please ,help me.

  • Dharmendra

    @Rhonda

    Hi..wil you please specify more ..actually what you want to do?
    do you want to fill another dropdown based on selection of previous one?

  • Sajida

    Dharmendra can you please help me I ma struggling with my database at the moment can you please advise how do I create a database with drop down only related to specific drop downs? is this possible?

  • Nims

    Thanks budy,
    I have an exam in next few days and you have solved my prob.
    Nims

  • Pingback: Take a step in the right direction: WebMatrix/Razor learning path : Rachel Appel()

  • Pingback: Taking steps in the right direction: WebMatrix/Razor learning path : Rachel Appel()

  • SAMBRE VICENTE

    Hallo I don’t see atributes in the database
    Sambre

  • Nozomi

    Great post! Thank you so so much! Helps me a lot!!! I am a beginner and this tutorial is very easy to understand! You rocks!!! 🙂