In an earlier series of articles, I wrote about getting started with Microsoft Access, which is one of the foremost desktop database packages available and an excellent tool for small to mid-size database projects. While it was not the first database software that I used, it was the one that took me from novice to professional status. Once I reached that level, however, I had to start learning more advanced tools in order to advance in my career and take on new projects. One important set of tools was Microsoft SQL Server which remains one of the most popular network database systems on the market.
In the last chapter, I talked about some of the current options when it comes to getting started in the programming field. I mentioned several development tools that you can get familiar with to decide what path you want to take as a software developer but getting started is only half the battle. As I said, the field is constantly changing and with so much evolving technology, it can be hard to keep up even if you’re working with it every day. When planning a career, there’s also a desire to have some idea of what things will look like five years down the road and where you should be in relation to them. That’s harder to do in the I.T. world but there are some things that you can plan on and overall strategies that you can use to develop a solid foundation of knowledge.
So far, I’ve given you some general ideas of what it takes to start and maintain a career in software development but you’re probably still wanting some direction as to the actual first steps of learning how to program. In this chapter, I’ll give you a sampling of the options available. Pay close attention to the links sprinkled throughout for additional information.
There are a lot of options out there because there are a lot of different computer languages and types of devices to program. The type of device for which you’re creating an application and the operating system it uses is sometimes called the platform. For example, if you’re programming Windows applications for the PC, then you’d be writing for the Windows platform. The Android OS that powers smartphones and tablets would be a separate platform and would require different programming tools. Some people might get confused by the term platform so it’s easiest just to say “I write Windows programs.” or “I create websites.”. You’ll find that one of the challenges of a programming career is communicating ideas to non-programmers.
You’ve probably heard people talk about how we’re living in the Information Age with an emphasis on how much information there is to process from all different directions. This is certainly true and it can be a challenge for the average person, especially if they were born even a few decades ago before there was a personal computer in almost every home and when most people were happy with a few local channels on their TV.
The flip side of living in the information cloud is that it’s never been easier for you to find information and learn virtually any subject. Universities no longer have a monopoly on education and people are no longer dependent on their local library, bookstore or media outlets for materials. Conversely, I never imagined when I was growing up in a small town in the 1970s that, one day, I could be a published author without going through the trials of manuscript submissions, editorial reviews and multiple rejections … but here we are.
Ask a lot of people about the requirements to be a computer programmer or software developer and they’ll probably start talking about computer science degrees and years of formal training. Those things don’t hurt but they’re not strictly necessary, either. I personally do not have a degree aside from the accounting diploma I earned from a local college many years ago. I am certified by Microsoft in Windows application development, a certification I earned through self-directed and cooperative study with a few co-workers. Most of what I know about programming is through self-teaching and experience. Nevertheless, with my current experience, all I have to do is post my resume and send copies to a couple of recruiters and my phone starts ringing. It’s about the demonstrated skills, not the paper.
“Do you have any questions for us?”
For some people, that’s the most challenging question in a job interview. They’ll spend time preparing a resume and anticipating the interviewer’s questions but thinking of questions to ask in return gets the least attention. Some people don’t even see the point in asking questions when they’re just hoping to get the job so they can start paying bills again.
Still, the questions that you ask tell the interviewer a lot about you as a potential employee, especially your level of interest in working for the company rather than just collecting a paycheck. When a potential employee has no questions about the company at the end of the interview, he or she can appear lazy or apathetic, which are never desirable traits in an employee.
In any professional position, it’s important to find out as much as you can about the company where you might be working for years. That company will have a place on your resume and will impact the future of your career. The right questions will also demonstrate your knowledge and perspective on your chosen field and that you take the time to do research before making a decision. This demonstrates an intelligence and analytical skill that companies like to see in their professional employees.
Tackling the Data
One of the strengths of Microsoft Access is its ability to import and link to data from different sources. With Access you can also query and report on the data in ways that a spreadsheet program like Excel doesn’t readily provide. This can come in handy as part of a database application but it’s also useful for quick analysis of new information.
I found a new example of this the other day when I was looking over the website statistics for my site. This data is stored in log files which record every incoming request to the site. Every time someone pulls up a page, like the one you’re reading now, their web browser generates a number of file requests that are stored in these logs.
This adds up to thousands of requests per day, each one being stored as a separate line in the log file. The log stores such basic information as the time and date of the request, the file name, the IP address of the computer making the request and the type of browser being used. This information comes in very handy for analyzing the site traffic. My hosting service provides analysis software that summarizes these logs but it’s not customizable, I don’t fully trust its analysis and it doesn’t always answer the questions that I have like the one that came up the other day;
“Why are my visitors getting so many 404 errors??”
For those who don’t know, 404 is the error code you get when you try to pull up a page that doesn’t exist. The analysis software was telling me I had hundreds of these results which is worrisome because they’re not very attractive to visitors and can indicate problems with the site. The statistics didn’t tell me what page requests were generating the errors, however.
I decided to look at the logs directly and see what I could find out. I figured if the analysis software could read it, I could find a way to work with it. The initial view was not encouraging.
It only looks like it’s from NASA …
After I looked at the file for a few moments, and turned off the word-wrap, I saw that it was a delimited text file. This is a text file that contains the same fields on each line and where each field is separated by a specific character. That character could be a comma, a space or another character so long as it doesn’t show up within any of the fields that are presented. In this case, the character is a blank space. Every time a blank space occurs, it means one field has ended and the next one is beginning.
Okay, that’s a little clearer.
Another type of text data file is a Fixed Width file where each field is allocated a certain number of characters on the line. The fields do not need to be separated although if the value for a field exceeds the allocated width, it might be cut off. If a field value falls short of its specified width, the remaining space may be filled by spaces or other characters. Either way, this type of file is read by knowing the list of fields and their character allotments.
Now I knew it was just a matter of importing the data somewhere and manipulating it as needed.
This Looks Like a Job for ACCESS!
I was working with Access 2002 for this project as shown by the screenshots. The Import Wizard is also present in Access 2007 and is virtually identical to the one shown here. This is not something that has changed a lot between Access versions.
Start with the Get External Data menu (Access 2002)
In Access 2007, the Import Wizard can be accessed from the External Data portion of the control ribbon.
Access provides a handy wizard that will guide you through the import process for a data file. It will import a variety of file types including other spreadsheet and database files. For this exercise, I just wanted to import the text file I’d saved the log contents to so I selected the text file in the Open File dialog that the wizard presented.
The wizard immediately recognizes the file contents as you can see below. The first screen presents a choice between the two file types, Delimited and Fixed Width.
The wizard makes its best guesses but, in the end, depends on the user to pick the right settings.
Clicking on the Advanced button brings up the Import Specification screen which provides additional settings that will help Access read the data. The below screenshot shows the settings I chose for the file. It’s a delimited file using a blank space for a field delimiter. From the above screenshots of the file, you can see the dates are shown with the four-digit year first and then the month and day and dashes in between. The months and days use leading zeroes (i.e. 2010-08-31). The timestamps use the colon to separate the hour and minute.
I could use the Text Qualifier selection to specify if quotations are marked with single or double quotation marks but I don’t need it here.
Under the field information, I could even specify the name, type and other information for each field. This would be useful as part of a regular import if the data was going to be referenced by an application or if Access had trouble with part of the data but this is a one-time import and Access seems to have figured out the fields okay so I’m not bothering with that.
(Article continues after ad)
Using the command buttons on the right, you can even save and load specification settings for reuse.
The second screen of the wizard specifies the delimiter character that’s being used although I already did that on the Advanced Settings screen. This screen also presents the option to use the first row of the data file as a set of fields names which is the case with this file.
The wizard can store the data in either a new or existing able. Using an existing table could be useful if you had a number of files to import and wanted to combine them all in a single table or if you were importing similar data from other database.
The next screen is another chance to manually define the fields as was earlier shown on the Advanced Settings screen. You can also mark fields to be skipped in the import.
Access specializes in relational data so, unlike Excel, it wants to add a primary key to any tables that are created. This key acts as a unique identifier for each row in the table and can be used to link the table to other tables in the database. On the next screen of the wizard, you are prompted to set the table key.
- If you let Access add a primary key, the program will add an AutoNumber field to your table which will increment by 1 for each row in the table.
- With the second option, you can choose an existing field within the table that provides a unique value. This will be used as the identifier.
- The third option skips the primary key altogether.
The final screen in the wizard simply asks for a table name if the data is to be imported to a new table.
Once the import is completed, the wizard notifies you of the results. As you can see from the above message, things don’t always go perfectly. There were thousands of records in the log file I imported so that’s not a surprise. The data was still imported but not quite as it exists in the file. The wizard also creates an ImportErrors table with details on what went wrong.
Out of 3783 rows imported, it seems that 153 of the rows had at least one field that was too long for the field type selected by Access. I could start over and specify a memo type for the field but I’m not concerned with the that particular field right now so I won’t bother.
For my purposes, I call that a successful import. Now let’s look at the data.
Under the Microscope
Once the import wizard has put the data into a nice neat table, all that’s left is to query it at will. Here’s the one that I used to find out where all the 404 errors were coming from.
After looking at the file, I was able to figure out that the cs-uri-stem field is the name of the file that’s being requested, whether it’s a page, a graphic or any other type of file.
The cs(Referer) field is used to hold the page from which the visitor followed a link to the requested page. This could be a page on another site or a page on this site. If the visitor typed the address of the page in manually (i.e. “http://www.drewslair.com/about.shtml”), the field is blank.
The sc-status field shows the status of the request. If the file was retrieved successfully, as in most cases, it should be 200. I’m specifying “404” as the criteria because those are the only rows I want to see.
The results paid off and told me about a few things that were going on with the site.
- A problem with a relative URL reference caused one of the page borders to look to the wrong directory for the cascading style sheet. Knowing that all the errors were coming from pages in one directory helped me to track this one down. Fortunately, it probably wasn’t noticeable to visitors.
- Many browsers are looking for the favicon.ico file. That’s the file that supplies that little icon you see for so many websites. It shows up in your Favorites or Bookmarks menu and on the title bar of your browser when you’re viewing the site. Drewslair.com doesn’t have one. It probably should but it’s not a priority.
- I’ve republished a few articles elsewhere on the web and these articles link back to graphics and other files from the site. Some of those files have changed names since the Knols were published. Updating the Knols fixed the problem.
- Then there are the external sites linking to files that have long since been moved or deleted or just getting the address wrong altogether. There will always be a need for a 404 page.
Having fixed the problems that I could, I’ll keep an eye on the stats for a few days and look forward to seeing the number of 404 errors go down. In addition to tracking down this issue, it’s turning out to be a quick way to check the popularity of specific pages and downloads with confidence in the results.
Filling in the Gaps
This is not the first time I’ve used Microsoft Access to analyze data in a way the original software couldn’t. It’s a great tool for both rapid development and quick analysis of many types of data. Many more people have Excel on their desktops which also does a great job of manipulating external data but if you have Access at your disposal, it’s worth your time to learn as much as you can about it beyond it’s native data format.