Category Archives: Articles

Tweaking the System – Finding the most efficient application design

Like any system, one of the important traits of a good computer program is the efficiency with which it runs. Many people might think of this in terms of how fast it loads or how dependable it is. It can also be measured by how well organized the interface is which determines how successful the program is in enabling the user to do their job. Programmers like myself also look at how much memory the program uses which can determine whether the program will run reliably or whether the user will be plagued with crashes and blue screens.

Continue reading

Using Typed Datasets in Visual Studio

If you’ve worked with Visual Studio or .NET for long, you’re probably familiar with the System.Data.DataSet class which enables you to create a collection of data in memory, often retrieved from an SQL database. Within a dataset object, you can have a series of related DataTables that categorize the data. Moving further down, each table can have multiple
DataRow objects. In effect, you can have an entire relational database in memory to which your application can refer and save changes, although in practice, I usually limit the data in memory to one datatable or even a single datarow if possible, so as to avoid using too much memory.

using System.Data;
using System.Data.SqlClient;

public static DataSet GetQuotes()
{
    DataSet result = new DataSet();
    DataTable dtQuotes = new DataTable("Quotes");

    SqlDataAdapter daQuotes = new SqlDataAdapter(
       "SELECT * FROM Quotes",
        Properties.Settings.Default.AppDataConnectionString);
    daQuotes.Fill(dtQuotes);

    result.Tables.Add(dtQuotes);

    return result;
}

The above C# example code shows a function that returns a dataset containing a table of quotations like the ones you see at the top of the page.  It declares the new dataset and empty table and then uses SQLDataAdapters to retrieve the data from the table into the datatable. The datatable is then added to the dataset which is returned as the output of the function.

The code that calls this function could then reference the data within the dataset by referring to a specific table, rows and items. The following sample calls the function and refers to the Quotes table within the returned dataset, retrieving the QuoteText value from the first row.  (Remember that row and column numbers start with 0.)

DataSet dsQuotes = GetQuotes();
dsCustomer.Tables["Quotes"].Rows[0]["QuoteText"]

Once you understand the ADO.NET dataset and other classes involved, this is a quick way to get data for your application from a database or an XML file and it’s something that every programmer should be able to do. However, it’s also a relatively manual process. As you can see in the last example above, referencing the data that’s returned involves referring to rows by number and columns by number or name. You can use a loop to iterate through the rows or you can bind the table to a data grid but when referring to the columns, you still have to remember the ordinal number or the name that comes from the database. Unless you write nice, organized, classes that group all the database calls in one place, you could also have SQL statements, or preferably stored procedure references, scattered throughout your project.

Typed Datasets

Another type of dataset object in .NET solves some of these problems. The Typed Dataset object is a graphical interface that enables you to simply drag database objects such as tables and stored procedures onto the dataset work area. The interface will then discover and document those objects and create custom data classes for them based on the ADO.NET classes mentioned above. For example, if you drag a Customers table onto the work area, it will create a CustomersDataTable class that can hold a collection of CustomerRow objects based on the table’s design. The CustomerRow class will include a property for each of the fields in the Customers table and these properties can be referred to in code instead of the column name or number. A TableAdapter will also be created to go with the CustomersDataTable and this adapter will store the connection information and all the necessary SELECT, UPDATE, INSERT and DELETE queries can be automatically generated by Visual Studio based on the table structure.

To use a typed dataset, you have to add one as a new component in the project. You can do this by right-clicking on the project in the Solution Explorer or by selecting Add Component from the Project menu in Visual Studio. Select DataSet from the Data components, enter a name at the bottom and click Add.

021-dataset

Typed Datasets are added as new components in a Visual Studio project. (Click for larger view.)

 

Visual Studio will then add the typed dataset to your project and open it in design mode. As you can see by the help text in the screenshot below, you can actually create dataset objects independently from a backend database using the Dataset toolbox but when starting out, your main purpose will probably be to work with database objects. The Server Explorer is the Visual Studio window where you can work with SQL Server databases. If it’s not visible, you can select it from the View menu.

022-newdataset

Data sources such as tables and stored procedures are dragged onto the dataset workspace in order to include them in the dataset.

023-datasetquotes

When the table is added to the dataset, Visual Studio creates the necessary classes and objects to represent the table and its fields, along with table adapters to manage the connection with the database.

The screenshots above show the dataset designer before and after the addition of the Quotes table. As you can see, the dataset designer has discovered three fields within the table and generated a Quotes datatable object for the table along with a QuotesTableAdapter object attached to it. It’s even correctly identified the QuoteID field as the primary key. In the lower
right-hand corner of the screenshot, you can see the properties for the TableAdapter which include the connection details and all queries. The SELECT command is expanded to show part of the SELECT statement that has been automatically generated based on the DataSet’s analysis of the table.

In the Solution Explorer, the typed dataset is contained in an XSD file and you’ll notice a few other files underneath it. These are automatically generated as you make changes within the dataset interface. The main file is the Designer.cs file which holds all of the code that defines and supports the objects created in the interface.

024-codebehind

Automatically generated code for a typed dataset.

As it says in the file header, this file is not meant to be manually changed, although you should take a few minutes to look over it to see the sheer volume of code that Visual Studio generates in the background for just one or two objects within the interface.

Once the typed dataset is in place, you can refer to the custom objects through your code:

using DataDemo.AppDataTableAdapters;

public static AppData.QuotesRow GetSampleData()
{
    QuotesTableAdapter tbaQuotes = new QuotesTableAdapter();
    AppData.QuotesDataTable dtQuotes = tbaQuotes.GetData();

    return (AppData.QuotesRow)dtQuotes.Rows[0];            
}

The above function actually returns an object from one of the custom classes in the typed dataset. Notice the extra Using statement at the top that references the collection of TableAdapters created by the dataset. The function declares a QuotesTableAdapter object which already includes a GetData function that uses the automatically generated SQL SELECT statement to select rows from the table. That function returns a QuotesDataTable object and the first row of that table is returned by this function as the result. The row collection in the QuotesDataTable object actually consists of generic datarow objects and, by default, C# is stricter about conversions than VB.NET so it’s necessary to explicitly cast the row as a QuotesRow. Whatever code is calling this procedure, however, can do something like this:

AppData.QuotesRow rwQuote = GetSampleData();
String QuoteText = rwQuote.QuoteText;
String Author = rwQuote.Author;

The text of the quote and the author which are stored in fields within the table are now properties of the QuotesRow object and can be called by name in the code.

In addition to the advantages in selecting and referencing data, the custom TableAdapter objects have auto-generated methods for updating and deleting rows that can be used as easily as the GetData method was used above.  The TableAdapter manages the connections to the database and the typed dataset contains all of the SQL Statements or stored procedure references needed to carry out the operations.

For more information:

Using ADO.NET for Beginners – CodeProject.com

Using Strongly-Typed Data Access in Visual Studio 2005 and ASP.NET 2.0
4GuysFromRolla.com

Complete Comparison for VB.NET and C# – CodeProject.com

Looking for Square Pegs: Finding a Solution and Turning it Into Code

(Previously published on AndrewComeau.com.)

Introduction

Programming and software development is a type of engineering. More than just sitting down and typing code, it’s about designing solutions to everyday problems using technical knowledge gained over years of training and experience. Larger projects can include a series of challenges to solve, some of which can’t be foreseen until they’re staring you right in the face and laughing at your project deadlines.

Being a software developer is more about being able to solve problems than it is about what languages or tools you know just like being a carpenter is about more than knowing how to use a hammer or a saw.  Programming technologies come and go and a skilled programmer can pick up a new one in a short time.  Being able to evaluate the requirements of a project, break them down into a series of tasks and come up with a robust solution is a much more valuable skill.  Without that, mastery of a dozen languages would be next to useless.

For coding challenges, this series of tasks is often called an algorithm.  Algorithms are logical methods used to accomplish a specific objective.  They can be written out in plain English (or any other human language) as a list of steps and decisions that must be followed in order to achieve the desired result.

A recent experience with this was a small problem that I had to solve as part of a project I was working on and the problem was this:

How do you program a comparison of three or more text values and make sure they’re all the same? 

For two values, it’s easy; you use an IF … THEN statement and an equality operator (= or ==, depending on the language) or, in the .NET languages, you could use the String.Compare function which compares one string to another.  Three or more values starts getting more complex, especially if the number of items to be compared changes to four or more. It’s also best if the code returns more than a TRUE / FALSE value.  If the values don’t match, there needs to be some indicator of what went wrong.

I could just throw a bunch of code on the page showing how I solved this problem and the average programmer could analyze the algorithm from it pretty quickly and then even more quickly tell me how I could have done it better. (Programmers like to do that to each other; it’s almost a compulsion.) As this article is meant for people who are relatively new to programming, I’d rather take you through the process and see how I came to the solution.

How Big of a Hammer Do We Need?

One of the first things to determine is the scope of the problem and the size of the needed solution. Often this means asking questions like:

  • How many users will be working with the program or how many machines it will be running on.
  • How many different systems will be affected?
  • How much data will be moved and stored?
  • Is this a repetitive process that needs to be especially efficient because of how often it will run?

This can make the difference between a small program that can be worked up in a day or two and an enterprise solution designed by a team of programmers that takes a couple of months to plan, build and test.

On a lower level like this one, it’s just a question of how much code is needed and how it should be organized within the program.  In this case, a single function should do it. That function will, of course, be part of a larger program in the end but for now we’re focusing on this one task.

Since it is a self-contained function, my next thought is about code re-use.  If I’m going to take the time to write a specialized function, I’m going to design it so that it can be re-used or adapted elsewhere when needed, if possible by just copying it from one program into another and calling it.  In a sense, functions should behave as much as possible like interchangeable parts in any system. If I was talking about the parts of a stereo system, the first thing I’d consider after the function of the component itself is it’s connections. What kind of inputs does it have for external devices and how will it output its signal to the speakers or other systems?  Again, the same questions are asked when writing code.

With functions, the inputs take the form of the values that can be passed to it as arguments or parameters.  The function will use these values during its calculations. In this case, the function will need to accept a list of text values to compare to each other.  That list could be passed in a number of ways; as an array of strings or a collection object specific to the .NET framework.  Sometimes, there could be reasons to use a type of input that’s specific to a given programming technology but right now, I want the code to be adaptable. I’m going to use the simple choice of a delimited string; text that includes more than one value separated by a specific character.  Comma Separated Values (CSV) in which each value is separated by a comma is the most common. Spaces and semi-colons could also be used. The input might look something like this:

1234,1234,1234,1235,1234

A one-dimensional string array would be my next choice. With an array, the values are pre-formatted so that the function can examine all the values in sequence.  Since I’m designing with C#, however, there’s a simple function that I can use to create such an array from the CSV shown above so I’ll let my function do that step as well and make the input a little simpler.

The output of the function is a bigger question. The requirement states that PASS / FAIL is not enough; if not all the values match, there has to be some indication of why. I also like outputs that can be used in more ways than one so I decided to return a list of the distinct values within the delimited string.  The code that calls this function can then use that list in whatever way it wants or simply count the number of list items. If there’s more than one, it means there was no match.

The next question is how will the function format that list? The function’s output is more easily changed when and if it’s re-used so I’m going to choose an output based on what works for the program that the function is part of right now.  At first, I was going to return it as a string array that could be easily enumerated in code.  The problem is that string arrays need to be declared with a specific number of items in C# and are not easily resized which is a problem since I don’t know how many values I’ll find. After searching some more, I settled on the Generic List class where the individual values are easily referenced and counted by the calling code.

Devising the Plan

Once we know what the function is accepting and returning, we need to figure out how it will arrive at the result. This is the step that gets easier with experience but there can still be some trial and error involved.

One of the often-repeated truths of programming is that there is more than one way to accomplish most tasks, often several ways.  Programming languages are getting more powerful every year and often reduce the amount of custom code that it takes to accomplish any given task.  When it comes down to it, though, a programmer is going to use the methods that are the easiest to remember and understand given the time available to think through a problem and maybe his or her caffeination level.  There will always be room for improvements but the first priority is often getting the job done.

With this function, I know that I have a delimited list and I know that I can split it into an array to make it easier to work with the values so that’s my first step in the algorithm.  I need to go from “1234,1234,1234,1235,1234″ to:

[0] = 1234
[1] = 1234
[2] = 1234
[3] = 1235
[4] = 1234

Remember that arrays in C# are zero-based so the numbering of the elements starts with 0.

1. Split the input string into an array.

I also know that I need to define the Generic.List that the function will build and return.

2. Declare a return list.

So now I have an array of values to check and an empty return list. Now the question is how I’m going to check the array for duplicates. Sometimes the best way is to start with how a human would do it and be glad a computer can use the same method hundreds of times faster.

If you were manually checking a list to verify everything matched, your method would depend on the length of a list and what you were checking. A list of five items or less could be verified at one sweeping glance. If you had more than five items, you might look for one item that you knew was correct and compare everything to that.  This function checks a list of indefinite length and doesn’t have a reference value to work with so the next method is to compile the list of distinct values … one at a time.

3. Add the first value in the input array to the return list.

After all, the first value in the input array is unique so far so that makes a good start.  Now I need to start moving through the input array and looking for discrepancies.  To do that, I use the return list as the reference value(s) to check against.

4.  Check the next item in the input array. Does it match a value in the return list?

a. Yes?  Do nothing.

b. No?  We have a new distinct value.  Add it to the return list.

5.    Repeat #4 until we reach the end of the input array.

After we’ve checked the entire input array, there’s nothing left to do but …

6.    Return the List.

Writing the Code

Method 1: The Scenic Route

So, after all that consideration, we can start writing the function:

private List<String> DistinctList(string ValueList)
{

}

The C# function declaration above defines both the inputs and outputs of the fuction. The List<String> return type means that the function will return a Generic List object that can hold only text strings.  The ValueList argument at the end is the delimited string that will be input to the function.  This first line is also referred to as the function’s signature. A call to this function would look like this:

List<String> resultList = DistinctList("1234,1234,1234,1235,1234")

Now we need to take care of the first three steps in the algorithm.

string[] inputList = ValueList.Split(new char[] {','});

List<String> returnList = new List<String>();

returnList.Add(inputList[0]);

I won’t go into detail about every keyword used in this code – you can find more information in the help files and elsewhere online – but the first line splits the ValueList argument that was passed into the code into a string array called ‘inputList’ using the commas as a delimiter.  The second line creates an empty List object called ‘returnList’, setting the required type as a
String.  The third line copies the first value from inputList to returnList.

arrayIndex = 0

foreach (string inputValue in inputList)
{
  match = false;
  if (arrayIndex > 0)
  {
    foreach (String compareValue in returnList)
    {
      if (!match && (compareValue == inputValue))
        {
          match = true;
        }
    }
    if (!match)
    {
      returnList.Add(inputValue);
    }
 }
  arrayIndex++
}

The above code is the main part of the function.  It uses two nested foreach loops, the first iterating through the input array and matching each value against each of the return list values with the second ForEach loop. (The IF statement on the third line skips over the first item in the input array since it’s already been handled.)  A boolean variable called ‘match’ is used to record a match between the input array and return list. If there is no match, the value is added to the end of the return list.

Despite all the comparisons, this code works extremely fast, instantaneously even when I used it with a space instead of a comma as a delimiter and tested it with a few long paragraphs of text.

Once the code is finished,  all that’s left is to return the returnList object as the output of the function.

Method 2: Getting on the Freeway

Now, the code above works very well and it’s perfectly acceptable for the job.  It’s reliable, it’s fast and it’s easily understood by any programmer with even a basic level of experience.  That last point is important because as a professional programmer, you’ll spend a lot of time analyzing other people’s code whether to support it or to learn from it. There’s also a good chance that someone else will one day be reading your code.  The clearer it is, the better for everyone involved.

Still, as I was writing this article, I kept thinking that with all the power behind C#, there had to be a more advanced way to do this than having the program mimic a human method of manual comparison.  Also, as I said earlier, I knew that if I didn’t find that better method, it was only a matter of time before someone else wrote to me telling me about it and I’d be kicking myself for not finding it.  So I started looking through the documentation again.

The second version of the method starts out the same; a delimited string is passed to the function which will be returning a List<String> object. The Split function is once again used to split the string into an array.

string[] inputList = ValueList.Split(new char[] {','});

The next lines change everthing …

List<string> returnList = inputList.Distinct().ToList();
return returnList;

Done.

That’s right.  With some extra research, we’ve gone from over 15 lines of code (including the variable declarations I didn’t show) down to three, including the return line.  The Distinct() function performs a query on the array elements much like the following SQL query:

SELECT DISTINCT ItemValue
 FROM Collection

The ToList() function then converts the resulting collection back to a List object that can be returned from the function.

Conclusion

So which method is better?  Well, that depends …

As I said early on, being a programmer goes beyond knowing a specific language. If you’re working in a .NET language like C# or VB.NET and have access to functions like these, that’s great.  On the other hand, what if you’re working in one of the earlier flavors of Visual Basic like VBA or VB 6.0?  (Yes, there are companies still using both.)  There are also other languages that are just as current but are specialized and have more limited vocabularies. You might also be on a deadline without the time to find that one function that will do everything for you.  In those cases, it’s necessary to know how to think through an algorithm that will get the job done with the basic functions available from a foundation-level knowledge of whatever language you’re working with.  The fancy stuff can come later. Even when you do have access to powerful functions like those above, you will eventually find that your programs take on greater levels of complexity and those functions become mere pieces in larger puzzles that you have to solve.

Besides, I had more fun taking the scenic route.