When writing programs, it’s often necessary to perform repeating operations on collections of items such as customer orders or invoices. Often, you can just iterate through the collection or count the items to determine how many times to perform the operation. When working with a hierarchy of items such as a directory structure where you have an unknown and varying number of levels under each branch, it’s a different story. For this, the typical method is to use recursive programming, often just called recursion. This is a method in which one routine is designed to analyze the items on one level of the hierarchy, look for any sublevels and then call itself to analyze each sublevel. Each time the routine calls itself, it creates another instance of itself that works independently until it’s finished and then returns to the instance that called it.
To show how a program would use recursion, let’s look at a file directory containing a music collection as shown in Figure 1. There are two copies of the collection stored on different drives and they need to be synchronized. The hierarchy of the primary copy starts with the F:\My Music directory for which the recursive subroutine is called first. The recursive subroutine looks at this directory and finds a few files which it processes by looking for corresponding files in the other music collection. Mostly, though, it finds subfolders because Windows Media Player (which I use) automatically groups songs by artist and then by album.
At some point while iterating through the folders under F:\My Music, the subroutine reaches the subfolder for Electric Light Orchestra. The routine then calls itself to examine F:\My Music\Electric Light Orchestra. That instance doesn’t find any files but it does find the subfolders for the two ELO albums I have so that instance of the subroutine calls itself and creates another instance to examine the first subfolder in the list.
At this point, there are three instances of the subroutine running; one for each of the following folders:
- F:\My Music (waiting …)
- F:\My Music\Electric Light Orchestra (waiting …)
- F:\My Music\Electric Light Orchestra\A New World Record (processing)
The third instance doesn’t find any additional subfolders because there are none but it does find a file for each song on the album A New World Record so it processes those, ends and returns to instance #2 which moves on to the next subfolder it finds (..\Electric Light Orchestra\ELO’s Greatest Hits) and calls another instance of itself to analyze that folder. When instance #2 is finished, it will return to the first instance (F:\My Music) which will then look for another subfolder and the entire process repeats until there are no more folders to work with.
I originally coded this in VB.NET and here’s a very abbreviated version of the code itself so you can see how this works:
Private Sub CompareDirectories(ByVal subDirectory As String) ' Count the number of files and subdirectories under this folder. Dim noFiles As Integer = FileIO.FileSystem.GetFiles(subDirectory).Count Dim noFolders As Integer = FileIO.FileSystem.GetDirectories(subDirectory).Count 'If there are files, process each one. If noFiles > 0 Then For Each ... (Code to examine each file, determine if there's a corresponding file in the other collection and take the appropriate action either way. Next End If 'If there are subfolders, process each one. If noFolders > 0 Then For Each folder... CompareDirectories(ByVal FolderName As String) End If End Sub
As you can see, it’s a simple matter for the routine to call itself and after that instance ends, it returns to the instance that called it, continuing with the For Each loop until all folders and files have been processed.