It's no secret that sites with dynamic content tend to attract more visitors.
The dynamic nature of a site portrays a "fresher" look making it more
attractive, more fun, and more useful. Some sites achieve this effect through
displaying content such as weather, related news, market indices, or an
assortment of other items. So where can you go to get the data and information
that can make your site more "dynamic", especially if you don't already have the
data available to you?
One alternative is to pay an outside agency for the data. This method
normally allows you to have access to a text-based file or to a preformatted
image. However, with all the free information available on the web, why not get
it yourself? Having said that, you obviously must be careful not to grab
information from sites containing copyrighted material, or material that is not
authorized for distribution or use on other sites. When in doubt, get the site's
permission before capturing any data from it.
Assuming that you have "legal" information to grab, by using Visual Basic and
Excel 97 or higher, you can easily capture the data from a web page and then use
it as you'd like on your own site. You can accomplish this task without Excel,
but you'll see that by using Excel you can save a lot of time parsing strings to
get what you want. Excel's power lies in the fact that it can grab HTML tables
and automatically break them up into rows and columns for you through its "Web
Query" functionality.
Why use Visual Basic?
If Excel can do most of the parsing for us, why do we need Visual Basic?
Well, in all honesty you could use VBA and develop the entire application within
Excel. However, I've had problems with Excel 97 raising memory errors when used
for this purpose on a 24 hour by 7 day a week schedule. I haven't run this
application for a long period of time with Excel 2000 so I can't comment on
whether or not this is still a problem.
Due to Excel's rich object model and the ease Visual Basic affords us in
manipulating automation objects, it is a logical choice. To avoid the memory
problems I experienced whilst running Excel for extended periods, I decided to
use Visual Basic to automatically start and stop Excel after a given time period
so that it never has the chance to raise an out of memory error.
The Application Structure
The premise behind this application is quite simple: grab a particular item,
or items, from a web page and use it within your own (giving the appropriate
source credit if necessary.) Although there are a lot of things you could
capture, we're going to concentrate on gathering weather information for display
on an Intranet. This will allow us to have full control over the presentation of
the data using XML/XSL. If your server doesn't support XML yet, you can easily
customize this to write to a database instead. If you want to use XML on your
server but aren't sure where to get Microsoft's stand-alone XML parser, then see
the informational links for more information.
The application is controlled by a Visual Basic executable that utilizes a
class module to start and stop Excel and determine how often to pull data from
the web. Once Excel is started, it gathers information and then sends the data
to an ASP page. This page updates our XML file. We'll cover the details of how
to move data from Excel to an ASP page in Part II of this article.
The graphic, shown below, outlines the general structure of the application.
The user interface simply provides a way of starting and stopping the "datapull"
process. It also allows us to enter various parameters that determine how often
pulls can occur and when pulls should start and stop. These
parameters are received by the clsExcelObj and used to manipulate
Excel as appropriate. Once Excel is started, it does all the work of parsing a
web page for us. The following sections discuss each element of the application
in more detail.
User Interface
Let's start with the main VB form named frmGetData. The form's
purpose is quite simple…..capture information about the Excel file we desire to
use and capture details about how often Excel should go to the web for updates.
Figure 1, below, shows the form.
Figure 1
The form specifically allows us to start and stop Excel, it
lets us specify how often to pull data, establishes the Excel file to use, and
monitors events to let us know when data was pulled and if the pulls were
successful or not.
The code with this form is fairly straightforward. One important part is the
WithEvents section. The code in Figure 2 is used to notify
us: when pulls are occurring and the status of each pull. In
the general declarations area of the form, you will find the following line:
Public WithEvents excelObject As WebDataPull.clsExcelObj
If you have not used the WithEvents keyword before, it simply
provides a way to get the status of certain operations performed within an
object and handle the events as appropriate. Without using
WithEvents, we would have no way to easily pass back status
information to the user interface. There are other more painful methods of doing
this, such as writing to a log file, but this method is the easiest way to pass
data from the object (detailed below) back to the user interface.
Private Sub excelObject_Status(ByVal statusText As String)
txtStatus.Text = txtStatus.Text & statusText & vbCrLf
End Sub
Figure 2