Pulling Stock Quotes into Microsoft Excel
Pages: 1, 2
A Web query retrieves text or data from tables or other formatted areas on Web pages. Finding decent help information about Web queries from the Excel 2000 help files is as frustrating as you might expect, but fortunately we don't need any information; we can just mimic a Web query that comes with Excel.
In fact, a handful of Web queries are installed along with Excel. For Excel 2000 (and Excel XP), one of these Web queries retrieves stock information from the MSN Web site. The actual text file is named
Microsoft Investor Stock Quotes.iqy and is located in the Queries folder under the main Office folder. Its contents are:
WEB 1 http://investor.msn.com/external/excel/
fund or other MS Investor symbols separated by commas."] Selection=EntirePage Formatting=All PreFormattedTextToColumns=True ConsecutiveDelimitersAsOne=True SingleBlockTextImport=False
While I don't understand the meaning of each line, it really doesn't matter. The important part is the URL line. If we want to retrieve stock information for a specific list of stocks, we can programmatically create a text file, with the extension
.iqy, that contains each of these lines, but where the URL line is modified by replacing the prompt with a string of stock symbols, as in:
The last symbol ($INDU) specifies the Dow Jones Industrial Average. Other index symbols are contained in the LookupInfo worksheet in the accompanying Excel application.
|Figure 2: An example WebQuery|
The WebBrowser Control
For placing the Web into our Office documents, Microsoft provides the meager and very-poorly-documented WebBrowser control, which can be placed in an Office document to gain access to the Web.
Of course, once the Web browser is on, say, an Excel spreadsheet, we probably want to extract the data from a particular Web page and place it into the cells of the spreadsheet. For this, there is Dynamic HTML, which provides an object model that enables us to extract the data on the Web page.
|Figure 3: A portion of the Historical worksheet, with the WebBrowser control on the right.|
To add the WebBrowser control to an Excel spreadsheet, display the Control Toolbox toolbar and choose the More Controls button (the last button). Then search for Microsoft WebBrowser control. If you don't find it, then choose the last item, which is Register Custom Control and search for the file
SHDOCVW.DLL in your Windows System32 folder.
Once the control is placed on your spreadsheet, you may need to set its Locked property to False (the default seems to be True) so you can enter logon information for Yahoo (or whatever Web page you want to access).
Next, you must remember to set a reference to the Microsoft HTML Object Model (under the References item on the Tools menu in the VBA editor), in order to programmatically control the Web browser's contents.
Let me also mention that the WebBrowser control will automatically return to the previous URL every time the workbook is opened. If you want to prevent this (as I do), then you can add the following line to the Workbook's Open event:
Private Sub Workbook_Open() Worksheets("Historical").WebBrowser1.Navigate "about:blank" End Sub
I should also mention that the WebBrowser control seems to work fairly well in Excel 2000, but I have had a lot of trouble with it under Excel 97, with many General Protection Faults.
The WebBrowser control and the HTML object model are very poorly documented, even by Microsoft's standards. So here is what you need to know (you can get a list of the objects in the model by searching for the MSDN index entry DHTML Objects):
- To refer to a document that has been browsed to by the WebBrowser control, use a variable declared as follows:
Private mDocument As MSHTML.HTMLDocument
- Each item on a Web page of relevance to us is an HTML Element. The
Allproperty of the
HTMLDocumentobject returns the collection of all HTML elements. (The
Allproperty has about 70 entries in the MSDN library index, but none seem to relate to the
HTMLDocumentobject. Nevertheless, the property does work.)
- We can declare an element variable using the generic declaration
Dim elem As Object
or the specific declaration
Dim elem As MSHTML.HTMLBaseElement
As usual, there are two potential disadvantages to using the generic declaration. One is performance, but this is not an issue for this type of application. The other is loss of Intellisense. Since the documentation is so poor, Intellisense is a big help, so I opt for the specific type declarations whenever possible. (This is how I found the
All property of the
tagNameproperty that helps identify the type of element. Since HTML tables have
tagNameTABLE, we can identify the tables in the document. A Table variable is declared as follows:
Dim tbl As MSHTML.HTMLTable
Tableobject, if you prefer) has a
Rowsproperty that returns the collection of
HTMLTableRowobjects, as in:
Set tr = tbl.Rows(0)
Rows collection has a
Length property that returns the number of rows in the table. (As you probably know, almost every Microsoft collection object uses a
Count property for this purpose, but why be consistent?)
HTMLTableRowobject has a
Cellsproperty that retrieves the collection of all cells in the row. Each cell can be accessed by index, as in:
Cells collection also has a
Length property that returns the number of cells in the row.
The WebBrowser control itself has a
DocumentComplete event that fires when the complete Web document has arrived, so we can set the
mDocument variable, in that event. This is also we here can dissect the Web page to get the data we want. The code for this event is shown below, but before we get to that, let us take a look at Yahoo's Web page.
c specify the month, day, and year of the starting date, and the parameters
f specify the month, day, and year of the ending date. The parameter
g specifies the frequency of stock quotes:
w (weekly), or
m (monthly). The parameter
s specifies the symbol. I don't know what the parameters
z specify, but it seems that we need to set
y to 0, and we can ignore
Of course, we need to change the URL and do a separate browse for each stock. Figure 3 shows only two stocks. (I use the blank cell in column A to stop the browsing.) In each case, we remove the data from the Web page and place it in the worksheet.
The page returned from Yahoo contains a second table below the input form, as shown in Figure 3. This table holds the historical stock data. The
MHTML object model can be used to grab that data, as shown in the
DocumentComplete event. I have used the fact that the table of historical data has the word "Date" in its first cell to identify it amongst the seven tables that are actually in this Web page (most of which don't look like tables).
Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant) ' Get the data from the WebBrowser control ' Columns of table Const conHistColDate = 0 Const conHistColHigh = 2 Const conHistColLow = 3 Const conHistColClose = 4 Const conHistColVol = 5 ' DHTML objects Dim elem As MSHTML.HTMLBaseElement Dim tbl As MSHTML.HTMLTable Dim tr As MSHTML.HTMLTableRow Dim iSourceRow As Integer Dim iTargetRow As Integer Dim iFreeCol As Integer Dim dtDate As Date Dim sDate As String ' Get out if not getting data If Not mbGetData Then Exit Sub ' Get reference to completed document Set mDocument = WebBrowser1.Document If mDocument Is Nothing Then Exit Sub ' Look for table of historical data ' This is the table that has "Date" in first cell Set tbl = Nothing For Each elem In mDocument.All ' Table element If elem.tagName = "TABLE" Then ' Get the table Set tbl = elem ' Does table have at least one row? If tbl.Rows.Length >= 1 Then Set tr = tbl.Rows(iSourceRow) If tr.Cells.Length >= 1 Then ' Does the first cell contain word "Date"? If tr.Cells(0).innerText = "Date" Then Exit For End If End If End If End If Next If tbl Is Nothing Then MsgBox "Could not find table of historical data", vbCritical Exit Sub End If ' Find the first available column For iFreeCol = 1 To 255 If Me.Cells(3, iFreeCol) = "" Then Exit For Next ' Place symbol in first col for this stock Me.Cells(2, iFreeCol).Value = msSymbol ' Column headings below symbol Me.Cells(3, iFreeCol).Value = "Date" Me.Columns(iFreeCol).NumberFormat = "@" Me.Cells(3, iFreeCol + 1).Value = "Close" ' Loop through rows of HTML table iTargetRow = 3 For iSourceRow = 1 To tbl.Rows.Length - 1 ' Get the next row of the HTML table Set tr = tbl.Rows(iSourceRow) ' Skip if not full complement of columns If tr.Cells.Length >= 7 Then iTargetRow = iTargetRow + 1 Me.Cells(iTargetRow, iFreeCol).Value = tr.Cells(conHistColDate).innerText Me.Cells(iTargetRow, iFreeCol + 1).Value = tr.Cells(conHistColClose).innerText End If Next End Sub
I have covered the main points of my Excel application, and the accompanying application will allow you to fill in the details. If you find any interesting extensions of this application, please let me know. You can leave a comment on my Web site at http://www.romanpress.com.
Steven Roman is a professor emeritus of mathematics at the California State University, Fullerton.
Return to the .NET DevCenter.