Using Data Compression in .NET 2.0
To see how compression will improve the application we have built, let's modify the project so that it supports compression. On the web service's end, add the Compress() function in Service.vb as follows:
    Public Function Compress(ByVal data() As Byte) As Byte()
        Try
            '---the ms is used for storing the compressed data---
            Dim ms As New MemoryStream()
            Dim zipStream As Stream = Nothing
            zipStream = New GZipStream(ms, _
                CompressionMode.Compress, True)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '    CompressionMode.Compress, True)

            '---compressing using the info stored in data---
            zipStream.Write(data, 0, data.Length)
            zipStream.Close()
            ms.Position = 0

            '---used to store the compressed data (byte array)---
            Dim compressed_data(ms.Length - 1) As Byte

            '---read the content of the memory stream into
            '   the byte array---
            ms.Read(compressed_data, 0, ms.Length)
            Return compressed_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
This function compresses the data in a byte array using the GZipStream class, which writes the compressed output to a MemoryStream object. The contents of that stream are then returned as a byte array.
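The same idea can be written more compactly. The following is a minimal sketch, not the article's implementation: the module name CompressionSketch and the function name CompressToArray are hypothetical, and unlike Compress() it lets exceptions propagate instead of returning Nothing.

```vbnet
Imports System.IO
Imports System.IO.Compression

Public Module CompressionSketch
    '---hypothetical helper, not part of the article's Service.vb---
    Public Function CompressToArray(ByVal data() As Byte) As Byte()
        Using ms As New MemoryStream()
            '---the third argument (True) leaves ms open after
            '   the GZipStream is closed---
            Using zipStream As New GZipStream(ms, CompressionMode.Compress, True)
                zipStream.Write(data, 0, data.Length)
            End Using
            '---the GZipStream is closed at this point, so all
            '   compressed bytes have been written to ms---
            '---ToArray copies the whole stream content, so there is
            '   no need to reset Position or size an array by hand---
            Return ms.ToArray()
        End Using
    End Function
End Module
```

The Using blocks close both streams even if Write throws, and MemoryStream.ToArray() replaces the manual Position reset and Read call.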
To use this Compress() function, modify the getRecords() web method as follows:
    <WebMethod()> _
    Public Function getRecords() As Byte()
        Dim connStr As String = _
            "Data Source=.\SQLEXPRESS;Initial Catalog=Northwind;" & _
            "Integrated Security=True"
        Dim sql As String = "SELECT * FROM Employees"
        Dim conn As SqlConnection = New SqlConnection(connStr)
        Dim comm As SqlCommand = New SqlCommand(sql, conn)
        Dim dataadapter As SqlDataAdapter = New SqlDataAdapter(comm)
        Dim ds As DataSet = New DataSet()

        '---open the connection and fill the dataset---
        conn.Open()
        dataadapter.Fill(ds, "Employees_table")
        conn.Close()

        '---convert the dataset to XML---
        Dim datadoc As System.Xml.XmlDataDocument = _
            New System.Xml.XmlDataDocument(ds)
        Dim dsXML As String = datadoc.InnerXml

        '---perform compression---
        Dim compressedDS() As Byte
        compressedDS = Compress(UTF8.GetBytes(dsXML))
        Return compressedDS
        '-------------------------
    End Function
On the client's side, add the Decompress() function to the code behind of Form1:
    Public Function Decompress(ByVal data() As Byte) As Byte()
        Try
            '---copy the data (compressed) into ms---
            Dim ms As New MemoryStream(data)
            Dim zipStream As Stream = Nothing

            '---decompressing using data stored in ms---
            zipStream = New GZipStream(ms, CompressionMode.Decompress)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '    CompressionMode.Decompress, True)

            '---used to store the decompressed data---
            Dim dc_data() As Byte

            '---the decompressed data is stored in zipStream;
            '   extract them out into a byte array---
            dc_data = ExtractBytesFromStream(zipStream, data.Length)
            Return dc_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
The compressed data is copied into a memory stream object and then decompressed using the GZipStream class. The decompressed data is extracted into a byte array using the ExtractBytesFromStream() method, which is defined next:
    Public Function ExtractBytesFromStream( _
        ByVal stream As Stream, _
        ByVal dataBlock As Integer) _
        As Byte()

        '---extract the bytes from a stream object---
        Dim data() As Byte
        Dim totalBytesRead As Integer = 0
        Try
            While True
                '---progressively increase the size
                '   of the data byte array---
                ReDim Preserve data(totalBytesRead + dataBlock)
                Dim bytesRead As Integer = _
                    stream.Read(data, totalBytesRead, dataBlock)
                If bytesRead = 0 Then
                    Exit While
                End If
                totalBytesRead += bytesRead
            End While

            '---make sure the byte array contains exactly the number
            '   of bytes extracted---
            ReDim Preserve data(totalBytesRead - 1)
            Return data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
Because you do not know the actual size of the decompressed data, you have to progressively increase the size of the data array used to store it. The dataBlock parameter specifies the number of bytes to read at a time. A good rule of thumb is to use the size of the compressed data as the block size, for example:
    '---data is the array containing the compressed data
    dc_data = ExtractBytesFromStream(zipStream, data.Length)
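An alternative to resizing the array on every pass is to read through a fixed-size buffer and let a MemoryStream do the growing. This is only a sketch of that approach, not the article's method: the module name StreamSketch, the function name ReadAllBytesFromStream, and the 4 KB buffer size are all assumptions made for illustration.

```vbnet
Imports System.IO

Public Module StreamSketch
    '---hypothetical alternative to ExtractBytesFromStream---
    Public Function ReadAllBytesFromStream(ByVal source As Stream) As Byte()
        '---4 KB chunk size is an arbitrary choice---
        Dim buffer(4095) As Byte
        Using result As New MemoryStream()
            Dim bytesRead As Integer = source.Read(buffer, 0, buffer.Length)
            '---Read returns 0 once the end of the stream is reached---
            While bytesRead > 0
                '---append only the bytes actually read---
                result.Write(buffer, 0, bytesRead)
                bytesRead = source.Read(buffer, 0, buffer.Length)
            End While
            '---ToArray returns exactly the bytes written, so no
            '   final trimming step is needed---
            Return result.ToArray()
        End Using
    End Function
End Module
```

With this variant the caller would pass only the stream, for example ReadAllBytesFromStream(zipStream), and the block size no longer depends on the size of the compressed data.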
Since the data returned by the getRecords() web method is now compressed, you need to decompress it before it can be loaded into a dataset object. Modify the Load button event handler as follows:
    Private Sub btnLoad_Click( _
        ByVal sender As System.Object, _
        ByVal e As System.EventArgs) _
        Handles btnLoad.Click

        '---create a proxy obj to the web service---
        Dim ws As New dataWS.Service
        '---create a dataset obj---
        Dim ds As New DataSet
        '---create a stopwatch obj---
        Dim sw1, sw2 As New Stopwatch
        sw1.Start()
        Dim dsBytes As Byte() = ws.getRecords
        Label1.Text = "Size of download: " & dsBytes.Length

        '---perform decompression---
        Dim decompressed_dsBytes() As Byte
        sw2.Start()
        decompressed_dsBytes = Decompress(dsBytes)
        sw2.Stop()
        Label3.Text = "Decompression took: " & _
            sw2.ElapsedMilliseconds & "ms"
        '---decode with UTF8 to match the encoding used by getRecords---
        ds.ReadXml(New _
            IO.StringReader(UTF8.GetString(decompressed_dsBytes)))
        '---------------------------
        sw1.Stop()
        Label2.Text = "Time spent: " & sw1.ElapsedMilliseconds & "ms"
        DataGridView1.DataSource = ds
        DataGridView1.DataMember = "Employees_table"
    End Sub
I have also timed the decompression step separately so that you can see how much of the total time it accounts for.
That's it! Press F5 to test the program. As usual, click the Load button a few times and observe the size of the data as well as the time taken for each task. Figure 4 shows you the average time I observed.
Figure 4. Using compression
Analyzing the Numbers
It is good to examine the numbers that you have obtained so that you can understand the usefulness of using compression in your application.
Table 1 summarizes the data that you have obtained before using compression and after using it.
|Tasks|Size of Download (bytes)|Time taken (ms)|Decompression Time (ms)|
Table 1. Data obtained before and after using compression
First, observe that with compression, the data is reduced from 266330 to 124148 bytes, yielding a compression ratio (compressed size divided by original size) of about 46.6 percent. Although I measured only the decompression time (about 13ms), the compression time is about the same. The decompression time of 13ms is small compared to the time required to transmit and load the data (83ms). Overall, with compression, the time needed to populate the DataGridView control is slightly increased.
In experimenting with different data sizes, I observed that the compression and decompression times are more or less constant. For example, instead of loading the Employees table, I loaded the [Order Details] table. The uncompressed data size for the table is 343902 bytes, and after compression the size is 24987 bytes, giving a compression ratio of 7.3 percent. However, the decompression time is about the same as that for the smaller data block.
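The ratios quoted above are simply the compressed size divided by the uncompressed size. As a quick check of the arithmetic, here is a small sketch; the module name RatioSketch and the helper CompressionRatio are hypothetical names, and the sizes are the ones reported in the text.

```vbnet
Imports System.Globalization

Module RatioSketch
    '---hypothetical helper: compressed size as a percentage
    '   of the original size (lower is better)---
    Function CompressionRatio(ByVal originalSize As Long, _
                              ByVal compressedSize As Long) As Double
        '---the / operator on integral types yields a Double in VB---
        Return compressedSize / originalSize * 100.0
    End Function

    Sub Main()
        '---sizes reported for the Employees table---
        Console.WriteLine(CompressionRatio(266330, 124148) _
            .ToString("F1", CultureInfo.InvariantCulture))   ' prints 46.6
        '---sizes reported for the [Order Details] table---
        Console.WriteLine(CompressionRatio(343902, 24987) _
            .ToString("F1", CultureInfo.InvariantCulture))   ' prints 7.3
    End Sub
End Module
```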
Interestingly, two blocks of data with the same size but different contents might yield very different compression ratios (the lower the number, the better), and text files compress much better than binary files such as .exe and .jpg files. For compression to be effective, the block of data to be compressed should be large; compressing a small block of data can actually inflate its size and wastes time in compressing and decompressing.
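The effect on small blocks is easy to observe for yourself. The following sketch compresses a five-byte input and prints both sizes; the module name SmallBlockSketch and the sample string are assumptions chosen for illustration. Because the gzip format adds a fixed header and trailer around the compressed data, the output for an input this small will be larger than the input.

```vbnet
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module SmallBlockSketch
    Sub Main()
        '---a deliberately tiny input (5 bytes)---
        Dim original() As Byte = Encoding.UTF8.GetBytes("hello")

        Using ms As New MemoryStream()
            '---leave ms open so its length can be read afterwards---
            Using zipStream As New GZipStream(ms, CompressionMode.Compress, True)
                zipStream.Write(original, 0, original.Length)
            End Using

            '---compare the two sizes; the exact compressed size
            '   depends on the framework version---
            Console.WriteLine("Original: " & original.Length & " bytes")
            Console.WriteLine("Compressed: " & ms.Length & " bytes")
        End Using
    End Sub
End Module
```

A check along these lines could be used to decide whether a payload is large enough to be worth compressing before it is sent.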
You should also note that using compression on the web service side will increase the workload of the web server, and hence you need to factor this into your consideration of whether to use compression or not.
In this article you have seen how to use the new compression classes in .NET 2.0. While the implementation of these classes is not as efficient as that of the compression utilities on the market (which Microsoft has admitted), they are nevertheless very useful in cases where you need to reduce your data size. What's more, they are free, and hence I have no complaints!
Wei-Meng Lee (Microsoft MVP) http://weimenglee.blogspot.com is a technologist and founder of Developer Learning Solutions http://www.developerlearningsolutions.com, a technology company specializing in hands-on training on the latest Microsoft technologies.