WindowsDevCenter.com

Using Data Compression in .NET 2.0

Adding Compression

To see how compression will improve the application we have built, let's modify the project so that it supports compression. On the web service's end, add the Compress() function in Service.vb as follows:



    Public Function Compress(ByVal data() As Byte) As Byte()
        Try
            '---the ms is used for storing the compressed data---
            Dim ms As New MemoryStream()
            Dim zipStream As Stream = Nothing

            zipStream = New GZipStream(ms, _
                            CompressionMode.Compress, True)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '                CompressionMode.Compress, True)

            '---compressing using the info stored in data---
            zipStream.Write(data, 0, data.Length)
            zipStream.Close()

            ms.Position = 0
            '---used to store the compressed data (byte array)---
            Dim compressed_data(CInt(ms.Length) - 1) As Byte

            '---read the content of the memory stream into 
            '   the byte array---
            ms.Read(compressed_data, 0, compressed_data.Length)
            Return compressed_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function

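Basically, this function compresses the data stored in a byte array using the GZipStream class and writes the compressed output into a MemoryStream object. Closing the GZipStream flushes any remaining compressed bytes; because True is passed as the third constructor argument, the underlying MemoryStream stays open afterward. The compressed data is then read back and returned as a byte array.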
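
As a point of comparison, the same logic can be written more compactly with Using blocks and MemoryStream.ToArray(). The following is only a sketch, not this article's implementation: the module name and the function name CompressWithUsing are hypothetical, and unlike Compress() it lets exceptions propagate instead of returning Nothing.

```vb
Imports System.IO
Imports System.IO.Compression

Public Module CompressSketch
    '---hypothetical variant of Compress(); the name is illustrative only---
    Public Function CompressWithUsing(ByVal data() As Byte) As Byte()
        Using ms As New MemoryStream()
            '---True leaves ms open after zipStream is disposed---
            Using zipStream As New GZipStream(ms, CompressionMode.Compress, True)
                zipStream.Write(data, 0, data.Length)
            End Using
            '---End Using closed zipStream, flushing all compressed bytes into ms---
            Return ms.ToArray()
        End Using
    End Function
End Module
```

ToArray() copies the entire contents of the memory stream regardless of its current position, so there is no need to reset Position or size a byte array by hand.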

To use this Compress() function, modify the getRecords() web method as follows:

    <WebMethod()> _
    Public Function getRecords() As Byte()
        Dim connStr As String = _
           "Data Source=.\SQLEXPRESS;Initial Catalog=Northwind;" & _
           "Integrated Security=True"
        Dim sql As String = "SELECT * FROM Employees"
        Dim conn As SqlConnection = New SqlConnection(connStr)
        Dim comm As SqlCommand = New SqlCommand(sql, conn)
        Dim dataadapter As SqlDataAdapter = New SqlDataAdapter(comm)
        Dim ds As DataSet = New DataSet()

        '---open the connection and fill the dataset---
        conn.Open()
        dataadapter.Fill(ds, "Employees_table")
        conn.Close()

        '---convert the dataset to XML---
        Dim datadoc As System.Xml.XmlDataDocument = _
           New System.Xml.XmlDataDocument(ds)
        Dim dsXML As String = datadoc.InnerXml

        '---perform compression---
        Dim compressedDS() As Byte
        compressedDS = Compress(UTF8.GetBytes(dsXML))
        Return compressedDS
        '-------------------------
    End Function

Adding Decompression

On the client's side, add the Decompress() function to the code-behind of Form1:

    Public Function Decompress(ByVal data() As Byte) As Byte()
        Try
            '---copy the data (compressed) into ms---
            Dim ms As New MemoryStream(data)
            Dim zipStream As Stream = Nothing

            '---decompressing using data stored in ms---
            zipStream = New GZipStream(ms, CompressionMode.Decompress)
            '---or---
            'zipStream = New DeflateStream(ms, _
            '                CompressionMode.Decompress, True)

            '---used to store the decompressed data---
            Dim dc_data() As Byte

            '---read the decompressed bytes out of zipStream
            ' into a byte array---
            dc_data = ExtractBytesFromStream(zipStream, data.Length)
            zipStream.Close()

            Return dc_data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function

The compressed data is copied into a memory stream object and then decompressed using the GZipStream class. The decompressed data is extracted into a byte array using the ExtractBytesFromStream() function, which is defined next:

    Public Function ExtractBytesFromStream( _
       ByVal stream As Stream, _
       ByVal dataBlock As Integer) _
       As Byte()

        '---extract the bytes from a stream object---
        Dim data() As Byte
        Dim totalBytesRead As Integer = 0
        Try
            While True
                '---progressively increase the size 
                ' of the data byte array---
                ReDim Preserve data(totalBytesRead + dataBlock)
                Dim bytesRead As Integer = _
                   stream.Read(data, totalBytesRead, dataBlock)
                If bytesRead = 0 Then
                    Exit While
                End If
                totalBytesRead += bytesRead
            End While
            '---make sure the byte array contains exactly the number 
            ' of bytes extracted---
            ReDim Preserve data(totalBytesRead - 1)
            Return data
        Catch ex As Exception
            Return Nothing
        End Try
    End Function

Because you do not know the actual size of the decompressed data, you have to progressively increase the size of the data array used to store it. The dataBlock parameter specifies the maximum number of bytes to read at a time. A good rule of thumb is to use the size of the compressed data as the block size, such as:

'---data is the array containing the compressed data
dc_data = ExtractBytesFromStream(zipStream, data.Length)
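
An alternative that avoids choosing a block size based on the input is to read through a small fixed buffer and let a MemoryStream grow as needed. This is a sketch of that idea rather than the method used in this article; the name ReadAllBytesFromStream and the 4096-byte buffer size are assumptions chosen for illustration.

```vb
Imports System.IO

Public Module StreamSketch
    '---hypothetical helper; reads a stream to its end---
    Public Function ReadAllBytesFromStream(ByVal source As Stream) As Byte()
        '---4096 is an arbitrary buffer size (indexes 0 to 4095)---
        Dim buffer(4095) As Byte
        Using result As New MemoryStream()
            Dim bytesRead As Integer = source.Read(buffer, 0, buffer.Length)
            While bytesRead > 0
                '---append only the bytes actually read---
                result.Write(buffer, 0, bytesRead)
                bytesRead = source.Read(buffer, 0, buffer.Length)
            End While
            Return result.ToArray()
        End Using
    End Function
End Module
```

In Decompress(), such a helper could be called as ReadAllBytesFromStream(zipStream) in place of ExtractBytesFromStream(zipStream, data.Length).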

Since the data returned by the getRecords() web method is now compressed, you need to decompress it before it can be loaded into a DataSet object. Modify the Load button event handler as follows:

    Private Sub btnLoad_Click( _
       ByVal sender As System.Object, _
       ByVal e As System.EventArgs) _
       Handles btnLoad.Click

        '---create a proxy obj to the web service---
        Dim ws As New dataWS.Service
        '---create a dataset obj---
        Dim ds As New DataSet
        '---create a stopwatch obj---
        Dim sw1, sw2 As New Stopwatch

        sw1.Start()
        Dim dsBytes As Byte() = ws.getRecords
        Label1.Text = "Size of download: " & dsBytes.Length

        '---perform decompression---
        Dim decompressed_dsBytes() As Byte
        sw2.Start()
        decompressed_dsBytes = Decompress(dsBytes)
        sw2.Stop()
        Label3.Text = "Decompression took: " & _
           sw2.ElapsedMilliseconds & "ms"
        '---decode with UTF8 to match the encoding used by getRecords()---
        ds.ReadXml(New _
           IO.StringReader(UTF8.GetString(decompressed_dsBytes)))
        '---------------------------

        sw1.Stop()
        Label2.Text = "Time spent: " & sw1.ElapsedMilliseconds & "ms"
        DataGridView1.DataSource = ds
        DataGridView1.DataMember = "Employees_table"
    End Sub

I have also timed the decompression step separately, so that you can see how much of the total time it accounts for.

That's it! Press F5 to test the program. As usual, click the Load button a few times and observe the size of the data as well as the time taken for each task. Figure 4 shows you the average time I observed.

Figure 4. Using compression

Analyzing the Numbers

It is good to examine the numbers that you have obtained so that you can understand the usefulness of using compression in your application.

Table 1 summarizes the data that you have obtained before using compression and after using it.

Task                  Size of download (bytes)   Time taken (ms)   Decompression time (ms)
Without compression   266330                     65                -
With compression      124148                     83                13

Table 1. Data obtained before and after using compression

First, observe that with compression, the data is reduced from 266330 to 124148 bytes, yielding a compression ratio (compressed size divided by original size) of roughly 47 percent. Although I measured only the decompression time (about 13ms), the compression time is about the same. The decompression time of 13ms is small compared to the time required to transmit and load the data (83ms). Overall, with compression, the time needed to populate the DataGridView control is slightly increased, from 65ms to 83ms.

In experimenting with different data sizes, I observed that the compression and decompression times are more or less constant. For example, instead of loading the Employees table, I loaded the [Order Details] table. The uncompressed data size for that table is 343902 bytes, and after compression the size is 24987 bytes, giving a compression ratio of 7.3 percent. However, the decompression time is about the same as when decompressing a smaller data block.
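
The ratios quoted above are simply the compressed size divided by the original size, expressed as a percentage. Here is a small sketch of that arithmetic, using the sizes reported in this article; the module and function names are hypothetical.

```vb
Imports System

Public Module RatioSketch
    '---compressed size as a percentage of the original size,
    '   rounded to one decimal place---
    Public Function RatioPercent(ByVal originalSize As Long, _
                                 ByVal compressedSize As Long) As Double
        '---the / operator on Long operands yields a Double in VB---
        Return Math.Round(compressedSize / originalSize * 100, 1)
    End Function

    Public Sub Main()
        '---Employees table: roughly 46.6---
        Console.WriteLine(RatioPercent(266330, 124148))
        '---[Order Details] table: roughly 7.3---
        Console.WriteLine(RatioPercent(343902, 24987))
    End Sub
End Module
```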

Interestingly, two blocks of data with the same size but different contents might yield very different compression ratios (the lower the number, the better it is). Text files are much more receptive to compression than binary files such as .exe and .jpg files. For compression to be effective, the block of data to be compressed should be large; compressing small blocks of data can actually inflate the data size and wastes time in compressing and decompressing.
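
To see the effect of block size for yourself, you could compress a very small buffer and a larger, repetitive one and compare the sizes. The sketch below is an illustration only: GZipSize is a hypothetical helper, and because the exact byte counts depend on the .NET version, no output is shown.

```vb
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Public Module SmallBlockSketch
    '---hypothetical helper: returns the GZip-compressed size of data---
    Public Function GZipSize(ByVal data() As Byte) As Integer
        Using ms As New MemoryStream()
            Using zipStream As New GZipStream(ms, CompressionMode.Compress, True)
                zipStream.Write(data, 0, data.Length)
            End Using
            Return CInt(ms.Length)
        End Using
    End Function

    Public Sub Main()
        Dim small() As Byte = Encoding.UTF8.GetBytes("Hello")
        Dim large() As Byte = Encoding.UTF8.GetBytes(New String("A"c, 100000))
        '---the GZip header and trailer alone add 18 bytes, so a
        '   5-byte input always comes out larger than it went in---
        Console.WriteLine("{0} -> {1}", small.Length, GZipSize(small))
        Console.WriteLine("{0} -> {1}", large.Length, GZipSize(large))
    End Sub
End Module
```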

You should also note that using compression on the web service side will increase the workload of the web server, and hence you need to factor this into your consideration of whether to use compression or not.

Summary

In this article you have seen how to use the new compression classes in .NET 2.0. While the implementation of these classes is not as efficient as the third-party utilities on the market (which Microsoft has admitted), the classes are nevertheless very useful when you need to reduce your data size. What's more, they are free, so I have no complaints!

Wei-Meng Lee (Microsoft MVP, http://weimenglee.blogspot.com) is a technologist and founder of Developer Learning Solutions (http://www.developerlearningsolutions.com), a technology company specializing in hands-on training on the latest Microsoft technologies.

