WindowsDevCenter.com
oreilly.comSafari Books Online.Conferences.

advertisement


AddThis Social Bookmark Button

Hacking Word
Pages: 1, 2, 3, 4, 5

expert

Hack #89: Use Perl from Word

This hack shows you two very different ways of getting at Perl from within a Word macro.

Perl’s hard to beat for heavy-duty text processing, and if you’ve already got some Perl scripts lying around for performing certain tasks, you may want to use those from a Word macro rather than starting from scratch in VBA—a decidedly lighter-weight contender when it comes to text processing.

TIP: This hack assumes that you have Perl installed on your system and that you can run Perl scripts from the DOS command line. To download a free version of Perl for Windows, go to the ActiveState web site at http://www.activestate.com.

This hack demonstrates two ways to get at Perl code from a Word macro. The first is the more polished method and requires some special software from ActiveState, which lets you create standalone Windows .dll (dynamic link library) files. These files contain libraries of functions used to perform specific tasks. The second method is about as quick and dirty as they come and does not require any special software.

Creating COM Objects with the Perl Dev Kit

In addition to the free ActivePerl distribution, ActiveState sells software to help Perl developers create Windows applications. The Perl Dev Kit (PDK) lets you create standalone Windows executables, Microsoft installation files, and even .NET applications. You can try it free for 21 days, though any applications you build will expire at the same time as the evaluation period. You can, however, renew your trial period for an additional 21 days. The standard license costs $195. You can get the free trial version from http://www.activestate.com/Products/Perl_Dev_Kit/.

TIP: You need to download and install the PDK to do the rest of the stuff in this section. The method described in the next section, “Call Perl Directly with the VBA Shell Function,” doesn’t require the PDK.

One part of the PDK is PerlCTRL, which builds Windows .dll files from a Perl script. It involves a bit of setup work, but once you create the .dll, using it as a COM object [Hack #84] from VBA is a breeze.

This example is similar to “Use Python from Word” [Hack #88]; it shows you how to build a standalone interface to Perl’s split function. A detailed explanation of PerlCTRL is beyond the scope of this hack, which demonstrates only a simple example.

TIP: Although VBA also includes a split function, with Perl’s you can use a regular expression pattern as the delimiter rather than just a string, making it a much more powerful function. The following is the base Perl code used to build the COM object. It’s a simple wrapper around Perl’s built-in split function:

The following is the base Perl code used to build the COM object. It’s a simple wrapper around Perl’s built-in split function:

  package PerlSample;
  sub Split {
  my $pattern = shift;
  my $string = shift;
  my @list = split(/$pattern/, $string);
  return \@list;
  }

There are three main steps to turning this Perl code into a COM object:

  1. Create a template file using the PDK. The template file contains boilerplate code and examples of the information that PerlCTRL needs to generate the .dll.
  2. Modify the template file.
  3. Generate the .dll from the template.

First, create a new folder on your system and name it C:\PerlCOMSample\. Open a DOS command prompt and navigate to the folder you created. At the DOS prompt, type the following:

> PerlCtrl –t > template.pl

Now open the template.pl file in a text editor, such as Notepad. The file will look like the one shown in Figure 9-13.

Figure 9-13. The template file generated by PerlCTRL
Figure 9-13. The template file generated by PerlCTRL

The template file also includes three unique identifiers that Windows will use to keep track of your .dll. PerlCtrl creates these identifiers when you generate the template file. As the comments in the template file indicate, do not edit those lines. Otherwise, modify the template file as follows:

  package PerlSample;
  
  sub Split {
    my $pattern = shift;
    my $string = shift;
  
    my @list = split(/$pattern/, $string);
    return \@list;
  }
  
  =pod
  
  =begin PerlCtrl
  
  %TypeLib = (
  PackageName => 'PerlSample',
  TypeLibGUID => '{26798342-6F54-4271-9668-B4C0D31EB5C8}', # do NOT edit this
  line
  ControlGUID => '{BD48D84F-C5C9-4E3B-8E36-24E019E4F48D}', # do NOT edit this
  line
  DispInterfaceIID=> '{FF546B71-4492-4E07-BD44-1EDE507CB5A4}', # or this one
  ControlName => 'PerlSample',
  ControlVer => 1, # increment if new object with same ProgID
  # create new GUIDs as well
  ProgID => 'PerlSample.Split',
  DefaultMethod => '',

  Methods => {
    'Split' => {
	      RetType => VT_ARRAY|VT_VARIANT,
		        TotalParams => 2,
				NumOptionalParams => 0,
				ParamList =>[ 'pattern' => VT_BSTR,
				'string' => VT_BSTR ]      
	  },  
	}, 
	# end of 'Methods' 
	Properties => {  }
	, # end of 'Properties'  ); # end of %TypeLib
  
  =end PerlCtrl
  
  =cut

Again, use the three lines generated in your template file, not the ones shown in bold in this example.

Next, save this file as PerlCOMObject.ctrl in the same directory, and then run the following command at a DOS prompt:

> PerlCtrl PerlCOMObject.ctrl

You’ll see the following output:

    Created 'PerlCOMObject.dll'

Now you need to register the new .dll with Windows. At the DOS prompt, enter the following:

> regsvr32 PerlCOMObject.dll

In a few seconds, you’ll see the dialog shown in Figure 9-14, indicating that the .dll file was successfully registered.

Figure 9-14. Windows notifies you when your .dll file is registered successfully
Figure 9-14. Windows notifies you when your .dll file is registered successfully

Now you can call the .dll as a COM object from within a Word macro. Open Word and create the following macro in the template of your choice [Hack #50]:


  Sub TestPerlObject( )
  Dim pl As Object

  Set pl = CreateObject("PerlSample.Split")  
  Dim str As String  
  Dim var( ) As Variant  
  Dim v As Variant
  str = "Hello from Perl!"  
  var = pl.Split(" ", str)  
  For Each v In var    
  MsgBox v  
  Next v  
  End Sub

When you run the macro, you’ll see three dialog boxes displayed in sequence, each showing one of the words in the string “Hello from Perl!”

If you’d like to remove the .dll from your system, enter the following at a DOS command prompt:

> regsvr32 /u PerlCOMObject.dll

Call Perl Directly with the VBA Shell Function

VBA includes a function you can use to launch other Windows applications. At its simplest, the Shell function is roughly equivalent to entering a command at a DOS prompt. For example, enter the following in the Visual Basic Editor’s Immediate window [Hack #2] to launch the Notepad text editor:

   Shell("notepad.exe")

Because Perl is an executable file, you can use Shell to run Perl scripts. For example, if you had a Perl script called C:\foo.pl, you could enter the following in the Immediate window to run the script:

Shell("C:\perl\bin\wperl.exe C:\foo.pl")

After the Shell function executes and the executable program starts, the VBA code continues.

TIP: wperl.exe is the “windowless” Perl. When run, it won’t launch a new DOS window, unlike the regular perl.exe.

You can use the clipboard to pass and return values between VBA and Perl. For example, you can copy selected text to the clipboard and then call a Perl script that reads the clipboard, processes the text, and puts the result back on the clipboard to paste into your document.

However, the VBA macro might try to paste from the clipboard before the Perl script finishes. Thus, you also need a way to have VBA “wait” for the Perl script to finish. One solution is to use a semaphore; that is, have the macro create a temporary folder on your computer, and then have the Perl script delete it once it puts the script result on the clipboard. All you need is a few lines of VBA to check to see if the folder still exists and, if so, instruct the macro to wait a few seconds until the Perl script finishes.

TIP: For a more thorough discussion of semaphores, see http://interglacial.com/~sburke/tpj/as_html/tpj23.html and http://interglacial.com/~sburke/tpj/as_html/tpj24.html.

Since it's likely you'd want to access a variety of Perl scripts from within a Word macro, it's worthwhile to create a reusable function to act as a wrapper around the Shell function call to Perl. The following function takes three arguments: the name of the Perl script to run, the name of the semaphore folder the Perl script should delete when it finishes, and finally the maximum time to wait for the Perl script to run before giving up. The function returns a value of True if the Perl script deleted the semaphore folder, or False if the folder still exists when the time limit is reached. Put this code into the template of your choice [Hack #50]:

  Function RunPerl(sPerlScriptToRun As String, _
                   sSemFolderName As String, _
                   sngWaitMax As Single) As Boolean

  Dim sPerlPath As String
  Dim sFullShellCommand As String
  Dim sSemDir As String
  Dim sSemDirFullName As String
  Dim sngStartTime As Single

  ' Full path of "Windowless" Perl executable
  sPerlPath = "C:\perl\bin\wperl.exe"

  ' Get the full path from the environment variable
  sSemDirFullName = Environ("TEMP") & "\" & sSemFolderName

  ' Put quotes around full script path.
  ' This allows for spaces in script path names, common on Windows systems.
  sFullShellCommand = sPerlPath & " " & _
    Chr(34) & sPerlScriptToRun & Chr(34)

  ' Create semaphore directory, unless it already exists
  If Not LCase(Dir(sSemDirFullName, vbDirectory)) = LCase(sSemFolderName) Then
    MkDir (sSemDirFullName)
  End If

  ' Start the countdown to timeout
  sngStartTime = Timer

  ' Run Perl script
  Shell (sFullShellCommand)

  ' The script will stay in this loop until either
  ' the semaphore directory is deleted, or until the
  ' time limit set by sngMaxWaitTime has passed
  Do While LCase$(Dir$(sSemDirFullName, vbDirectory)) = _
                  sSemFolderName And _
     ((Timer - sngStartTime) < sngWaitMax)
    ' Display a countdown in status bar
    StatusBar = "Waiting " & _
      Int((sngWaitMax - (Timer - sngStartTime))) & _
      " more seconds for Perl ..."

  Loop

  If LCase$(Dir$(sSemDirFullName, vbDirectory)) = sSemFolderName Then
    ' Gave up waiting.
    RmDir (sSemDirFullName)
    StatusBar = "Gave up waiting for Perl"
    RunPerl = False
  Else
    ' Perl script successfully deleted semaphore folder
    StatusBar = ""
    RunPerl = True
  End If

End Function

To see an example of this function in action, and to borrow Tim Meadowcroft's example from Computer Science and Perl Programming (O’Reilly), the following code will demonstrate how to use Perl to standardize phone numbers that are in a variety of formats. (Note: This example uses U.K. phone numbers.)

For starters, the following Perl script called FixPhoneNumbers.pl pulls the text off the Windows clipboard, checks it using a series of regular expressions, then either puts the modified number on the clipboard, or the original, if it couldn't fix it. The standard ActiveState Windows Perl distribution includes the Win32::Clipboard module. Save this script as C:\FixPhoneNumbers.pl:

  use Win32::Clipboard;

  my $TEMP = $ENV{"TMP"};
  my $clipcontents = Win32::Clipboard();
  my $cliptext = $clipcontents->Get();
  my $num = PerlFixPhone($cliptext);

  if ($num != '') {
    $cliptext = $num
  }

$clipcontents->Set($cliptext);
rmdir("$TEMP/vba_sem") || die "cannot rmdir $TEMP\\vba_sem: $!";

sub PerlFixPhone {
    # Tests:
    # 020 xxxx xxxx : fine as is
    # xxx xxxx : assume 020 7xxx xxxx
    # 2xxx : Building 1 extension, assume 020 7457 2xxx
    # 8xxx : Building 2 extension, assume 020 7220 8xxx
    # 0171 xxx xxxx : convert to 020 7xxx xxxx
    # 0181 xxx xxxx : convert to 020 8xxx xxxx
    # Anything else is an error and should be ignored...
    #
    local $_ = shift;
    return $_ if /^020 \d{4} \d{4}$/;
    return $_ if s/^\s*(\d{3})[-\s]+(\d{4})\s*$/020 7$1 $2/;
    return $_ if s/^\s*(\d{3})[-\s]+(\d{4})[-\s]+(\d{4})\s*$/$1 $2 $3/;
    return $_ if s/^\s*(2\d{3})\s*$/020 7457 $1/;
    return $_ if s/^\s*(8\d{3})\s*$/020 7220 $1/;
    return $_ if s/^\s*0171[-\s]+(\d{3})[-\s]+(\d{4})\s*$/020 7$1 $2/;
    return $_ if s/^\s*0181[-\s]+(\d{3})[-\s]+(\d{4})\s*$/020 8$1 $2/;
    return '';
}

The following macro uses the RunPerl function shown above to run the FixPhoneNumbers.pl script. Put this code in the same template as the RunPerl function:

Sub UsePerlToFixSelectedPhoneNumber()
' Pass selected text to a Perl program
' to format/normalize phone numbers
Dim sel As Selection

Set sel = Selection
' Exit if selection doesn't include some text
If sel.Type = wdSelectionIP Then
  MsgBox "Please select some text first"
  Exit Sub
End If

' Copy selected text to clipboard for Perl
sel.Copy

' Run Perl script. If successful,
' paste in changed text from Perl
If (RunPerl(sPerlScriptToRun:="C:\FixPhoneNumbers.pl", _
            sSemFolderName:="vba_sem", _
            sngWaitMax:=5)) = True Then
    sel.Paste
Else
    MsgBox "Gave up waiting for Perl"
End If

End Sub

To see this macro in action, type the following four (U.K.) phone numbers into a Word document:

0171 123 6554
8000
220-8537
220 8537

Select each in turn and run the UsePerlToFixSelectedPhoneNumber macro from Tools -> Macro -> Macros. The macro will convert them to the following format, according to the rules laid out in the FixPhoneNumbers.pl script:

020 7123 6554
020 7220 8000
020 7220 8537
020 7220 8537

If your Perl program takes a long time to run, you may need to adjust the value passed in the sngWaitMax argument to the RunPerl function. This example sets it to five seconds, more than enough time for Perl to finish this little bit of text crunching.

—Sean M. Burke, Andy Bruno, and Andrew Savikas

Andrew Savikas is the VP of Digital Initiatives at O'Reilly Media, and is the Program Chair for the Tools of Change for Publishing Conference.


View catalog information for Word Hacks

Return to Windows DevCenter.