Talking Documents with SpVoice COM

This page describes how to use the Microsoft Speech API (SAPI) to make an Office document talk to the end user.

TL;DR

The final proof of concept, which makes a Word document speak arbitrary text, is available at https://github.com/0xbad53c/VBA-Talk-Dirty-To-Me/blob/main/payload.vba.

Introduction

While researching alternative ways to write binary files via Office macros, I stumbled across a hidden gem: the Microsoft Speech API. This API exposes COM objects, which means it can be called from VBA.

Discovery

As the original research goal was to identify new ways to write binary files from maldocs, the first step was to find interesting COM objects. Several targets stood out, but this article focuses on SAPI.SpFileStream.1, which exposes a Write method that appeared able to write data to a file.

To learn how to dump COM objects, refer to ired.team.

// COM object
SAPI.SpFileStream.1


   TypeName: System.__ComObject#{af67f125-ab39-4e93-b4a2-cc2e66e182a7}

Name   MemberType Definition                                          
----   ---------- ----------                                          
Close  Method     void Close ()                                       
Open   Method     void Open (string, SpeechStreamFileMode, bool)      
Read   Method     int Read (Variant, int)                             
Seek   Method     Variant Seek (Variant, SpeechStreamSeekPositionType)
Write  Method     int Write (Variant)                                 
Format Property   ISpeechAudioFormat Format () {get} {set by ref}    

What if we used this COM object to write an HTTP response body containing binary data (such as a DLL) to a file? The following code snippet could achieve this when combined with, for example, an XMLHTTPRequest or WinHttpRequest COM object.

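// VBA code
' SendReq is an existing XMLHTTP/WinHttpRequest object; tloc holds the target file path
If SendReq.Status = 200 Then
    Set objFSTRM = CreateObject("SAPI.SpFileStream.1")
    ' 3 = SSFMCreateForWrite: create (or overwrite) the file for writing
    Call objFSTRM.Open(tloc, 3, False)
    Call objFSTRM.Write(SendReq.responseBody)
    Call objFSTRM.Close
End If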

Sadly, the API appeared to add WAV headers to the binary data written to disk. I could not identify a workaround at this time.
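The WAV headers are less surprising once SpFileStream is used the way it appears to be intended: as an audio output stream for the speech engine. The following is a minimal sketch of that use, not part of the original research. The procedure name and the output path under %TEMP% are hypothetical and chosen only for illustration.

```vba
Sub SpeakToWavFile()
    ' 3 = SSFMCreateForWrite, the same mode value passed to Open in the snippet above
    Const SSFMCreateForWrite As Long = 3

    Dim objVOICE As Object
    Dim objFSTRM As Object
    Dim strPath As String

    ' Hypothetical output location, used for illustration only
    strPath = Environ$("TEMP") & "\sapi_demo.wav"

    Set objVOICE = CreateObject("SAPI.SpVoice.1")
    Set objFSTRM = CreateObject("SAPI.SpFileStream.1")

    ' Create the file and route the synthesized audio into it instead of the speakers
    objFSTRM.Open strPath, SSFMCreateForWrite, False
    Set objVOICE.AudioOutputStream = objFSTRM

    objVOICE.Speak "This sentence is written to a wave file."

    ' Closing the stream finalizes the file on disk
    objFSTRM.Close
End Sub
```

After running the macro, the file should play in any audio player, which matches the observation that the stream wraps whatever it writes in a WAV container.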

However, I kept digging and identified another interesting COM object related to the Speech API: SAPI.SpVoice.1. The Speak method of the Windows SpVoice interface allows you to speak text, even from VBA, as shown here.

// COM object
SAPI.SpVoice.1


   TypeName: System.__ComObject#{269316d8-57bd-11d2-9eee-00c04f797396}

Name                                   MemberType Definition                                                 
----                                   ---------- ----------                                                 
DisplayUI                              Method     void DisplayUI (int, string, string, Variant)              
GetAudioOutputs                        Method     ISpeechObjectTokens GetAudioOutputs (string, string)       
GetVoices                              Method     ISpeechObjectTokens GetVoices (string, string)             
IsUISupported                          Method     bool IsUISupported (string, Variant)                       
Pause                                  Method     void Pause ()                                              
Resume                                 Method     void Resume ()                                             
Skip                                   Method     int Skip (string, int)                                     
Speak                                  Method     int Speak (string, SpeechVoiceSpeakFlags)                  
SpeakCompleteEvent                     Method     int SpeakCompleteEvent ()                                  
SpeakStream                            Method     int SpeakStream (ISpeechBaseStream, SpeechVoiceSpeakFlags) 
WaitUntilDone                          Method     bool WaitUntilDone (int)                                   
AlertBoundary                          Property   SpeechVoiceEvents AlertBoundary () {get} {set}             
AllowAudioOutputFormatChangesOnNextSet Property   bool AllowAudioOutputFormatChangesOnNextSet () {get} {set} 
AudioOutput                            Property   ISpeechObjectToken AudioOutput () {get} {set by ref}       
AudioOutputStream                      Property   ISpeechBaseStream AudioOutputStream () {get} {set by ref}  
EventInterests                         Property   SpeechVoiceEvents EventInterests () {get} {set}            
Priority                               Property   SpeechVoicePriority Priority () {get} {set}                
Rate                                   Property   int Rate () {get} {set}                                    
Status                                 Property   ISpeechVoiceStatus Status () {get}                         
SynchronousSpeakTimeout                Property   int SynchronousSpeakTimeout () {get} {set}                 
Voice                                  Property   ISpeechObjectToken Voice () {get} {set by ref}             
Volume                                 Property   int Volume () {get} {set}                           
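Several of the listed members can be combined in a few lines. The sketch below is an illustration under assumptions, not code from the original research: the procedure name and the chosen Rate and Volume values are arbitrary, and the flag value 1 is assumed to be SVSFlagsAsync from the SpeechVoiceSpeakFlags enumeration.

```vba
Sub SpeakWithSettings()
    ' Assumed value of SVSFlagsAsync: Speak returns immediately instead of blocking
    Const SVSFlagsAsync As Long = 1

    Dim objVOICE As Object
    Dim objTokens As Object
    Dim i As Long

    Set objVOICE = CreateObject("SAPI.SpVoice.1")

    ' List the installed voices in the Immediate window
    Set objTokens = objVOICE.GetVoices("", "")
    For i = 0 To objTokens.Count - 1
        Debug.Print i & ": " & objTokens.Item(i).GetDescription
    Next i

    ' Arbitrary example values; Rate accepts -10 to 10, Volume accepts 0 to 100
    objVOICE.Rate = 2
    objVOICE.Volume = 80

    ' Queue the text and continue executing while it is spoken
    objVOICE.Speak "This sentence is spoken in the background.", SVSFlagsAsync

    ' Block until the queued speech has finished; -1 waits without a timeout
    objVOICE.WaitUntilDone -1
End Sub
```

Asynchronous speech keeps the macro responsive while the text is read out. Calling WaitUntilDone before the procedure ends avoids the voice object going out of scope mid-sentence.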

Proof of Concept

At this point, developing a proof of concept was trivial. The idea was to create a simple Word document for a phishing test. The speech should scare the targeted user into never accepting macros again. The result is shown below:

// VBA code
Sub run()
    Dim objVOICE As Object

    'create SpVoice interface COM object instance
    Set objVOICE = CreateObject("SAPI.SpVoice.1")
    
    Dim i As Integer
    i = 1
    
    'Scare the user into thinking this is ransomware
    objVOICE.Speak "Phishing alert! Hacker Detected. Pay 1 bitcoin to unlock your computer. Encrypting your files in 10 9 8 7 6 5 4 3 2 1."
    
    'repeat the alert 10 times (i runs from 1 to 10)
    Do While i <= 10
        objVOICE.Speak "Phishing alert! Hacker Detected."
        i = i + 1
    Loop

    'Inform the user about the test, which is the responsible thing to do
    objVOICE.Speak "This was a test, you failed."
End Sub

Conclusion

In this article, we identified a way to use the Microsoft Speech API from Office macros and other scripts to talk to users. What to do with it is up to the creativity of the developer! The example in this research covers a potential educational phishing scenario.
