the ceph zone

14 Dec 23

Using the new Siri voices in Shortcuts

Warning: The process described in this post is not safe and digs into the internals of a system application on macOS. Any modifications you make based on this article are at your own risk and may cause data loss in Shortcuts, or have other unforeseen consequences.

iOS 17 and macOS 14 shipped in September 2023 with new voice options for Siri in some regions. Broadly, these new voice options cover accents that tend to be more "regional" than their other options; for British English users, it's nice to have an option other than masculine and feminine shades of stiff Received Pronunciation.

In addition, presumably due to their models being developed after improvements in the underlying TTS model training, these voice options seem to have a slightly more natural tone to their output. I'm partial to Siri Voice 3 (UK), a roughly Mancunian voice option with shades of Philomena Cunk. Here's how Siri Voice 3 (UK) sounds compared to Siri Voice 1 (UK):

Notice how the “eye” sound in “I'm” in Siri Voice 1 is clipped, sounding more like “Hi, mmm Siri”.

Some clipping of pronunciation is clear on Siri Voice 1, and it generally has a more robotic affect; it's not major, but I find Siri Voice 3 easier to listen to. As such, once iOS 17 and macOS 14 rolled around, I switched my devices to using Siri Voice 3 for Siri and Narration. However, it's a bit trickier to make use of Siri Voices outside of these cases.

I was playing around with some local LLM models this week in the evenings, desiring some way to produce speech from the model's responses. I specifically wanted to use the Siri voices because they sound more natural in conversation. Given that Siri is part of the Text to Speech system of macOS, it shouldn't be that hard to use our new favourite Siri voice to give some life to these prompts, so I set about trying to find a quick and dirty way to feed the response of privateGPT's chat completion API into Siri voices.

Text-to-speech is fragmented in the House of Apple

Macs have long had various implementations of text-to-speech and narration, going back to MacinTalk in 1985 (though not until System 7 shipped on CDs was it a standard feature). OS X 10.4 marked the introduction of VoiceOver, which significantly expanded narration and TTS capabilities on Macs; this included the CLI tool say, beloved by pranksters for the easy ability to make unattended computers appear haunted via SSH, or perhaps to get your room-mate to let you in when you forget your keys.

XKCD 530: I'm an Idiot. © Randall Munroe
XKCD comic 530 is an example from 2009 of nerds finding creative uses of say.
(© Randall Munroe)

say continues to be in macOS, and you can even get your nice Siri voice.

say here uses whatever TTS voice is selected as the system voice if none is specified. Note that you couldn't choose a Siri voice as the default system voice until macOS 13.

Maybe we need to put on our airs and graces for a bit though, so let's select Siri Voice 1 (UK) instead. Per the documentation, say -v lets you choose which voice to use, with say -v '?' listing the available voices. Let's look for Siri Voice 1 (UK):

bash-5.2$ say -v '?' | grep -i Siri
bash-5.2$ 

Oh. That's weird. You can't actually choose a Siri voice, only use the Siri voice if it's a system default. But say is an old application using the Speech Synthesis Manager, whose docs are covered in big scary deprecation notices, and modern Apple devices have more modern APIs for speech synthesis. Surely we can use those to have arbitrary on-demand sentences rendered using the Siri Voices?

Apparently not; as this user points out on StackOverflow, the blessed AVSpeechSynthesizer explicitly doesn't support narration with Siri voices. What gives? Perhaps Apple are worried about their brand being ruined by putting their Siri voices over TikTok videos or something.

There is one way that officially supports having Siri directly speak lines of the user's choosing — Apple's Shortcuts.

A Shortcut to Siri

For those who are unfamiliar, Shortcuts is an application which is kind of like IFTTT or any other visual workflow builder. It is a worse, over-simplified version of the long-time Automator application in macOS, primarily designed to do basic workflow automation on iOS devices, which then worked its way back to macOS, albeit in an existence where desktop use is an afterthought. But hey, its limited actions are freely shareable across devices and can be triggered by Siri, so that's nice.

One thing that Shortcuts can do is read out any arbitrary text. Unlike other applications, it's not shackled by some weird insistence by Apple that the Siri voices not be used, presumably because Siri-activated Shortcuts would sound weird if they switched voice part of the way through. Here's a simple shortcut that does just that.

The Speak Text action lets you feed any text to be read out by TTS voices, including Siri.

Okay, let's switch this to Siri Voice 3 now… wait, where is Siri Voice 3?

Screenshot of Voice selection dropdown in Shortcuts, only listing Siri Voice 1
No Siri for you! Well, some Siri for you.

Weirdly, if we use the Ask Each Time option and run it, Siri Voices 3 and 4 are visible and work if selected:

Screenshot of voice selection sheet when using Ask Each Time, including Siri Voices 1, 2 and 4.
Ask and you shall receive — but only when asked.

At this point I'm not sure what exactly is going on here. Maybe this is a compatibility thing for sharing shortcuts with devices that aren't new enough to have the iOS 17/macOS 14 voices? Maybe it's simply a bug? Maybe Apple is once again concerned that its shiny new voices will be used for evil? Either way, it doesn't work as-is, and there's no way to make it so in Shortcuts.

Nevertheless undeterred, I wrote a shortcut slowly and painfully which could transcribe voice prompts, send them to privateGPT through the completions API, and then read out the response. It worked, but I wasn't satisfied with Siri Voice 1. Knowing that it's probably as simple as just manually forcing the Shortcut to use Siri Voice 3, I proceeded to figure out how to do that.

Protecting Shortcuts from You

Apple really doesn't want you to create or modify Shortcuts any way except using their app.

The first step in actually editing the Shortcut would be finding a Shortcut file. Since Apple believes users in 2023 might die of fright if exposed to the concept of a file, Shortcuts does not use files for Shortcuts in the app itself. No Save As or Reveal are available in Shortcuts; individual shortcuts can be located by quick Spotlight search, but are absent in Finder's search results.

Shortcuts can be pinned to the Dock; doing Show in Finder on the Dock icon and opening the package contents reveals they are merely stub applications with a ShortcutMetadata.plist file which tells Shortcuts which shortcut should be launched; the actual shortcut contents are nowhere to be found.

Despite some heavy attempts to get you to use shareable iCloud links instead, Shortcuts does actually allow you to export individual shortcuts as files. Let's export a copy of our shortcut and see if we can modify that:

Screenshot of export from Shortcuts. The screen is captioned with the following warning: “When you export a shortcut file for anyone, Apple will validate a copy of your shortcut using iCloud.”
The “For” option here has a rather interesting disclaimer.

This already smells bad. You have to choose who you're exporting for? “Validate a copy using iCloud”? Apple seems to have some heavy restrictions in place here. I can't blame them for being vigilant against abuse for shortcuts, since they can do things that cost you money like sending messages or making phone calls, but it doesn't bode well here. We'll plough ahead regardless.

When trying to manipulate data in an unknown form, it's always really useful to gather as much information as you can about the file format, such as whether it's a standard or custom format, or contains interesting information. Let's start with a really basic use of the file command, common on all UNIX-style systems for guesstimating file formats based on known signatures. Let's run file on our exported shortcut.

bash-5.2$ ls
say-anything.shortcut
bash-5.2$ file say-anything.shortcut
say-anything.shortcut: data

The bad start doesn't get any better; file has no clue what kind of file this is, nor is it any sort of text-based format. This is concerning because it usually means one of two things:

  1. The entire content of the file is encrypted, giving no discernible clue to its contents.

  2. It uses a completely custom binary format unknown to file.
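
To make concrete what file is doing when it gives up, here is a minimal sketch of signature-based sniffing in Python. The signature table is a tiny illustrative subset chosen for this post (it is not file's real database), and the function names are made up for the example.

```python
# Minimal sketch of signature-based format sniffing, in the spirit of file(1).
# SIGNATURES is a small illustrative subset, not file's real magic database.
SIGNATURES = [
    (b"SQLite format 3\x00", "SQLite 3.x database"),
    (b"bplist00", "Apple binary property list"),
    (b"<?xml", "XML document"),
    (b"PK\x03\x04", "Zip archive"),
]


def sniff(data: bytes) -> str:
    """Describe data by its leading bytes, or return 'data' if nothing matches."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    # file(1) prints plain "data" when no signature matches.
    return "data"


def sniff_path(path: str) -> str:
    """Sniff a file on disk by reading only its first few bytes."""
    with open(path, "rb") as handle:
        return sniff(handle.read(32))
```

A file that starts with none of the known signatures falls through to "data", which is all that answer really tells us: the header is unfamiliar, not that the contents are necessarily encrypted.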

file is not particularly sophisticated, though. For a bit more info, the incredibly useful binwalk tool attempts to identify known file formats which may be embedded within a single file. It was originally developed for extracting files from firmware blobs, update packages for phones and embedded devices and the like. Making everyone mad, I will use Nix but use it wrong by just quickly running binwalk on the fly:

bash-5.2$ nix run nixpkgs#binwalk -- say-anything.shortcut 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
57            0x39            Certificate in DER format (x509 v3), header length: 4, sequence length: 775
840           0x348           Certificate in DER format (x509 v3), header length: 4, sequence length: 742
1590          0x636           Certificate in DER format (x509 v3), header length: 4, sequence length: 579

We definitely have more information now, but it doesn't look good. The presence of X.509 certificates within the file implies it is signed and/or encrypted. Now the meaning of Apple's message on the export screen seems a bit clearer - presumably, all exported shortcuts are verified by Apple, who then signs them to make sure they aren't doing Naughty Things.

We'll do our due diligence and check out the file in a hex editor to see if we can find any clues as to whether the file is encrypted as well as signed, or whether the certificates might just be a false positive.

A screenshot of our shortcut in the Hex Fiend hex editor, at the start of the file. Strings are clearly present that indicate the presence of X.509 certificates.
“Apple System Integration CA” doesn't leave much room for interpretation nor does SigningCertificateChain.

Okay, this header part pretty unambiguously contains X.509 certificates “for functions internal to Apple Products and/or Apple processes”. As a note, the presence of bplist00 here indicates that this is some custom container, but one that contains binary PLists, which are a common format used in Apple software. Let's scrub through and see if anything outside this header is clearly in plaintext.
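
Scrubbing by eye works, but the same idea binwalk uses (searching the whole file for known signatures rather than only the start) is easy to sketch for the one marker we care about. This is a rough illustration, and find_all is a hypothetical helper name, not part of any tool used in this post.

```python
def find_all(data: bytes, needle: bytes) -> list[int]:
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        # Resume one byte later so overlapping matches are not skipped.
        start = data.find(needle, start + 1)
    return offsets
```

Reading the exported shortcut as bytes and calling find_all(contents, b"bplist00") would list where each embedded binary PList appears to begin. A hit only means the eight marker bytes are present at that offset; it does not prove a well-formed PList follows.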

A screenshot of our shortcut in the Hex Fiend hex editor, now at the end of the file. The string we initially input in our Shortcut can be seen.
We can see “I can say anything you want me to!” plain as day in the file.

Later in the container, most of the content of the shortcut is very clearly in plaintext, including our string. So the files aren't encrypted, but the fact that certificates are present in the file means that they are almost certainly signed to prohibit tampering. Binary file formats are usually very sensitive to changes in length of fields, but one-byte changes are usually fine unless some checksumming or signing is used to verify the file hasn't been tampered with. Let's see what happens with this small change.
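
Why would a single byte matter? As a rough sketch of the principle only: Apple's container evidently uses X.509 certificates, which this does not reproduce. The stand-in below uses an HMAC from Python's standard library, and the key is made up for the example.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the payload (a stand-in for a real signature)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(payload: bytes, key: bytes, signature: bytes) -> bool:
    """Check that the payload still matches the tag produced when it was signed."""
    return hmac.compare_digest(sign(payload, key), signature)


if __name__ == "__main__":
    key = b"hypothetical-signing-key"  # assumption: illustrative only
    original = b"I can say whatever you want me to!"
    signature = sign(original, key)

    tampered = original[:-1] + b"?"  # the same one-byte edit: '!' becomes '?'
    print(verify(original, key, signature))  # True
    print(verify(tampered, key, signature))  # False
```

Any verifier built along these lines rejects the file as soon as one byte differs, no matter how harmless the edit looks.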

Cropped screenshot of the changed portion of the file, changing `!` to `?`, which is highlighted.
This change slightly spoils what the outcome of this change will be.

Opening the file in Shortcuts presents us with this delightful message:

An alert dialogue from Shortcuts which says “Failed to extract the shortcut file data”.
Not exactly a surprising outcome.

Looks like it probably is signed. If you recall, there was a drop-down on the export dialogue which talked about who we were exporting for. Let's look at the other option.

Screenshot of the same export dialogue from Shortcuts as previous. The screen is captioned with the following warning: “Only people who have you in their contacts will be able to use the shortcut. Your contact info will be included in the shortcut file for verification. You can also use this option to make a personal backup of your shortcuts.”
How graceful of Apple to allow us to “make a personal backup of [our] shortcuts”!

The use of “verification” here implies there's still some signing going on; I won't go through the same steps again with this file, but it is indeed signed in a way that prevents trivial tampering. We could attempt to reverse-engineer this container format, extract the certificate and re-sign; after all, for the People Who Know Me option to work, the signing certificate should be located on this machine. However, that is a lot of work.

Taking the fight to Shortcuts

Maybe there's an easier way to get the shortcuts out? Let's investigate how the Shortcuts application actually stores this data. The home of application data in macOS is the $HOME/Library directory. Sometimes it's in the Application Support subdirectory, but in this case we have a Shortcuts directory directly in Library. Note that $HOME/Library/Shortcuts is a protected directory and accessing it in the terminal requires the Full Disk Access permission.

Looking in $HOME/Library/Shortcuts we can see some interesting files:

bash-5.2$ ls -lah
total 3.8M
drwxr-xr-x   11 cepheus staff  352 Dec 13 18:53 .
drwx------+ 140 cepheus staff 4.4K Oct 29 17:59 ..
-rw-r--r--    1 cepheus staff 6.1K Dec 13 18:51 .DS_Store
-rw-r--r--    1 cepheus staff  246 Dec 13 15:48 SecuredPreferences.plist
-rw-r--r--    1 cepheus staff  331 Dec 13 18:53 ShareSheetState.plist
-rw-r--r--    1 cepheus staff 704K Dec 13 16:59 Shortcuts.sqlite
-rw-r--r--    1 cepheus staff  32K Dec 13 16:59 Shortcuts.sqlite-shm
-rw-r--r--    1 cepheus staff 2.6M Dec 13 18:53 Shortcuts.sqlite-wal
-rw-r--r--    1 cepheus staff 1.5K Dec 13 18:53 Spotlight.dat
drwxr-xr-x    4 cepheus staff  128 Dec 13 15:50 Temporary
drwxr-xr-x    2 cepheus staff   64 Nov  2  2021 ssh

A sqlite database, by the looks? Let's make sure there's no chicanery:

bash-5.2$ file Shortcuts.sqlite
Shortcuts.sqlite: SQLite 3.x database, last written using SQLite version 3039005, writer version 2,
read version 2, file counter 650, database pages 176, 1st free page 176, free pages 2, cookie 0x136,
schema 4, largest root page 65, UTF-8, vacuum mode 1, version-valid-for 650

Yep, looks like Shortcuts just stores everything in one big SQLite database. Technically, it uses a long-supported Apple system framework known as Core Data, which includes iCloud sync and other features, but the engineers at Apple at that time had the foresight not to spend time and effort reinventing the wheel to make high reliability and good performance application data storage when SQLite had already done the work.

We'll quickly make sure nothing else has the SQLite database open with lsof.

bash-5.2$ lsof Shortcuts.sqlite
COMMAND     PID    USER   FD   TYPE DEVICE SIZE/OFF    NODE NAME
siriactio 27662 cepheus    5u   REG   1,24   720896 2580967 Shortcuts.sqlite

Looks like siriactionsd is running and has the database open. A swift kill 27662 will send a SIGTERM to siriactionsd - it seems to be an on-demand launchd agent that is only restarted when Shortcuts is reopened, so we don't have to be worried about it restarting. Let's open the database now.

bash-5.2$ sqlite3 Shortcuts.sqlite                 
SQLite version 3.39.5 2022-10-14 20:58:05
Enter ".help" for usage hints.
sqlite>

As expected, it opens like a regular database, because it is one. Let's take the lay of the land by checking what tables are available.

sqlite> .tables
ZACCESSRESOURCEPERMISSION
ZAUTOSHORTCUTSPREFERENCES
ZCLOUDKITSYNCTOKEN
ZCOLLECTION
ZLIBRARY
ZPERSISTEDSERIALIZEDPARAMETERS
ZSHORTCUT
ZSHORTCUTACTIONS
ZSHORTCUTBOOKMARK
ZSHORTCUTICON
ZSHORTCUTQUARANTINE
ZSHORTCUTRUNEVENT
ZSMARTPROMPTPERMISSION
ZTRIGGER
ZTRIGGEREVENT
ZTRUSTEDDOMAIN
ZVCVOICESHORTCUTMANAGEDOBJECT
ZVCVOICESHORTCUTSUGGESTIONLISTMANAGEDOBJECT
ZVCVOICESHORTCUTSYNCSTATEMANAGEDOBJECT
Z_4PARENTS
Z_4SHORTCUTS
Z_METADATA
Z_MODELCACHE
Z_PRIMARYKEY

The naming convention here is a little unfriendly, probably an artefact of the way Core Data names the tables it creates. Still it is readable, and we can probably infer a few things:

  • ZLIBRARY and ZCOLLECTION probably relate to the collection of shortcuts.

  • ZSHORTCUT probably stores the overall metadata for shortcuts.

  • ZSHORTCUTACTIONS probably stores the actual action data.
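
The same lay of the land can be taken from a script without any risk of writing to the database. A minimal sketch, assuming you point it at a copy of Shortcuts.sqlite (with its -wal and -shm files alongside) in a scratch directory; list_tables is a name invented for this example.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Open a SQLite database read-only and return its table names, sorted."""
    # mode=ro makes SQLite refuse any write attempted through this connection.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Opening read-only is a cheap safety net while exploring: an accidental UPDATE fails with an error instead of quietly changing application data.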

A good way of knowing where to look is to figure out where most of the data resides. The dbstat SQLite extension ships with most modern copies, including ours, so we can use it to figure out where most of the data in the DB is stored. SQLite lays data out in pages which are by default 4KB in size, so any table of interest will have multiple pages, which we can find with a short query.

sqlite> SELECT name, pgsize_tot FROM (SELECT name, SUM("pgsize") as pgsize_tot FROM "dbstat" GROUP BY name) WHERE pgsize_tot > 4096;
ZACCESSRESOURCEPERMISSION|53248
ZCLOUDKITSYNCTOKEN|8192
ZCOLLECTION|32768
ZSHORTCUT|126976
ZSHORTCUTACTIONS|94208
ZSHORTCUTRUNEVENT|12288
ZSMARTPROMPTPERMISSION|118784
Z_AccessResourcePermission_UNIQUE_identifier_shortcut|16384
Z_MODELCACHE|16384
sqlite_schema|24576

The largest table is ZSHORTCUT, which probably flags it as important. ZSMARTPROMPTPERMISSION is the next largest, but to me it doesn't look particularly related to shortcut definitions. ZSHORTCUTACTIONS is also fairly sizeable, so we'll check that out.

We'll start by inspecting the schema of ZSHORTCUT:

sqlite> .schema ZSHORTCUT
CREATE TABLE ZSHORTCUT (
    Z_PK INTEGER PRIMARY KEY,
    Z_ENT INTEGER,
    Z_OPT INTEGER,
    ZACTIONCOUNT INTEGER,
    ZHASOUTPUTFALLBACK INTEGER,
    ZHASSHORTCUTINPUTVARIABLES INTEGER,
    ZHIDDENFROMWIDGET INTEGER,
    ZLASTSYNCEDHASH INTEGER,
    ZRECEIVESONSCREENCONTENT INTEGER,
    ZREMOTEQUARANTINESTATUSVALUE INTEGER,
    ZRUNEVENTSCOUNT INTEGER,
    ZSYNCHASH INTEGER,
    ZTOMBSTONED INTEGER,
    ZTRIGGERCOUNT INTEGER,
    ZACTIONS INTEGER,
    ZCONFLICTOF INTEGER,
    ZICON INTEGER,
    ZQUARANTINE INTEGER,
    ZCREATIONDATE TIMESTAMP,
    ZLASTRUNEVENTDATE TIMESTAMP,
    ZMODIFICATIONDATE TIMESTAMP,
    ZACTIONSDESCRIPTION VARCHAR,
    ZASSOCIATEDAPPBUNDLEIDENTIFIER VARCHAR,
    ZGALLERYIDENTIFIER VARCHAR,
    ZLASTMIGRATEDCLIENTVERSION VARCHAR,
    ZLASTSAVEDONDEVICENAME VARCHAR,
    ZMINIMUMCLIENTVERSION VARCHAR,
    ZNAME VARCHAR,
    ZPHRASE VARCHAR,
    ZSOURCE VARCHAR,
    ZWORKFLOWID VARCHAR,
    ZWORKFLOWSUBTITLE VARCHAR,
    ZCLOUDKITRECORDMETADATA BLOB,
    ZIMPORTQUESTIONSDATA BLOB,
    ZINPUTCLASSESDATA BLOB,
    ZNOINPUTBEHAVIORDATA BLOB,
    ZOUTPUTCLASSESDATA BLOB,
    ZLASTSYNCEDENCRYPTEDSCHEMAVERSION INTEGER,
    ZWANTEDENCRYPTEDSCHEMAVERSION INTEGER,
    ZDISABLEDONLOCKSCREEN INTEGER,
    ZREMOTEQUARANTINEHASH BLOB,
    ZHIDDENFROMLIBRARYANDSYNC INTEGER
);

Some indices and triggers are also present but omitted here for clarity. As we can see, apart from the Z_PK column, which gives each row an ID, most of the leading columns are metadata for Core Data and its syncing functionality. Further down we see ZNAME VARCHAR and other application-level columns, but it doesn't seem like the actual workflow data is here. Still, let's quickly check this hypothesis.

sqlite> SELECT Z_PK, ZNAME FROM ZSHORTCUT;
...
33|Speak Text 1

There's our shortcut. Let's check if there's any useful data to be had for it:

sqlite> SELECT * FROM ZSHORTCUT WHERE Z_PK = 33;
33|7|51|2|0|0|1|8639015803512797916|0|0|9|8639015803512797916|0|0|33||34||724201945.032438|724204412.053341|724204412.051997|Text and Speak Text|||2038.0.2.4|Delta Cephei|900|Speak Text 1|||3048E4CE-7448-4402-B326-EA02B82A0BEE|2 actions|bplist00?`X$versionY$archiverT$topX$objects|bplist00|bplist00?


_WFAppContentItem_WFAppStoreAppContentItem_WFArticleContentItem_WFContactContentItem_WFDateContentItem_WFEmailAddressContentItem_WFFolderContentItem_WFGenericFileContentItem_WFImageContentItem_WFiTunesProductContentItem_WFLocationContentItem_WFDCMapsLinkContentItem_WFAVAssetContentItem_WFPDFContentItem_WFPhoneNumberContentItem_WFRichTextContentItem_WFSafariWebPageContentItem_WFStringContentItem_WFURLContentItem||bplist00|1|1|0|???b??|0

The blob fields all start with that suspiciously familiar bplist00, so it's probably safe to say these are binary PLists, and this time they appear to be raw binary PLists rather than something wrapped in a custom container. Still, nothing here looks like our action, which we know will contain our chosen text. On to ZSHORTCUTACTIONS!

sqlite> .schema ZSHORTCUTACTIONS
CREATE TABLE ZSHORTCUTACTIONS (
    Z_PK INTEGER PRIMARY KEY,
    Z_ENT INTEGER,
    Z_OPT INTEGER,
    ZSHORTCUT INTEGER,
    ZDATA BLOB
);
CREATE INDEX ZSHORTCUTACTIONS_ZSHORTCUT_INDEX ON ZSHORTCUTACTIONS (ZSHORTCUT);

This time, the index is included because it tells us an important fact: the data in ZSHORTCUTACTIONS is directly associated with a row in ZSHORTCUT, with the ZSHORTCUT column acting as a foreign key (even though no constraint is declared in the schema). The table is otherwise light on columns save a single blob, so it's fairly likely this is where the data for the shortcut actually exists. Since we know that 33 is the ID of the ZSHORTCUT we want, let's see what data we have.

sqlite> SELECT * FROM ZSHORTCUTACTIONS WHERE ZSHORTCUT = 33;
33|8|20|33|bplist00?
?_WFWorkflowActionIdentifier_WFWorkflowActionParameters_s.workflow.actions.gettext?     _WFTextActionTextTUUID_"I can say whatever you want me to!_$68CA0060-D18F-42DF-850B-C81AA7590BE7?
                                                                                                                                                                                         _i$_WFSpeakTextVoiceVWFText_WFSpeakTextLanguage?UValue_WFSerializationType?TTypeSAsk_WFTextTokenAttachment?#?Vstring_attachmentsByRangea???V{0, 1}?   !"ZOutputUUIDZOutputName\ActionOutputTText_WFTextTokenStringUen-GB

It's still binary gibberish to us, but we can clearly see our text "I can say whatever you want me to!" in here. Let's dump it and take a closer look. SQLite provides another default fileio extension which allows directly writing blobs to files.

sqlite> SELECT writefile('action.plist', ZDATA) FROM ZSHORTCUTACTIONS WHERE ZSHORTCUT = 33;
600

Dropping out of the SQLite shell, let's do a cursory inspection of action.plist with file.

bash-5.2$ file action.plist
action.plist: Apple binary property list

That looks more like it. Let's use Apple's plutil to convert it to the slightly-more-readable XML format that's conventionally used for text interchange (plutil -convert xml1 action.plist rewrites the file in place).

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<array>
  <dict>
    <key>WFWorkflowActionIdentifier</key>
    <string>is.workflow.actions.gettext</string>
    <key>WFWorkflowActionParameters</key>
    <dict>
      <key>UUID</key>
      <string>68CA0060-D18F-42DF-850B-C81AA7590BE7</string>
      <key>WFTextActionText</key>
      <string>I can say whatever you want me to!</string>
    </dict>
  </dict>
  <dict>
    <key>WFWorkflowActionIdentifier</key>
    <string>is.workflow.actions.speaktext</string>
    <key>WFWorkflowActionParameters</key>
    <dict>
      <key>WFSpeakTextLanguage</key>
      <string>en-GB</string>
      <key>WFSpeakTextVoice</key>
      <string>com.apple.ttsbundle.gryphon-neural_Martha_en-GB_premium</string>
      <key>WFText</key>
      <dict>
        <key>Value</key>
        <dict>
          <key>attachmentsByRange</key>
          <dict>
            <key>{0, 1}</key>
            <dict>
              <key>OutputName</key>
              <string>Text</string>
              <key>OutputUUID</key>
              <string>68CA0060-D18F-42DF-850B-C81AA7590BE7</string>
              <key>Type</key>
              <string>ActionOutput</string>
            </dict>
          </dict>
          <key>string</key>
          <string></string>
        </dict>
        <key>WFSerializationType</key>
        <string>WFTextTokenString</string>
      </dict>
    </dict>
  </dict>
</array>
</plist>

We can see that there is a key that corresponds to what we want!

<key>WFSpeakTextVoice</key>
<string>com.apple.ttsbundle.gryphon-neural_Martha_en-GB_premium</string>
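
Picking that key out by eye is fine for a two-action shortcut. For anything longer, Python's plistlib can read the binary blob directly, with no XML conversion step. A minimal sketch, assuming the layout shown above (an array of action dictionaries); speak_text_voices is a name invented for this example.

```python
import plistlib

SPEAK_TEXT = "is.workflow.actions.speaktext"


def speak_text_voices(blob: bytes) -> list:
    """Return the WFSpeakTextVoice value of every Speak Text action in the blob."""
    # plistlib.loads detects binary vs XML PLists on its own.
    actions = plistlib.loads(blob)
    voices = []
    for action in actions:
        if action.get("WFWorkflowActionIdentifier") != SPEAK_TEXT:
            continue
        params = action.get("WFWorkflowActionParameters", {})
        if "WFSpeakTextVoice" in params:
            # Returned as stored; the value may not always be a plain string.
            voices.append(params["WFSpeakTextVoice"])
    return voices
```

For the shortcut above, this should return a one-element list holding the Martha bundle name.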

The TTS voice here is picked by referencing a TTS bundle name. So what's the bundle name for Siri Voice 3? Well, let's find out the brute-force way. We can guess that there is some system library that defines these TTS bundles, so let's search using grep. Since we know there's some weird partitioning between Siri and non-Siri voices, let's make sure to include gryphon-neural to make sure we only turn up files with Siri voices.

bash-5.2$ grep -R 'com.apple.ttsbundle.gryphon-neural' /System/Library
grep: /System/Library/DirectoryServices/DefaultLocalDB/Default: Permission denied
... a bunch more permission denied errors ...
/System/Library/PrivateFrameworks/TextToSpeech.framework/Versions/A/Resources/VoiceIdSampleStringMap.plist:  <key>com.apple.ttsbundle.gryphon-neural_aaron_en-US_premium</key>
/System/Library/PrivateFrameworks/TextToSpeech.framework/Versions/A/Resources/.plist:  <key>com.apple.ttsbundle.gryphon-neural_aidan_en-IE_premium</key>

More plists - seems this one at /System/Library/PrivateFrameworks/TextToSpeech.framework/Versions/A/Resources/VoiceIdSampleStringMap.plist defines some sample strings of some kind for each of the voices. Right at the end of the file, there are some interesting entries:

<key>com.apple.ttsbundle.gryphon_en-GB-D_en-GB_premium</key>
<string>Hi, I’m Siri.</string>
<key>com.apple.ttsbundle.gryphon_en-GB-C_en-GB_premium</key>
<string>Hi, I’m Siri.</string>

Looks like this file defines the text that's spoken to preview each voice, and these are our new voices. Given that we want Siri Voice 3 (UK), we can guess that the correct bundle is probably com.apple.ttsbundle.gryphon_en-GB-C_en-GB_premium.
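
Rather than scrolling to the end of the file, the candidates for a given locale can be listed with a few lines of Python. This is a sketch under one loud assumption: that the PList's top level is a flat dictionary of voice ID to sample string, which the excerpt above suggests but does not prove. Both function names are hypothetical.

```python
import plistlib


def load_sample_map(path: str) -> dict:
    """Load a PList file; plistlib handles both the XML and binary encodings."""
    with open(path, "rb") as handle:
        return plistlib.load(handle)


def siri_voice_ids(sample_map: dict, locale: str) -> list[str]:
    """Return the voice IDs that mention 'gryphon' and the given locale, sorted."""
    return sorted(
        voice_id
        for voice_id in sample_map
        if ".gryphon" in voice_id and f"_{locale}_" in voice_id
    )
```

Calling siri_voice_ids(load_sample_map(path), "en-GB") narrows the list, but mapping an ID like en-GB-C to "Siri Voice 3" is still a guess that has to be confirmed by ear.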

Getting our voice back

Let's alter our action.plist and change WFSpeakTextVoice to what we think is the correct value, like so:

<key>WFSpeakTextVoice</key>
<string>com.apple.ttsbundle.gryphon_en-GB-C_en-GB_premium</string>

Next, we probably need to convert our PList back to binary. Thankfully, this is as easy as running plutil with another flag.

bash-5.2$ plutil -convert binary1 action.plist 
bash-5.2$ file action.plist 
action.plist: Apple binary property list

Now we're back to binary, we just need to insert our modified action back into the Shortcuts database, which we can do with a simple SQL query using readfile, another function provided by fileio which parallels the earlier used writefile.

bash-5.2$ sqlite3 Shortcuts.sqlite "UPDATE ZSHORTCUTACTIONS SET ZDATA = READFILE('action.plist') WHERE ZSHORTCUT = 33;"
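
For completeness, the whole edit (decode, change the voice, re-encode, write back) can also be sketched as one short Python script instead of a plutil and sqlite3 round trip. This is an untested alternative to the commands above, not what I actually ran: plistlib's binary output may differ byte-for-byte from Apple's, and the helper names are invented for the example. The same warning applies; work on a backup.

```python
import plistlib
import sqlite3

SPEAK_TEXT = "is.workflow.actions.speaktext"


def set_voice(blob: bytes, voice_id: str) -> bytes:
    """Return a binary PList copy of blob with every Speak Text voice replaced."""
    actions = plistlib.loads(blob)
    for action in actions:
        if action.get("WFWorkflowActionIdentifier") == SPEAK_TEXT:
            params = action.setdefault("WFWorkflowActionParameters", {})
            params["WFSpeakTextVoice"] = voice_id
    return plistlib.dumps(actions, fmt=plistlib.FMT_BINARY)


def update_shortcut_voice(conn: sqlite3.Connection, shortcut_pk: int, voice_id: str) -> None:
    """Rewrite the ZDATA blob for one shortcut with the new voice ID."""
    row = conn.execute(
        "SELECT ZDATA FROM ZSHORTCUTACTIONS WHERE ZSHORTCUT = ?", (shortcut_pk,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no actions stored for shortcut {shortcut_pk}")
    new_blob = set_voice(bytes(row[0]), voice_id)
    with conn:  # commits on success, rolls back if the UPDATE raises
        conn.execute(
            "UPDATE ZSHORTCUTACTIONS SET ZDATA = ? WHERE ZSHORTCUT = ?",
            (new_blob, shortcut_pk),
        )
```

Here update_shortcut_voice(conn, 33, "com.apple.ttsbundle.gryphon_en-GB-C_en-GB_premium") would be the equivalent of the manual steps, with siriactionsd stopped first as before.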

Now for the big moment - will our modification actually work?

Screenshot of the Library view of Shortcuts, with our shortcut visible.
It's always a relief when the application actually opens after making hacky changes.

Opening Shortcuts presents no immediate issues, so we haven't corrupted the database or malformed it so badly that the application crashes or hangs. How does the shortcut itself look?

Screenshot of the shortcut action we modified. The voice dropdown says “Daniel”.
Sorry Daniel, you're not who we're looking for.

That voice name is definitely not correct. But Siri Voice 3 didn't show up in the voice list at all, so maybe this is just a bug resulting from that. Let's try it, then!

Okay, so maybe you're not Daniel after all.

And there you have it! We successfully changed the voice, and it only took going down a massive rabbit hole.

A silly fix for a silly problem

Ultimately, there was no need for me to ever really do this. It's silly and annoying that this bug (or feature?) exists, but fixing it ended up in an unexpected and, dare I say, enjoyable game of figuring out how Shortcuts data is stored.

I wrote this up because a lot of people I talk to don't really know how to dip their toes in when it comes to light reverse-engineering of application data and data structures. While I was hacking on this problem, it felt like a nice opportunity to lay out a clear case-study that was (hopefully) not too difficult to follow.