“Alexa, it’s Peter who pays Paul”

Consider these commands:

  • “Alexa, ask Peter to pay Paul.”
  • “Alexa, ask Harry for a meeting with Sally.”

and compare with

  • “Computer, ask Replicator for coffee, black!”

The last one doesn’t sound right, does it? That’s because you don’t tell “the Computer” that it’s “the Replicator” that can make coffee, black. It knows that. So why should we keep telling Alexa what Peter and Harry can do – over and over again?

The Alexa team recently announced that you no longer have to explicitly enable skills. Great. I think we need to stay on this track of simplification and stop requiring skill names, or “invocation names” as they are called, in utterances. Or, at least, remove invocation names from short single-shot utterances. There’s just no room, and every command sounds like a relay race. We can’t get rid of “Alexa.” She’s beloved and literally a household name, with the potential of becoming a bigger brand name than Amazon itself. Plus, we don’t want to complicate the wake-up logic on the device too much. So if we can’t directly address Peter and Harry, they need to go.

I know. It’s a branding thing, and developers like myself will frown upon it. However, we can learn from successful Amazon sellers who have realized that it’s not the name of their store but time in the Buy Box that matters – and Amazon controls that based on what’s best for the customer. That motivates the collective of sellers to be better, become better, stay better. That’s customer-focused. Invocation names do nothing for the customer. Do I really care to ask Capital One for my expenses from yesterday – or do I just want to know my expenses from yesterday? Do I want to ask Zyrtec for the pollen count? Imagine if I were forced to select the seller before I could choose a product on amazon.com. How horrible would that experience be? It’s equally horrible to force the user to recite brand names.

Invocation names are perfectly reasonable for initial enablement or opening dialogues.

From an implementation point of view, yes, invocation names offer a graceful handoff from one domain (Alexa/Google Home) to another (the developer). However, removing invocation names shouldn’t complicate things too much, assuming all we are doing is akin to creating a model that’s an agglomeration of all utterances across all of my skills. Here “my skills” means the ones that I use, which are fewer than the ones I have enabled, which in turn are far fewer than the ones that are available (used < enabled << available). I am also making the simplifying assumption that utterances are disjoint across my skills. That should be valid except possibly for what Alexa can do herself, but that’s just an ordering problem: her built-in capabilities get checked first. We should be able to handle other collisions later. As long as the collection of utterances for a given skill is still that skill’s training set (and is not confused by the combining of all), we should still be okay. I feel that mapping an utterance to an intent (within a code space) doesn’t really need an invocation name to identify an utterance set. Perhaps we amend “the model” at skill activation time (or at first-use time, per the new method). Such dynamism, where models are no longer static per skill but influenced by the user and the usage, adds complexity no doubt, but the resulting solution’s building blocks should help in bringing context and personalization into models. They could help differentiate between the Kathy with a K at work and the Cathy with a C at home, though I pronounce them exactly the same way.
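To make the “agglomeration” idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under heavy assumptions: the skill names and sample utterances are made up, the helper names (normalize, build_user_model, route) are hypothetical, and exact string matching stands in for whatever statistical model a real voice platform uses. It merges the utterances of one user’s skills into a single lookup, checks the built-in capabilities first (the “ordering problem”), and records collisions so they can be handled later.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample data: source name -> sample utterances.
BUILT_IN: Dict[str, List[str]] = {"alexa": ["what time is it", "set a timer"]}


def normalize(utterance: str) -> str:
    """Lowercase, drop surrounding punctuation, and collapse whitespace."""
    return " ".join(utterance.lower().strip(" .?!").split())


def build_user_model(
    built_in: Dict[str, List[str]],
    my_skills: Dict[str, List[str]],
) -> Tuple[Dict[str, str], List[Tuple[str, str, str]]]:
    """Merge all utterances into one table of utterance -> owner.

    Built-ins are processed first, so they win any collision. Each
    collision is recorded as (utterance, kept_owner, rejected_owner).
    """
    model: Dict[str, str] = {}
    collisions: List[Tuple[str, str, str]] = []
    for source in (built_in, my_skills):
        for owner, utterances in source.items():
            for utterance in utterances:
                key = normalize(utterance)
                if key in model and model[key] != owner:
                    collisions.append((key, model[key], owner))
                    continue
                model[key] = owner
    return model, collisions


def route(model: Dict[str, str], utterance: str) -> Optional[str]:
    """Return the owner of an utterance, or None if nobody claims it."""
    return model.get(normalize(utterance))


# Only the skills this user actually uses go into the merged model.
my_skills = {
    "peter": ["pay paul", "set a timer"],
    "harry": ["schedule a meeting with sally"],
}
model, collisions = build_user_model(BUILT_IN, my_skills)
print(route(model, "Pay Paul."))  # routed without naming the skill
print(collisions)                 # overlaps left to resolve later
```

Because the table is built per user, amending it at skill activation or first use would amount to calling build_user_model again with one more entry in my_skills.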

In the meantime, developers should consider using invocation names that fit more naturally into a sentence than brand names do: “Alexa, ask my bank how much I spent yesterday”. Perhaps there’s an interim method where multiple invocation names are allowed for the same skill, so one can get specific when needed or use the variations to keep what’s being said semantically correct. Another method is to allow customers to come up with their own aliases, especially if Alexa has difficulty understanding a pronunciation of a brand name or if one wants to secure the functions of some known skills from visitors.
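As a rough illustration of the alias idea, the sketch below keeps a per-customer table that maps any number of spoken names to one skill. Everything here is hypothetical: the InvocationAliases class, its methods, and the skill identifier are invented for the example and do not correspond to any existing Alexa feature or API.

```python
from typing import Dict, Optional


class InvocationAliases:
    """Per-customer mapping from spoken names to a skill identifier."""

    def __init__(self) -> None:
        self._aliases: Dict[str, str] = {}

    def add(self, spoken_name: str, skill_id: str) -> None:
        """Register a spoken name (brand name or personal alias) for a skill."""
        self._aliases[spoken_name.lower().strip()] = skill_id

    def resolve(self, spoken_name: str) -> Optional[str]:
        """Return the skill for a spoken name, or None if it is unknown."""
        return self._aliases.get(spoken_name.lower().strip())


aliases = InvocationAliases()
aliases.add("Capital One", "capital-one-skill")  # hypothetical skill id
aliases.add("my bank", "capital-one-skill")      # customer-chosen alias
print(aliases.resolve("My Bank"))
```

A visitor who does not know the customer’s alias would get no match, which is the “securing” effect mentioned above, provided the brand name itself is not registered.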

What’s also acceptable is having to use the invocation name the first few times, after which she learns it’s Peter who pays Paul (for me).