Alexa’s Evolution Into An Ambient Assistant

July 22, 2021 / Carolina Milanesi

At Alexa Live 2021, Amazon’s annual developer conference focused on skills and services built for Alexa-enabled devices, the company shared a long list of improvements, from both a features and a developer tools perspective.

Seven years in the making, Alexa now counts hundreds of millions of devices in use, driving billions of weekly interactions across continents. In fact, the number of customers engaging with skills is growing 40% year over year, with solid growth in many categories, including music, audio games, and more.

Of all Echo devices, the Echo Show is the fastest growing. On average, multimodal skills built with the Alexa Presentation Language (APL) see more than 3x the monthly active customers of voice-only skills on these devices. And when developers implement APL features like APL video, they get nearly double the customer engagement of voice-only skills on multiple devices. This is a testament to how Alexa, and how we interact with her, have evolved over the years into a multimodal exchange that uses voice, touch and motion.

I have talked in the past about how not having a smartphone offering freed Amazon, at the start of this market, to pivot fully to voice-first interaction. This was necessary for users to change the very entrenched habit of relying on a screen as the primary way to interact with content. By weakening that dependence on the device that is always with us, we have been able to distribute our compute needs across more devices, leading us to what we now think of as ambient computing. It is only natural, then, that Alexa herself transitions from a voice assistant to an ambient assistant.

The Alexa Live opening keynote had two audiences in mind: users and developers. First, Amazon addressed Alexa’s users, outlining the goal of Alexa as an ambient assistant governed by privacy and transparency.

Proactive, Personal And Predictable

This is what Amazon wants Alexa to be both inside and outside the home, whether through your phone, in your ear or in your car. It also wants Alexa to get smarter, whether this means empowering more natural and personal interactions or using context to offer tailored suggestions or content proactively.

The more natural the interaction with Alexa is, the more engaged users feel. With the AI developments we have seen over the past couple of years, the focus has been on making sure Alexa adapts to you rather than the other way around. As a regular user, I have certainly noticed how Alexa is trying to self-learn by asking if she understood and answered a question correctly or if the device that replied to a question is the device I expected to hear from.

Alexa can also now adapt responses based on the context of the conversation and adjust her tone, stressing certain words and adding pauses and even breaths. Nothing adds more friction than a cheerful tone giving you the wrong answer to a question!

Alexa has also been getting more proactive, and Amazon added two more features to build proactive experiences: Event-Based Triggers and Proactive Suggestions. For instance, Alexa might suggest a workout playlist when a user starts a run. Of course, these options are only available to users who provide permission for a skill to utilize them. Amazon reminded us that privacy and transparency are at the core of how its products and solutions are designed. It minimizes the amount of data collected, limiting it to what is needed to power or improve customer experiences. I think of this as the RoD, the “return on data” – what do I get back from Amazon or Alexa for sharing my data? As long as there is value and trust, users will share data.

Ease of use is one way to increase engagement, but, of course, new use cases help too. For example, Amazon added a few new skills focused on entertainment:

  • INTERACTIVE MEDIA SKILL COMPONENTS + SONG REQUEST SKILL COMPONENT – this shortens the time it takes for radio, podcast and music providers to launch interactive experiences on Alexa, starting with song requests. More components, including Q&A with the show hosts, voice-driven polls and contests, are planned further down the line.
  • SHARED ACTIVITIES API – this brings asynchronous multi-player gaming to Alexa.
  • AMAZON MUSIC SPOTLIGHT FEATURE – a way for music artists to connect directly with fans by highlighting a song, album or playlist together with a personal message.

Discoverability, Engagement And Monetization

Skills suffer a fate similar to that of apps. The more skills there are, the harder it is to find what you are looking for, especially when the device you are using might not have a screen. So being able to surface skills that are relevant to users certainly increases the opportunity for developers. The addition of Featured Skill Cards to the home screen of smart devices is one way for users to discover new skills and interact with them right from the home screen. Another way is to use the new APL widgets, which offer rich, customizable, glanceable, self-updating views of skill content accessible from the home screen.

Last year, Amazon announced Name Free Interaction (NFI), which saved users from remembering the exact words to invoke a skill. On this framework, Amazon is now adding Skill Discovery Via Popular Phrases. For example, users can use popular phrases like “Alexa, I need a workout” to have Alexa offer a music streaming service with a workout playlist. Amazon also wants to make it easier for customers to keep using skills they love, which Personalized Skill Suggestions allows.

With the scrutiny app stores are getting around payments, it is no surprise that Amazon focused on diversifying how developers can earn money. First, developers benefit from a broader set of payment options, including the new Paid Skills. Not every skill experience lends itself well to subscriptions and entitlements. Paid Skills offers the ability to charge a one-time fee upfront to access the content in a skill. This is ideal for premium skills where customers are more likely to pay once to access the core skill experience.

Another new monetization feature is Amazon Associates on Alexa, which gives developers the ability to get a commission on items they recommend to users from Amazon. So, for instance, you could have a yoga skill and recommend a mat from Amazon.

Naturally, all these new features aimed at helping developers would mean very little without devices. There are over 140,000 Alexa-compatible products on the market today. To continue to grow this number, Amazon also announced during Alexa Live that Alexa Connect Kit, the service that allows device makers to integrate Alexa into their devices easily, would soon be available in over 25 countries, adding support for multiple new languages, including Spanish, French, Italian and Japanese. Furthermore, to increase device interoperability and shorten time to market, Amazon announced it will upgrade its Echo family of devices to support Matter, the new interoperability protocol developed by the Connectivity Standards Alliance.

Seven years in, assistants might not be as hot a topic of discussion, but this, in my view, is simply the result of having come down from the initial hype to deliver value to users. In the same way that the number of apps in a store no longer matters, it is not the number of skills but the time users spend using them, the range of activities those skills cover and the consistency of the engagement that tell the story of how the relationship between Alexa and her users is developing. So while I do not expect consumers to have the same love affair with smart devices as they have with smartphones, I expect the dependency on ambient assistants to continue to grow.
