maxdome meets Google Home


by Markus Ziller / April 3rd 2018

maxdome

Since 2006, ProSiebenSat.1 Media SE, one of the most successful independent media companies in Europe, has operated its own video-on-demand portal: maxdome. The online video library generates revenue via pay-per-view and subscriptions and is available on Smart TVs and PCs as well as mobile devices. maxdome offers over 50,000 titles in various genres. Our recent launch of deep integration with Google Home devices took us a step further toward increasing our reach and the number of supported devices, and made us one of the early adopters of what many believe to be a new user interface (UI) paradigm.

maxdome - VoiceUI

When Google announced their move from Mobile First to Artificial Intelligence (AI) First at Google I/O 2017, they not only followed Amazon’s lead, but also demoted their $30+ billion Android platform to a second-tier priority. With giants like Google and Amazon competing hard for users in the fairly new segment of voice assistants, it is fair to assume that AI in general and voice assistants in particular are here to stay.

There are plenty of definitions of what AI is and what it isn’t. We’re not here to discuss these differences; we see AI as an enabler for creating systems that allow for a more human-like and natural way of communication with machines. Artificial Intelligence comes in many different forms and shapes, be it a chatbot, image recognition, a neural network learning chess from scratch and outperforming existing top engines, or – in our case – understanding semi-natural spoken language. For us, VoiceUI is a voice user interface that allows people to use voice input to control computers and devices.

As a video-on-demand service, maxdome entertains customers, and we believe that voice is a great medium for providing an entertainment-centered user experience. Using our voice is the most natural way of human communication and allows us to convey a much wider range of emotions than, for example, written language. Since the entertainment industry, more than any other industry, is about evoking emotions, it seems only natural to bring Voice UI and video-on-demand together and explore the possibilities that blending the two offers, from hands-free content discovery to a frictionless user journey toward video content.


Read on to discover our voice assistant journey: the challenges we faced, the lessons we learned, and what it meant to be a launch partner for Google’s Voice Assistant.

maxdome’s road to Voice Assistants

Our first attempt at creating a voice assistant was in early 2017, when we added an Alexa Flash Briefing that enabled users to ask for maxdome’s “tip of the day”, a daily movie recommendation. In the end, this was little more than a “Hello World” skill, but it got us interested in voice and in exploring the possibilities this new UI offered.

[Image: maxdome - Tipp des Tages (tip of the day)]

When Google announced the launch of their smart speakers in Germany, we were immediately inspired to take the idea of “maxdome meets voice” one step further by deeply integrating our service into a voice device, covering the complete user journey from hands-free content discovery to playout on the TV.

For us, this meant going far beyond simply calling a public REST API and returning text to a handler function, which is how our “Hello World”-like Alexa app worked. A fully functional product that would add actual value to our customers meant bringing all the bits and pieces together that contribute to the user experience when accessing our service.

Content Discovery

Although we had invested a lot of brainpower and manpower in content discovery, making our content catalog voice-ready proved to be a new kind of challenge.

This was mostly due to Google’s approach to making our content browsable. maxdome’s content discovery is currently powered by a combination of various metadata sources that are fed into Solr. This gave us the greatest flexibility and performance for our test cases in the pre-voice era. However, it soon became apparent that our current approach wouldn’t fit into Google Home’s platform architecture and that a new manner of content discovery was needed for voice platforms.

Luckily, Google is excellent at providing a great user experience (UX) and had already dealt with that problem when designing their voice platform: instead of forcing content discovery via voice to follow the same approach as traditional content discovery, they decided it would be best to take all the existing content metadata and feed it into the very systems that already do the heavy lifting for Google’s Voice UIs.

This led to a whole new set of requirements when we faced the challenge of transforming all our metadata into schema.org-compatible data. Fortunately, maxdome has amassed some experience in distributing content to business partners over the past years, e.g. through our partnerships with various cable companies and even railway companies. In fact, video syndication has grown to be a significant pillar of our business model, and we are proud to be deeply integrated into a number of platforms and set-top boxes all over Germany.

Ultimately, we ended up with a lightweight Node.js microservice that wrapped around our existing services and provided an RSS feed-like way of ingesting our asset catalog in a schema.org-compatible format. While working with schema.org, we became accustomed to the level of standardization and interoperability it provides and are now working on making it our primary way of structuring video metadata.

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://catalog-service.googlehome.v0.maxdome.cloud/2018-03-29T07:25:20.980Z/1.json</loc>
    <lastmod>2018-03-29T07:25:20.980Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://catalog-service.googlehome.v0.maxdome.cloud/2018-03-29T07:25:20.980Z/2.json</loc>
    <lastmod>2018-03-29T07:25:20.980Z</lastmod>
  </sitemap>
</sitemapindex>

{
  "dateModified": "2018-03-28T14:47:01.695Z",
  "@context": "http://schema.org",
  "@type": "DataFeed",
  "dataFeedElement": [
    {
      "@context": "http://schema.org",
      "@type": "TVEpisode",
      "@id": "https://www.maxdome.de/12",
      "name": "Staffel 2 Episode 2",
      "url": "https://www.maxdome.de/tramitz-and-friends/s2/e2--12.html",
      "description": "ProSieben zeigt das Beste aus der hochklassigen Sketch-Comedy von und mit Christian Tramitz. Für seine Sketche lud Christian Tramitz über 40 befreundete Gäste aus Schauspiel und Comedy vor die Kamera: Von Götz Otto bis Til Schweiger - ein Mega-Meeting der Star-Elite.",
      "potentialAction": [
        {
          "@type": "WatchAction",
          "target": {},
          "expectsAcceptanceOf": [
            {
              "@type": "Offer",
              "category": "subscription",
              "availabilityStarts": "1970-01-01T00:00+02:00",
              "availabilityEnds": "2020-01-01T00:00+02:00",
              "eligibleRegion": [
                {
                  "@type": "Country",
                  "name": "DE"
                },
                {
                  "@type": "Country",
                  "name": "AT"
                }
              ],
              "seller": {
                "@type": "Organization",
                "name": "maxdome"
              }
            }
          ]
        }
      ],
      "actor": [
        {
          "@type": "Person",
          "name": "Christian Tramitz"
        }
      ],
      "image": {
        "@type": "ImageObject",
        "height": "295",
        "url": "http://09.static-maxdome.de/getAssetImage/objId:12/type:poster/width:204/height:295/imageId:10.jpg",
        "width": "204"
      },
      "episodeNumber": 2,
      "partOfSeason": {
        "@type": "TVSeason",
        "@id": "https://www.maxdome.de/216664",
        "seasonNumber": 2
      },
      "partOfSeries": {
        "@type": "TVSeries",
        "@id": "https://www.maxdome.de/218197",
        "name": "Tramitz and Friends"
      }
    }
  ]
}
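
A record like the one above could be produced by a small mapping function. The following is a minimal sketch, not the actual maxdome service: the `CatalogEpisode` shape, its field names, and the helper names `toTvEpisode` and `toDataFeed` are assumptions made for illustration. Only the schema.org property names are taken from the sample feed.

```typescript
// Hypothetical shape of an internal catalog record. The field names are
// assumptions for illustration, not maxdome's actual data model.
interface CatalogEpisode {
  id: number;
  title: string;
  path: string; // page path relative to the site root, starting with "/"
  description: string;
  episodeNumber: number;
  season: { id: number; number: number };
  series: { id: number; title: string };
  actors: string[];
  regions: string[]; // country codes such as "DE"
}

const SITE = "https://www.maxdome.de";

// Map one internal record to a schema.org TVEpisode object.
function toTvEpisode(e: CatalogEpisode): Record<string, unknown> {
  return {
    "@context": "http://schema.org",
    "@type": "TVEpisode",
    "@id": `${SITE}/${e.id}`,
    name: e.title,
    url: `${SITE}${e.path}`,
    description: e.description,
    potentialAction: [
      {
        "@type": "WatchAction",
        expectsAcceptanceOf: [
          {
            "@type": "Offer",
            category: "subscription",
            eligibleRegion: e.regions.map((name) => ({ "@type": "Country", name })),
            seller: { "@type": "Organization", name: "maxdome" },
          },
        ],
      },
    ],
    actor: e.actors.map((name) => ({ "@type": "Person", name })),
    episodeNumber: e.episodeNumber,
    partOfSeason: {
      "@type": "TVSeason",
      "@id": `${SITE}/${e.season.id}`,
      seasonNumber: e.season.number,
    },
    partOfSeries: {
      "@type": "TVSeries",
      "@id": `${SITE}/${e.series.id}`,
      name: e.series.title,
    },
  };
}

// Wrap a batch of episodes in a schema.org DataFeed envelope.
function toDataFeed(episodes: CatalogEpisode[], modified: Date) {
  return {
    dateModified: modified.toISOString(),
    "@context": "http://schema.org",
    "@type": "DataFeed",
    dataFeedElement: episodes.map(toTvEpisode),
  };
}
```

Keeping the mapping a pure function of the internal record makes it straightforward to unit-test against a known-good feed entry before anything is published.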

After getting the services up and running, all that was left on the content discovery front was to wait for Google to ingest all 50,000 titles into their system, analyze the contents, and index them for use by the systems powering their voice devices.
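
For a catalog of this size, the feed has to be split across several files that the sitemap index points to. The sketch below shows one way this could be done; the page size, the base URL, and the function names `chunk` and `buildSitemapIndex` are hypothetical, and only the `<timestamp>/<n>.json` URL layout is taken from the sample index shown earlier.

```typescript
// Split a list into fixed-size pages; the last page may be shorter.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const pages: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    pages.push(items.slice(i, i + size));
  }
  return pages;
}

// Escape the characters that are not allowed verbatim in XML text content.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build a sitemap index pointing at numbered feed files (1.json, 2.json, ...),
// all stamped with the time the feed was generated.
function buildSitemapIndex(baseUrl: string, pageCount: number, generatedAt: Date): string {
  const stamp = generatedAt.toISOString();
  const lines: string[] = ['<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'];
  for (let page = 1; page <= pageCount; page++) {
    lines.push("  <sitemap>");
    lines.push(`    <loc>${escapeXml(`${baseUrl}/${stamp}/${page}.json`)}</loc>`);
    lines.push(`    <lastmod>${stamp}</lastmod>`);
    lines.push("  </sitemap>");
  }
  lines.push("</sitemapindex>");
  return lines.join("\n");
}
```

With an assumed page size of 1,000 entries, 50,000 titles would be spread over 50 feed files, each listed once in the index.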

Single Sign-On

In parallel with getting our content ingested by Google, we had to think about how to integrate our existing user management into a UX paradigm specifically designed to work without textual user interaction.

We found that the easiest solution for this was another service that translates our existing session-based authentication mechanisms into token-based authentication that adheres to the OAuth 2.0 standard.

Although translating between stateful and stateless authentication works in general, we found that it brought a number of challenges. Keeping both worlds in sync and determining when an action is necessary proved to be quite a painstaking task, especially when taking into consideration all edge cases and covering functionality like password resets, the AVS PIN (age verification), or mapping JWT refresh tokens to maxdome’s corresponding auto-login feature.

Our approach to handling this was a dedicated microservice that encapsulates the required logic: on the one hand it reactively handles all user requests, and on the other it proactively keeps sessions alive, e.g. by creating and refreshing tokens and sessions regularly.
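
To make the idea more concrete, here is a minimal in-memory sketch of such a bridge between legacy sessions and tokens. It is not the actual maxdome service: `SessionTokenBridge`, `SessionBackend`, and `touch` are hypothetical names, the token values are placeholders rather than signed JWTs, and only the reactive part (issuing, resolving, and refreshing on request) is shown.

```typescript
// All names here are hypothetical and chosen for illustration only.
interface TokenPair {
  accessToken: string;
  refreshToken: string;
  expiresAt: number; // epoch milliseconds
}

interface SessionBackend {
  // Extends the legacy session and reports whether it is still valid.
  touch(sessionId: string): boolean;
}

interface AccessEntry {
  sessionId: string;
  expiresAt: number;
}

class SessionTokenBridge {
  private byAccess: { [token: string]: AccessEntry | undefined } = Object.create(null);
  private byRefresh: { [token: string]: string | undefined } = Object.create(null);
  private counter = 0;

  constructor(
    private backend: SessionBackend,
    private ttlMs: number,
    private now: () => number = () => Date.now(),
  ) {}

  // Issue a token pair for an already authenticated legacy session.
  issue(sessionId: string): TokenPair {
    const accessToken = this.nextId("at");
    const refreshToken = this.nextId("rt");
    const expiresAt = this.now() + this.ttlMs;
    this.byAccess[accessToken] = { sessionId, expiresAt };
    this.byRefresh[refreshToken] = sessionId;
    return { accessToken, refreshToken, expiresAt };
  }

  // Resolve an access token to its session id, or undefined if unknown or expired.
  resolve(accessToken: string): string | undefined {
    const entry = this.byAccess[accessToken];
    if (!entry || entry.expiresAt <= this.now()) {
      delete this.byAccess[accessToken];
      return undefined;
    }
    return entry.sessionId;
  }

  // Exchange a refresh token for a new pair, but only while the session is alive.
  refresh(refreshToken: string): TokenPair | undefined {
    const sessionId = this.byRefresh[refreshToken];
    if (sessionId === undefined) return undefined;
    delete this.byRefresh[refreshToken]; // refresh tokens are single-use in this sketch
    if (!this.backend.touch(sessionId)) return undefined;
    return this.issue(sessionId);
  }

  // Placeholder id generator; a real service would use cryptographically
  // random values or signed tokens instead of a counter.
  private nextId(prefix: string): string {
    this.counter += 1;
    return `${prefix}-${this.counter}`;
  }
}
```

Injecting the clock and the session backend keeps the edge cases (expired tokens, reused refresh tokens, sessions that ended on the legacy side) easy to exercise in tests.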

Ultimately, we realized that it might be time to sunset our existing session-based authentication solution in favor of a token-based solution that would also play nicely with 7Pass, ProSiebenSat.1’s company-wide single sign-on, and enable us to integrate our growing number of business partners more easily.

Playout

At this point, users were able to log in to maxdome via their Google Home device and browse our content catalog. However, without putting the content on the big screen, it remained doubtful that our customers would be as thrilled about our integration with Google Home as we were.

Fortunately, the components required to trigger playout from the Google Home mobile app and the Google Home Voice Device were (mostly) already in place.

For the former, all that remained was to create a service to put in front of our existing APIs, providing a Backend-for-Frontend layer that simplifies the required logic and retrieves only the required subset of our regular content metadata. The service ended up being so tiny that we decided to go completely serverless and run the Google Home Lane Service purely on AWS Lambda.
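
The core of such a Backend-for-Frontend layer is trimming a full metadata record down to what the client needs. The sketch below illustrates only that step under stated assumptions: the `FullAsset` and `LaneItem` shapes, the notion of a named "lane" of assets, and the `buildLaneResponse` function are hypothetical, and the response shape merely resembles a typical HTTP proxy response with a status code and a JSON body.

```typescript
// Hypothetical full record as an upstream API might return it; the index
// signature stands in for the many fields a client does not need.
interface FullAsset {
  id: number;
  title: string;
  description: string;
  posterUrl: string;
  [extra: string]: unknown;
}

// The reduced shape handed to the client.
interface LaneItem {
  id: number;
  title: string;
  posterUrl: string;
}

interface LaneResponse {
  statusCode: number;
  body: string; // JSON-encoded payload
}

// Keep only the fields the client actually renders.
function toLaneItem(asset: FullAsset): LaneItem {
  return { id: asset.id, title: asset.title, posterUrl: asset.posterUrl };
}

// Build the response for one lane from already fetched upstream assets.
function buildLaneResponse(lane: string | undefined, assets: FullAsset[]): LaneResponse {
  if (!lane) {
    return { statusCode: 400, body: JSON.stringify({ error: "lane is required" }) };
  }
  return {
    statusCode: 200,
    body: JSON.stringify({ lane, items: assets.map(toLaneItem) }),
  };
}
```

In a serverless setup, a thin handler would fetch the assets from the existing APIs and pass them to a function like this, which keeps the handler itself almost free of logic.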

Enabling the playout from a Google Home Voice Device turned out to be even simpler, since – as we learned earlier – the heavy lifting for content discovery on voice devices is performed by Google. All that remained was to connect our existing Google Chromecast implementation to Google Home Voice.

Wiring all the components

Having all the ingredients together, we were now able to provide a comprehensive hands-free user journey from content discovery to playout on the big screen using nothing more than a user’s voice.

Wiring everything together went pretty smoothly, and we were finally able to see the sum of everything we had built.

All in all, we had not only crafted a cool product in an exciting new technology, but had also found ways to improve our existing architecture that will ultimately lead to a better user experience for all.
