Creating TV Apps With Web Technology

Ever wondered how TV apps are made, and how that differs from the usual web or mobile app development? I have been fortunate enough to work on TV platforms for about a year now, and there is some exciting stuff for me to share on the subject. I will first give an introduction to TV platforms, then explain why we decided to use web technology to build TV apps, and finally go through some of the challenges we faced that are specific to TV app development.

I've broken the blog post down into sections, so feel free to skip ahead to the relevant ones if you are not interested in the background information.

Introduction to TV platforms

Many people own at least one TV. It is what we use to play console games and watch the latest shows and movies. We can even project our phones or computers onto them to show things to a bigger group. It is estimated that 259 million TV units will be sold by 2020 (source: Statista).

Here are just some of the major TV platforms that we support:

[Image: many TV platforms]

As shown in the image above, the platforms are a mixture of different types of devices. Some platforms are built into TVs (e.g. Samsung's Tizen, LG's webOS, Android TV). There are also external boxes that you can plug into any TV to turn it into a smart TV (e.g. Amazon Fire Stick, Apple TV, Android TV again, Chromecast, etc.), and game consoles that can act as a TV box on top of their gaming functionality. Finally, streaming platforms often come up with their own dedicated physical devices and operating systems (Roku, etc.). It is probably safe to say that fragmentation within TV platforms is a lot worse than on web or mobile.

Customers know the above platforms as "smart TVs". Why are they smart? Because they have apps.

Building TV Apps

This fragmentation is quite unfortunate for companies that need to release apps on TVs. So how might one go about developing a TV app? It is actually a bit like mobile app development, in that there are 3 general approaches:

  1. Native
  2. Web or PWA
  3. Hybrid

Let's look at each of them and weigh up the pros and cons.

Native Apps

In order to go Native, we will need to build a separate package (app bundle) for each platform. Given the number of platforms we need to support, this is both time-consuming and costly for the business, especially when most of the platforms use different languages with their own frameworks and libraries. The development process involves rewriting the same business logic in different languages, none of which is shareable. Needless to say, this is very inefficient and should be avoided if possible.

Mobile development has a strong argument for going Native because there are only 2 platforms to develop against. With 10 platforms, I doubt many companies could afford to build their app natively.

On the positive side, out of all the choices, building dedicated Native apps for each platform gives the most customizability and control. It would be possible to adapt apps to their respective platforms, or even make them totally different from each other if we wish.

Web App or PWA

Surprisingly, this is more viable than one might think. Most platforms support some kind of WebView, which is used to render web applications. It is how Progressive Web Apps (PWA) work on Android (mobile). However, the main drawback of this approach is the lack of native API support. Say I want to detect when the TV loses its internet connection and show an error notification as it happens. A pure web app won't be able to achieve that, as most TV platforms don't bake this into their WebView. These kinds of limitations make it difficult to build a high-quality app and, in some cases, even a basic one.

The main advantage of this approach is how easy it would be to deploy apps to all the platforms since all we have to do is to deploy the web app and then register it on each platform. Unfortunately, this approach is not ready just yet.

Hybrid Solution

This solution effectively combines the benefits of the first two approaches. We create a basic shell application for each native platform. Each shell has limited functionality of its own and provides a WebView with native API support. Inside the shell app, there is a web app that contains most of the app logic. This web app is shared across all platforms, which means less redundant code.

Here's a diagram to better illustrate how it all works:

[Image: hybrid solution diagram]

In this example, the shell application is written in Java and uses Cordova to expose native APIs to the WebView.
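
To make the split between shell and web app a bit more concrete, here is a minimal TypeScript sketch of how the shared web app might talk to a shell through a thin interface, using the connectivity example from earlier. All names here (NativeBridge, createConnectivityMonitor, FakeBridge) are hypothetical and are not part of Cordova's API; a real shell would implement the interface on top of whatever plugin it exposes.

```typescript
// Hypothetical contract between the shared web app and each native shell.
// None of these names come from Cordova; they only illustrate the idea.
interface NativeBridge {
  // Registers a listener that the shell calls when connectivity changes.
  onConnectivityChange(listener: (online: boolean) => void): void;
}

// Web-app side: depends only on the interface, never on a specific platform.
function createConnectivityMonitor(
  bridge: NativeBridge,
  showError: (message: string) => void,
): void {
  bridge.onConnectivityChange((online) => {
    if (!online) {
      showError("No internet connection");
    }
  });
}

// A fake shell, handy when running the web app in a desktop browser or in tests.
class FakeBridge implements NativeBridge {
  private listeners: Array<(online: boolean) => void> = [];

  onConnectivityChange(listener: (online: boolean) => void): void {
    this.listeners.push(listener);
  }

  // Simulates the native side reporting a connectivity change.
  simulate(online: boolean): void {
    this.listeners.forEach((listener) => listener(online));
  }
}
```

Because the web app only sees the interface, each platform's shell can supply its own implementation without any change to the shared code.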

TV Platform Specific Challenges

Before we dive into the challenges, let me first show you what kind of UX we are working with. The image below shows an outdated design because the new design is still sensitive information.

[Image: TV app UI design]

Note the following terminology, as we'll be referring back to it a lot in this section.

  • Tile - a rectangular block
  • Rail - a row of Tiles
  • Grid - multiple rows of Rails

Display

TV devices are meant to be used from far away, which means we had to make all the Tiles large enough for the text to remain readable from a few metres away. As a result, less information fits on the screen, which in turn means more user actions are needed to browse around. All of our UI therefore had to be simple and concise.

TVs come in different screen sizes (luckily with the same aspect ratio). How can we make the UI elements the right size for all devices no matter the screen size? We ended up going with viewport units to resolve this issue. It is still challenging to work with a percentage-based unit though, especially for animations, where we often have to use values with many decimal places.
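
As a rough illustration of the arithmetic involved, the sketch below converts a pixel measurement from a design into a vw value. It assumes the designs are drawn on a 1920px-wide canvas, which is my assumption for the example rather than something stated above, and the helper names are made up.

```typescript
// Assumed design reference: layouts drawn on a 1920px-wide canvas.
const DESIGN_WIDTH_PX = 1920;

// Converts a pixel measurement from the design into a CSS vw value,
// rounded to a fixed number of decimal places to keep stylesheets readable.
function pxToVw(px: number, decimals: number = 4): string {
  const vw = (px / DESIGN_WIDTH_PX) * 100;
  return `${Number(vw.toFixed(decimals))}vw`;
}

// Converts a vw value back to pixels for a given screen width.
function vwToPx(vw: number, screenWidthPx: number): number {
  return (vw / 100) * screenWidthPx;
}

const tileWidth = pxToVw(480);   // "25vw"
const tileGap = pxToVw(320);     // "16.6667vw"
const onUhd = vwToPx(25, 3840);  // 960
```

A 480px Tile in the design becomes 25vw, which renders as 960px on a 3840px-wide screen. The second value shows where the long decimals come from.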

Navigation

Navigation is a core part of any application; without it, there isn't much the user can do with the app. Unlike web or mobile, we can't click or tap on things to register an action. Instead, we are limited to TV controller inputs and, because of the fragmentation problem already identified, we want to keep things simple here. We support these actions: UP, DOWN, LEFT, RIGHT, SELECT, and BACK. The whole app's behaviour needs to be built around these basic interactions, and this brings some serious challenges.
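
One way to keep the rest of the app platform-agnostic is to translate raw key codes into these six actions as early as possible. The sketch below is illustrative only: the arrow and Enter codes are the standard DOM keyCode values, while the BACK codes per platform are assumptions that should be checked against each platform's documentation. The function and table names are made up.

```typescript
type RemoteAction = "UP" | "DOWN" | "LEFT" | "RIGHT" | "SELECT" | "BACK";

// Arrow keys and Enter use the standard DOM keyCode values.
const COMMON_KEYS: Record<number, RemoteAction> = {
  37: "LEFT",
  38: "UP",
  39: "RIGHT",
  40: "DOWN",
  13: "SELECT",
};

// BACK differs per platform. These values are assumptions for illustration
// and should be verified against each platform's documentation.
const BACK_KEYS: Record<string, number> = {
  tizen: 10009,
  webos: 461,
  browser: 8, // Backspace, convenient when developing in a desktop browser
};

// Translates a raw key code into one of the supported actions,
// or undefined when the key is not one the app handles.
function toAction(keyCode: number, platform: string): RemoteAction | undefined {
  if (keyCode === BACK_KEYS[platform]) return "BACK";
  return COMMON_KEYS[keyCode];
}
```

Everything downstream of toAction can then work with the six actions alone and never needs to know which device produced the key press.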

Focus is a big thing, so I've dedicated a whole section to it later on.

Browsing is controlled entirely by arrow keys, and each button press equates to one user action. If there is a long Rail, it will take tens of button presses to browse through all of its Tiles, which is not ideal. To improve the user experience here, we created "fast scroll" functionality. When the user presses and holds a button (e.g. LEFT), the focus moves through the Rail faster (they will see the Tiles shifting through) until they release the button.
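
The exact timing of our fast scroll is not something I will go into, but the idea can be sketched as a function of how long the button has been held. The config values and names below are hypothetical: the first press moves one Tile, and once the hold passes a delay the focus keeps shifting at a fixed interval until it reaches the end of the Rail.

```typescript
interface FastScrollConfig {
  holdDelayMs: number;    // how long the key must be held before fast scroll starts
  stepIntervalMs: number; // time between Tile shifts while fast scrolling
}

// Returns how many Tiles the focus has moved after the key has been held for heldMs.
function tilesMoved(heldMs: number, config: FastScrollConfig): number {
  // The initial press always moves exactly one Tile.
  if (heldMs < config.holdDelayMs) return 1;
  // After the delay, one extra Tile immediately, then one more per interval.
  const extra = Math.floor((heldMs - config.holdDelayMs) / config.stepIntervalMs) + 1;
  return 1 + extra;
}

// Computes the focused index on a Rail, clamped so focus never leaves the Rail.
function focusedIndexAfterHold(
  start: number,
  direction: -1 | 1,
  heldMs: number,
  railLength: number,
  config: FastScrollConfig,
): number {
  const target = start + direction * tilesMoved(heldMs, config);
  return Math.min(railLength - 1, Math.max(0, target));
}
```

With a 500ms delay and a 100ms interval, holding RIGHT for one second from the first Tile of a 20-Tile Rail lands on index 7, instead of the index 1 a single press would reach.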

With Grids, we have the same problem as with long Rails. The "fast scroll" technique is not going to help us much here, because we are now working in 2 dimensions, with both horizontal and vertical navigation. To help users find what they want on the Grid faster, we introduced the "wrap-around" concept: when the user presses LEFT on the leftmost Tile, the focus wraps around to the Tile on the far right. Using the screenshot example above, when the user presses LEFT on the currently focused Tile "Milan v Atalanta", the focus shifts to "Roma v Sampdoria".
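
Wrap-around boils down to modular arithmetic on the Tile index. A minimal sketch, assuming Tiles in a row are indexed from 0 and the row is not empty (the function name is made up):

```typescript
// Moves the focus index by delta (-1 for LEFT, +1 for RIGHT) and wraps at both ends.
// Assumes length > 0.
function wrapIndex(current: number, delta: number, length: number): number {
  // The double modulo keeps the result non-negative when moving left from index 0.
  return (((current + delta) % length) + length) % length;
}

const fromLeftEdge = wrapIndex(0, -1, 4); // 3, the rightmost Tile
const fromRightEdge = wrapIndex(3, 1, 4); // 0, the leftmost Tile
```

The extra "+ length" matters because the % operator in JavaScript returns a negative result for negative operands.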

Layered Views

It is worth knowing the structure of the app before we proceed to Focus management. The app we are building has 4 layers:

  • Player layer - video player
  • Page layer - shows rails
  • Overlay layer - transparent and appears on top of Page layer
  • Notification layer - for error handling, acting as modals

[Image: layer diagram]

The reason we have this multi-layer design is that the user must use the buttons specified in the Navigation section to move around the screen. The user can navigate back and forth between the different layers using the SELECT and BACK buttons. When the user moves between layers, the focus updates and lands on something appropriate based on which layer and content are being displayed. The layered design therefore gives the user a sense of depth in an otherwise very flat navigation flow.

Focus Management

Focus is a crucial part of the UX since we need a way to indicate to the user what they are focusing on. This was by far one of the most challenging parts for us.

One of the reasons focus is difficult to manage is that we require one and only one thing to be in focus at any point in time. Otherwise, we risk either having multiple elements in focus at the same time, or losing focus completely and leaving the user stuck in their journey.

We had to ensure focus is handed around the app correctly. This means between elements, pages and layers, and even when deep-linking. How do we know what to give focus to? To solve this problem, we created a focus map (a JavaScript object) which we use to look up where the focus should go in different situations. E.g. given the page, the currently focused element and the keypress, we can find out what to focus on next.

For deep-link journeys on TVs, when the user lands on a page with no prior focus, the app will assign focus to the default focusable element. This "default focus" is part of the focus map explained above.
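
Our real focus map is more involved, but a simplified sketch of the lookup might look like this. The page name, element ids and the shape of the map are all made up for illustration.

```typescript
type NavKey = "UP" | "DOWN" | "LEFT" | "RIGHT";

interface PageFocusMap {
  // Element that receives focus when the page has no prior focus (e.g. deep link).
  defaultFocus: string;
  // For each focusable element id: which element receives focus for each key.
  transitions: Record<string, Partial<Record<NavKey, string>>>;
}

// Hypothetical page and element ids, purely for illustration.
const focusMap: Record<string, PageFocusMap> = {
  home: {
    defaultFocus: "rail-0-tile-0",
    transitions: {
      "rail-0-tile-0": { RIGHT: "rail-0-tile-1", DOWN: "rail-1-tile-0" },
      "rail-0-tile-1": { LEFT: "rail-0-tile-0" },
      "rail-1-tile-0": { UP: "rail-0-tile-0" },
    },
  },
};

// Given the page, the currently focused element and the key, returns what to focus next.
function nextFocus(page: string, current: string | null, key: NavKey): string {
  const entry = focusMap[page];
  if (!entry) throw new Error(`No focus map for page "${page}"`);
  // Deep link or no prior focus: fall back to the page default.
  if (current === null) return entry.defaultFocus;
  const transitions = entry.transitions[current];
  // Unknown element: recover by going to the default rather than losing focus.
  if (!transitions) return entry.defaultFocus;
  // No transition for this key: stay put, so exactly one element remains focused.
  return transitions[key] ?? current;
}
```

Because the function always returns exactly one element id, it also helps with the "one and only one thing in focus" rule: there is no code path that returns nothing or several targets.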

Back Journeys

This is another topic that gave the whole team a lot of headaches. Put simply, the back journey is what happens when the user presses the BACK button. I think the main challenges here come from the UX and not the TV platforms, but I thought I'd mention it anyway, just because it caused us a lot of pain.

A back journey could mean any one of these things (knowing how complex it is, I have probably missed some):

  • Reset focus on the current page, often going back to the default focusable element
  • Close or dismiss a notification without affecting Browser History (no routing change)
  • Go between Overlay and Page layers (change in Browser History)
  • Go between different Pages (change in Browser History)

Any BACK button press could result in one of the above behaviours. The difficulty is working out which one, and it depends on several factors: where the user came from, where the user is right now, what is in focus, and whether any specific business logic applies.
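
To give a feel for the decision, here is a heavily simplified sketch of a resolver. The priority order, the type names and the context fields are my assumptions for the example, the Player layer is left out for brevity, and the real logic has many more business rules than this.

```typescript
type VisibleLayer = "page" | "overlay" | "notification";

type BackOutcome =
  | { kind: "dismissNotification" }        // no Browser History change
  | { kind: "closeOverlay" }               // Browser History change
  | { kind: "resetFocus"; target: string } // stay on the page
  | { kind: "previousPage" };              // Browser History change

interface BackContext {
  topLayer: VisibleLayer; // the top-most layer currently shown
  focusedId: string;      // element currently in focus
  defaultFocusId: string; // default focusable element of the current page
}

// Decides what a BACK press should do. The order of the checks is an assumption.
function resolveBack(ctx: BackContext): BackOutcome {
  if (ctx.topLayer === "notification") return { kind: "dismissNotification" };
  if (ctx.topLayer === "overlay") return { kind: "closeOverlay" };
  // On the Page layer, first bring focus home, then leave the page on the next press.
  if (ctx.focusedId !== ctx.defaultFocusId) {
    return { kind: "resetFocus", target: ctx.defaultFocusId };
  }
  return { kind: "previousPage" };
}
```

Returning a description of the outcome, rather than performing it directly, keeps the decision easy to test separately from routing and focus side effects.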

There is talk of changing the journey completely.

Performance

TV devices are such a mixed bag: some are quite powerful while others are very limited in resources. Performance optimisation is one of those conversations we have weekly. I won't cover more of it here, but if you are interested, please check out my previous posts on improving CSS performance of Cordova apps on Android TVs and on ways to improve list loading performance.

Development

We didn't want the Native side of things to slow us down when developing the web application. So we made sure that we can spin up the web app in the browser and develop it in (almost) the same way as a standard web application. We use hot module reloading and other common development strategies, so we can implement new features at roughly the same speed.

Once we finish developing a feature, we always have to test it on physical devices. This almost always means generating a new Native Shell app, side-loading the web app inside it, and then testing it on whatever device we are developing against.

Final words

The actual project I am working on has a lot more complexity than what is mentioned above, because we are using a pretty complex micro-frontend architecture and teams in different countries work on different parts of it. But I still enjoyed building TV apps with web technology, and there were some interesting findings along the way which helped me understand more about frontend development in general. If you get an opportunity to work on TV platforms, based on my own experience I'd recommend giving it a try!