Driving Data Solutions for Connected Vehicles — Mike Branch

By Usama Malik

Mike Branch is the Vice President of Data & Analytics at Geotab and leads the charge for developing solutions that enable insight from the over 30 billion telematics records that Geotab processes on a daily basis. He was previously the CEO of Inovex Inc., a software development company he founded in 2003 with expertise in the healthcare and energy sectors.

Mike Branch, Vice President of Data & Analytics at Geotab, speaks about data.geotab.com at GEOTAB CONNECT 2018 (Source: Geotab)

Interviewing a data company like Geotab about connected cars is like opening the inside of a mechanical watch to see how all the moving parts work. The sensor network that needs to be built in a city to enable smart cars requires coordinated effort from a large assembly of smart city specialists. The computer vision technology needed to give smart cars the sensory input to make driving decisions is a field of its own. The computation and processing of data from car sensors and GPS required to give connected cars contextual awareness of their environment is an entire market segment driven by companies like Geotab. Geotab is a private company that develops, manufactures and supplies GPS fleet management solutions. The company specializes in telematics and analytics-driven fleet optimization, and is quickly expanding into the growing fields of connected vehicles and smart cities.

Analytics by Design (ABD) correspondents Usama Malik and Jeremy Fajardo interviewed Mike Branch, Vice President of Data & Analytics at Geotab, to explore the world of connected vehicles and smart cities from Geotab’s point of view. Branch shared his successes in helping to steer the company through near-term issues like scalability and data privacy while navigating Geotab across the difficult terrain of the connected vehicle industry. The ABD team also got an inside look at how data teams at Geotab are organized and managed to build cutting-edge analytics products that achieve ambitious business objectives.


Managing the Ecosystem of the Dataverse

Geotab manages the entire data generation and analysis process, since the company also develops the telemetry devices that relay data directly back to the Geotab system for analysis. Geotab offers data products across the entire fleet management data lifecycle, so clients can use its services and products modularly or in combination. Even if OEMs develop their own telemetry devices or leverage other third-party sensors, data can still enter Geotab’s ecosystem through open APIs for enhanced processing and analysis.

“We position ourselves as an open ecosystem. Our success to date has been through the platform we’ve created that enables our partners and customers to build some truly incredible applications that scale exceptionally well,” Branch said.

Branch believes that “the actual device or dongle used to record telemetry data still has a place in the market since the average age of a vehicle in the US is around 11 years.” Geotab will still have that part of the market. “But there will be parts of the market where we don’t need that device anymore because we get the data directly from the vehicle. So, what we’ve done is developed channels where that data still comes in. Whether from our device, from competitor devices, or from the OEM, data still comes into our ecosystem which we have built up as a great equalizer. It’s about the APIs, it’s about how people interact with the data. It’s not just about the data anymore but everything we have built on top of it.”

The goal of an open ecosystem that taps into different aspects of the data-driven technology stack goes beyond revenue generation. The vision is to create an environment that enables open data sharing and processing, which are critical for connected cars in a smart city environment. Closed data silos not only inhibit this kind of data pooling; the resulting lack of data from a multitude of sources also impedes innovation in, for example, the development of smarter and more robust algorithms.

A use case of open data sharing — cars around the bend immediately become aware of the accident around the corner (Source: Geotab)

“We believe in a partner ecosystem approach,” Branch added. “With all the organizations we partner with, we are experts in scalable data collection. In the connected car, we don’t employ computer vision at Geotab, but we have partners with that expertise and we infuse them into the ecosystem. It won’t just be vehicle on-board diagnostics (OBD) data that’s going to be important for making decisions in a connected vehicle. It’s going to be a combination of all kinds of sensory forces coming together.”

Branch emphasized that Geotab is an engineering company at heart and intends to keep growing as one. Developing this ecosystem is what allowed Geotab to become the company it is today.

“We focus on engineering a rock-solid platform, and our incredible network of partners bring complete solutions to market for our end customers. This model works exceptionally well.”


Driverless Cars: Where the Technology is Now

Branch painted a realistic picture of where autonomous vehicle technology is right now and where it will be in the next few years.

“I don’t think the driverless technology is going to be as ubiquitously available as currently predicted. I see people using figures like X% of vehicles by 2025 will be fully autonomous. I think it’s going to happen more in regions of cities and within more controlled quadrants,” he said.

The key point to note about driverless car technology is how long the ‘tail’ of the technology is. The sensors and onboard computing constitute only one portion of the entire technology stack. Behind these frontend technologies, there exists a sophisticated analytics backend to optimize routes, avoid pedestrians, reduce speed around areas where harsh braking events tend to cluster, and so on. The data-driven computing to optimize routes for driverless cars (and even non-driverless cars) alone presents countless development and monetization opportunities for businesses. That being said, the lack of unified data technology standards for extraction of vehicle data presents a big challenge, and is an area where Geotab believes it can serve as a normalization layer because it has already done this for years.

“Our focus right now is to figure out how we can help in building an ecosystem for connected cars. We are not going to be in control of the vehicles, that’s not our game plan.”

An ecosystem approach is needed to make autonomous vehicles a reality (Source: Geotab)

“Regardless of when the world goes autonomous, Geotab will still have a huge role to play. We see ourselves as a great equalizer,” Branch commented. Geotab products are interoperable and will remain desirable even if there is little headway made towards a single standard. “Whether you are driving a Honda Civic or a Ford F-150, the Geotab system will consider a seat belt buckle / unbuckle event as either ‘on’ or ‘off’ despite the fact that different OEM protocols are running in the background.”
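The "great equalizer" idea in the seat belt example can be illustrated with a minimal Python sketch. The source names (`oem_a`, `oem_b`) and their raw encodings are invented for illustration; the article does not describe any real OEM protocol or Geotab's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical raw encodings, invented for this sketch. Real OEM protocols
# are not described in the article.
RAW_TO_BUCKLED = {
    "oem_a": {0x00: False, 0x01: True},               # made-up: status byte
    "oem_b": {"UNLATCHED": False, "LATCHED": True},   # made-up: status string
}


@dataclass(frozen=True)
class SeatBeltEvent:
    """Normalized event: the same shape regardless of the data source."""
    vehicle_id: str
    buckled: bool


def normalize_seat_belt(vehicle_id, source, raw_value):
    """Map a source-specific raw value to a normalized on/off event."""
    try:
        buckled = RAW_TO_BUCKLED[source][raw_value]
    except KeyError:
        raise ValueError(
            f"unknown source/value: {source!r}/{raw_value!r}"
        ) from None
    return SeatBeltEvent(vehicle_id, buckled)
```

Downstream analytics would then only ever see `SeatBeltEvent.buckled`, whichever vehicle or device produced the reading.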


Smart Cities: Infrastructure for 21st Century Urban Management and Autonomous Vehicles

Geotab is also making headway in the Smart City domain by leveraging its existing expertise in telematics and fleet management optimization. One of the most common themes in smart cities is the deployment of sensor networks across the city so that city managers and urban planners have massive data sets at their disposal. These data sets could be used to build analytics models that help monitor the city in aggregate and plan urban development projects by providing valuable insights into current city dynamics. Geotab helps accomplish this with its array of sensors on fleets driving throughout a city, gathering data across the municipality dynamically. This sensor data helps optimize everything from intersection control to traffic signals.

Geotab CEO, Neil Cawse, spoke about the power of BigQuery in a spotlight session at Google Cloud Next 2018 (Source: Geotab)

“We have a dedicated business division catered to government customers and smart city projects that our data team works very closely with.” Branch talked about a particular initiative in virtual pneumatic tubes that was presented at Google Cloud Next last year. “Traditionally, we help government customers with optimizing their fleet management. But now as a result of having thousands of vehicles driving around their city, we can (in aggregate) help them understand and better plan transportation infrastructure. We have a virtual pneumatic tubes application that measures traffic activity without having to deploy large crews to place physical tubes on the roads. After monitoring and comparing with Oakville’s actual pneumatic tube data, we discovered that if we had 50 vehicles driving in the area over a short period of time, we would reach coverage within the 85th percentile. You don’t always need full coverage to get a representative idea for important metrics that otherwise are collected manually. Applications like this and applications like intersection insights are interesting because they help us model how aggregate driving behaviour and physical intersections fundamentally behave.”

From an infrastructure planning perspective, this information can help identify problematic intersections that the city can put more focus on without the need for additional sensors deployed throughout the city.
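One way to picture a "virtual pneumatic tube" is as a line drawn across a road, with a count kept of how often vehicle GPS tracks cross it. The sketch below is only an illustration of that idea, not Geotab's method. It assumes positions are already projected to planar (x, y) coordinates, and the names `crosses` and `count_crossings` are hypothetical.

```python
def _orient(a, b, c):
    """Signed area test: > 0 if c is left of the line a->b, < 0 if right."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses(p, q, a, b):
    """True if the movement segment p->q strictly crosses the tube a->b."""
    d1 = _orient(a, b, p)
    d2 = _orient(a, b, q)
    d3 = _orient(p, q, a)
    d4 = _orient(p, q, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(tracks, tube_start, tube_end):
    """Count tube crossings over all vehicles.

    tracks maps a vehicle id to its time-ordered list of (x, y) positions.
    """
    total = 0
    for pings in tracks.values():
        # Each pair of consecutive pings is one movement segment.
        for p, q in zip(pings, pings[1:]):
            if crosses(p, q, tube_start, tube_end):
                total += 1
    return total
```

A real deployment would also have to handle GPS noise, sparse sampling, and scaling counts from a fleet sample up to total traffic, none of which this sketch attempts.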

Road quality assessment for NYC (Source: Geotab)

Another use case that Branch shared is the deployment of Geotab sensors on municipal fleets to gather valuable data like cellular coverage areas, air quality and temperature in each city block. Deployment on municipal fleets is very beneficial because fleets cover many different travel patterns and move over the same paths repeatedly, which is important for collecting representative readings. This initiative unlocked a wide array of data sets that were previously unavailable (or unavailable in real time) to city managers.

“Outside of the air quality work we are doing with the Environmental Defense Fund (EDF), we are also working with companies on other projects in this area. Our device is essentially an IoT hub, you can connect any kind of sensor you want to it. Whether it’s an air quality sensor or a temperature sensor, it’s all part of the same ecosystem. This allows you to map out hyperlocal air quality which, according to studies done by the EDF, varies block to block. So, enabling granular understanding at the hyperlocal level is very important.”
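As a rough illustration of block-level mapping, the following Python sketch averages sensor readings per city block. The block identifiers and the reading values are hypothetical, and a simple mean is an assumption made here for brevity; the article does not say how Geotab or the EDF aggregate their measurements.

```python
from collections import defaultdict
from statistics import mean


def average_by_block(readings):
    """Average readings per block.

    readings is an iterable of (block_id, value) pairs, where block_id is
    any hashable label for a city block (hypothetical in this sketch).
    """
    by_block = defaultdict(list)
    for block_id, value in readings:
        by_block[block_id].append(value)
    return {block_id: mean(values) for block_id, values in by_block.items()}
```

Because fleet vehicles pass the same blocks repeatedly, each block accumulates many readings over time, which is what makes a per-block summary meaningful.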

“Essentially, we see this data as the building blocks to map out what’s happening in the real world.”

“We have sensors that tell us whether you are going through a rough patch on the road and alert us on whether there is gravel or a bumpy road ahead. The sensors can even pick up whether an ABS is off in a certain area and can alert other vehicles in the area about it.”

Geotab is also working with intelligent traffic systems companies like Miovision in Waterloo on technology that allows data communication between vehicles and intelligent transportation infrastructure.

“If I’m a freight vehicle coming to an intersection and the light is about to turn yellow, maybe I should extend green for another 2 seconds so I can let the vehicle through. This type of control is useful because stopping and reigniting the heavy duty vehicle incurs costs, wastes fuel, and leads to increased idling and emissions in the city. We are currently exploring ways to optimize Transit Signal Priority (TSP) through initiatives with Miovision.”
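The green-extension decision Branch describes can be sketched as a small rule. This is a simplified illustration, not Miovision's or Geotab's logic: the function name, its parameters, and the constant-speed assumption are all hypothetical, and the 2-second default simply echoes the figure in the quote.

```python
def green_extension_seconds(distance_m, speed_mps, green_remaining_s,
                            max_extension_s=2.0):
    """Return the extra green time to request, or 0.0 if none applies.

    Assumes the vehicle keeps a constant speed until it reaches the
    stop line (a simplification made for this sketch).
    """
    if speed_mps <= 0:
        return 0.0  # stopped or invalid speed: nothing to extend for
    arrival_s = distance_m / speed_mps
    shortfall_s = arrival_s - green_remaining_s
    if shortfall_s <= 0:
        return 0.0  # the vehicle already clears on the current green
    if shortfall_s > max_extension_s:
        return 0.0  # even the maximum extension would not be enough
    return shortfall_s
```

For example, a truck 100 m away at 20 m/s with 4 s of green left arrives 1 s too late, so the rule asks for a 1 s extension; with only 1 s of green left, the 4 s shortfall exceeds the cap and no extension is requested.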


Scaling Data Solutions

There are, however, more near-term problems that Geotab must tackle as a data services company, one of which is scalability and high availability of their solutions. The maturing of cloud technology has made data storage increasingly cheap and more accessible for businesses worldwide. However, for data companies like Geotab, the problem isn’t just storage. It is also speed. When dealing with real time data that needs to be processed at breakneck speeds, the main bottleneck is latency — the time required for data to traverse the system in the cloud. Latency will become the key metric as more analytics applications begin processing their data streams in real time. Companies are spending immense amounts of resources to lower latency and develop capabilities to process data quicker. Data processing for connected vehicles will not be an exception.

Geotab’s GCP infrastructure for scaling data solutions (Source: Geotab)

Geotab foresaw the growing issue of scale and continuously looks for ways to strengthen its infrastructure to withstand increasing data traffic.

“I think the biggest powerplay we have at Geotab is how we can scale all of this. A lot of people can do this for a few thousand vehicles or a few thousand drones. When you start getting up to millions, you reach a whole new level of problems in latency and scalability. We employ a lot of Kubernetes pods to dynamically manage scale in production. Even if we have 2, 3 or 4 million vehicles, we can sustain the load with our powerful infrastructure,” Branch said.

Implementing a scalable infrastructure is only half of the equation; the other half is hiring data engineering talent to operate and evolve the data platform. Over the last few years, Branch has organized a Data Engineering department at Geotab to focus on getting data through the pipeline as quickly as possible. To address problems in latency and real-time data streaming, he believes “it’s all about the ingestion and the pipelining process,” which his Data Engineering team strives to improve relentlessly.


The Art of Developing Data Solutions

How do data companies develop data-driven solutions? The answer may surprise many new entrants and outside observers: Geotab spent years just exploring the data. This is an enormous investment for any company to make. Although it is a slow process, a thorough understanding of the data is essential if a data company wants to provide value to its customers.

“In order to understand the value we can provide our customers, we had to intimately understand our data. We had to develop the acumen and the toolsets before we can develop these solutions. Insight that you provide to one customer might not be the same insight that you should provide to another because to them, it might be of little value,” Branch explained.

The company’s strategy in product development epitomizes a balanced synergy between product managers, data scientists and the end user or client. After exploring the data at hand in great depth, the Data Solutions team is divided into Agile pods that work independently to develop solutions for a specific sector or area of focus, such as smart cities or fleet management. “Without conducting the first few years of exploration, it would not be possible to identify these key areas of focus,” Branch said. Each pod is guided by a product manager and comprises a pro services team and a data enablement team that develop the solution, starting with an MVP (minimum viable product). MVPs are based on self-generated ideas or proposals from customers or partners. Potential product ideas are deliberately shared with customers in order to foster innovation. MVPs also allow early experimentation with new products and solutions in order to mitigate future risk and sunk costs.

“If they have a concept or idea that leverages our ecosystem, that they are committed to taking to market, and has a massive opportunity to scale, we will work alongside them to develop an MVP at no cost. And if we feel there’s something there, we pass it to the solutions team to scale,” Branch said. “The tricky part is deciding which ones we’re going to spend time on and which ones we won’t.”


Opportunities in Privacy and Security

Outside of the traditional product development teams, Geotab has also stood up a new governance and privacy team. This team’s role is to interface with the Geotab privacy committee to ensure that any data leaving the ecosystem meets privacy requirements. This team also instills governance within development teams to ensure regulatory compliance.

A problem most data companies must inevitably deal with in the aftermath of major data breach scandals and growing concerns about proper data management is, of course, data privacy. Geotab is no stranger to oversight from government bodies that regulate data usage, cyber security and AI ethics.

“Just because we release aggregate data doesn’t mean privacy is automatically protected. You also need to ensure that you have consent and mechanisms in place that promote ethical use of data. Therefore, we employ third party companies that run privacy analysis on our aggregate data and verify for anonymity,” Branch said.

Data aggregation techniques used by Privacy Analytics to create privacy-protected datasets (Source: Privacy Analytics)

In talking about data security, the problem doesn’t just lie with building foolproof security for your network and ensuring that no threats can breach your system. It’s also ensuring that the data sets you utilize are aggregated well enough such that any traces of information that could be used to identify individuals are erased. This is particularly important for a company like Geotab that regularly releases aggregate data sets publicly.

“I have seen companies who claim that they are providing anonymous data and can pinpoint a list of where all the harsh braking is happening in a city. But if they do it based on any individual vehicle, it can easily be turned around,” Branch argued. He illustrated his argument with an example: “If I know that every Monday morning someone is slamming their brakes in front of a particular Tim Hortons, if I wanted to, I could position myself there and I could figure out who it is. Just because it’s anonymous, it doesn’t mean you can’t trace confidential information, and in turn, violate privacy. We do a lot of work to aggregate the data in a privacy-protected way.”
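Branch's Tim Hortons example suggests one common safeguard: only publish a location's statistics when enough distinct vehicles contributed to it. The Python sketch below shows that small-cell suppression idea under stated assumptions. The `min_vehicles` threshold and the `cell_id` location labels are hypothetical, and this is not a description of the techniques Geotab or Privacy Analytics actually use. Suppression alone is also not a guarantee of anonymity.

```python
from collections import defaultdict


def aggregate_harsh_braking(events, min_vehicles=5):
    """Count harsh braking events per cell, suppressing sparse cells.

    events is an iterable of (vehicle_id, cell_id) pairs. A cell is only
    returned if at least min_vehicles distinct vehicles contributed to it,
    so a single driver's habit never appears on its own.
    """
    vehicles = defaultdict(set)
    counts = defaultdict(int)
    for vehicle_id, cell_id in events:
        vehicles[cell_id].add(vehicle_id)
        counts[cell_id] += 1
    return {
        cell_id: count
        for cell_id, count in counts.items()
        if len(vehicles[cell_id]) >= min_vehicles
    }
```

In this sketch, a cell where one vehicle brakes hard every Monday is dropped no matter how many events it logged, because the check is on distinct vehicles rather than on event volume.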

On the flip side, regulations, growing oversight and scrutiny of data policies are creating unique business opportunities. Data privacy, aggregation, and ethical use of AI have become huge markets that new companies are starting to exploit. Branch thinks the privacy area will experience a lot of potential market growth.

“You see GDPR in the EU, CCPA in California and you can tell that more and more of these regions are concerned about privacy…and rightly so,” Branch said.

Firms like Privitar and Privacy Analytics are using technology to evaluate data privacy more rigorously. “We see Geotab as a leader in the privacy area too,” Branch commented. “By working with privacy experts and applying these principles into our data strategy and delivery mechanisms, we can innovate in a way that maintains the trust of our incredible customer network.”


About the Author:

Usama Malik is a Data Analyst and current student of the Master’s of Management Analytics program at the Smith School of Business.