This blog post presents a real case from a world-class commercial IoT service provider that uses AWS IoT to run a telemetry data analytics business serving its clients’ diverse, real-time data analysis requirements.
The key challenge the business faced was ingesting telemetry data in different formats into AWS IoT and generating real-time data analytics. Additionally, the solution needed to align with each client’s specific aggregation rules so that end users could receive analytics results with business insights. To solve this, the business used AWS services to build its IoT data analytics solution, combine telemetry data with predefined analytic rules, and use that combination to generate business insights. The solution lets the business adjust telemetry data structures and aggregation rules and generate real-time insights according to the new structures and rules.
In this blog, we walk through a reference architecture and describe how the commercial IoT solution uses AWS IoT Core to ingest telemetry data from devices and other systems and receive analytic rules from clients, and uses Amazon Kinesis to perform telemetry data analytics.
Many enterprises that have registered and monitored their devices and sensors on IoT platforms are seeking business insights from telemetry data analytics. Their use cases range from building management to smart offices, connected vehicles, smart cities, and more; all require real-time analytics based on various data types and analysis policies. This diversity introduces challenges for commercial IoT (CIoT) service providers, who serve many IoT solution suppliers and their clients. CIoT service providers need to ingest both telemetry data and analytic rules so that they can aggregate the data instantly.
The collaboration between IoT solution suppliers and their clients on the platform owned by CIoT service providers is shown in Figure 1.
Figure 1: CIoT service provider, IoT solution suppliers, and clients
1) The IoT solution suppliers onboard their IoT solutions and devices to the platform in different ways and then offer specific services to their clients. Those solutions and devices generate a large quantity of telemetry data of specific types. The platform must support all data types and data sources from the suppliers, and must process and aggregate the data in real time.
2) The client runs their business on the solution and devices offered by the IoT solution supplier and needs data analytics from multiple points of view to gain valuable business insights from the solution. The client needs to define analytic rules based on the telemetry data structure and the supplier’s solution so that the platform can deliver analytic results according to those rules.
3) When the data is analyzed, the CIoT service provider must ensure the platform matches the correct data with the correct clients. For example, if a client uses a supplier’s smart building solution on the CIoT service provider’s platform, the platform must pick up that specific client’s building data and analyze it according to the rules for those specific buildings. Without this, the analytics will make no sense to the client, and might even cause negative consequences.
The CIoT service provider requires a data ingestion and analytics solution running on its CIoT platform to orchestrate rules and data aggregation from multiple third-party IoT solutions. The solution in this blog post supports these requirements by: 1) receiving telemetry data ingested from different types of data sources, 2) dynamically combining telemetry data and predefined analytic rules, and 3) preprocessing telemetry data and performing real-time data aggregation.
The solution helps the CIoT service provider offer three key benefits to its suppliers:
1. The suppliers can connect their devices to the CIoT platform through AWS IoT Core. Those devices directly register in AWS IoT Core and send telemetry data to topics of AWS IoT Core.
2. The suppliers can run their own IoT solutions on AWS, and leverage any approach such as AWS IoT Core to accept telemetry data sent by their devices. The suppliers can perform data filtering and cleaning before transmitting the data to the CIoT platform through Amazon EventBridge.
3. The suppliers can operate their IoT solutions on their preferred cloud providers or in on-premises data centers, and manage their devices on their own. They only need to submit the telemetry data to the CIoT platform to use its data analytics functionality.
Commercial IoT platform for telemetry data ingestion and analytics
As shown in the box framed by the black dotted line in Figure 2, the telemetry data from the devices or the suppliers’ solutions is received by AWS IoT Core, Amazon EventBridge, Amazon Kinesis, or Amazon Simple Queue Service (Amazon SQS). The AWS Lambda functions behind those services preprocess the telemetry data for analysis and publish the processed data into Amazon Kinesis Data Streams. These data streams are the entry points for the telemetry data to be analyzed.
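The preprocessing step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the provider’s actual code: the common record shape, the incoming field names (tenant, floor, sensor, reading, ts), and the function names normalize and publish are all hypothetical. The put_record argument stands in for the put_record method of a boto3 Kinesis client, so the sketch runs without AWS credentials.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical common record shape. The first three names mirror the grouping
# attributes described in this post; the exact schema is an assumption.
REQUIRED_FIELDS = ("tenantId", "sourceId", "streamName", "value", "timestamp")


def normalize(raw: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Map a source-specific payload onto the common record shape."""
    if source == "iot_core":
        # Assumed device payload: flat JSON published to an AWS IoT Core topic.
        return {
            "tenantId": raw["tenant"],
            "sourceId": raw["floor"],
            "streamName": raw["sensor"],
            "value": float(raw["reading"]),
            "timestamp": int(raw["ts"]),
        }
    if source == "eventbridge":
        # EventBridge events carry the custom payload under the "detail" key.
        detail = raw["detail"]
        record = {field: detail[field] for field in REQUIRED_FIELDS}
        record["value"] = float(record["value"])
        record["timestamp"] = int(record["timestamp"])
        return record
    raise ValueError(f"unsupported source: {source}")


def publish(record: Dict[str, Any], stream_name: str,
            put_record: Callable[..., Any]) -> None:
    """Send one normalized record to the telemetry data stream.

    put_record is expected to behave like boto3's Kinesis client method of
    the same name (keyword arguments StreamName, Data, PartitionKey).
    """
    # Using the grouping attributes as the partition key keeps records of one
    # group on the same shard (a design choice of this sketch).
    partition_key = "|".join(
        str(record[name]) for name in ("tenantId", "sourceId", "streamName")
    )
    put_record(
        StreamName=stream_name,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=partition_key,
    )
```

In a Lambda function, publish would typically be called with boto3.client("kinesis").put_record; passing the callable in also makes the function easy to exercise locally with a stub.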
As shown in the box framed by the blue dotted line, the clients of the IoT solution suppliers define the analytic rules through APIs powered by Amazon API Gateway and AWS Lambda, and the rules are stored in Amazon DynamoDB tables. A Lambda function, triggered on a schedule by an Amazon EventBridge rule, periodically publishes those rules into Amazon Kinesis Data Streams. These data streams are the entry points for the analytic rules used in data aggregation.
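A rough sketch of the periodic rule publisher is shown here. It assumes the rules have already been read from the DynamoDB table (for example with a table scan) and that each rule has a tenantId attribute; the function names and the windowSeconds attribute are hypothetical. The put_records argument stands in for the put_records method of a boto3 Kinesis client.

```python
import json
from decimal import Decimal
from typing import Any, Callable, Dict, Iterable

MAX_BATCH = 500  # Kinesis PutRecords accepts at most 500 records per request.


def _to_json_number(obj: Any) -> Any:
    """Convert DynamoDB numbers, which boto3 returns as Decimal, for JSON."""
    if isinstance(obj, Decimal):
        return int(obj) if obj == obj.to_integral_value() else float(obj)
    raise TypeError(f"cannot serialize {type(obj).__name__}")


def publish_rules(rules: Iterable[Dict[str, Any]], stream_name: str,
                  put_records: Callable[..., Any]) -> int:
    """Publish every rule to the rules data stream and return the count."""
    entries = [
        {
            "Data": json.dumps(rule, default=_to_json_number).encode("utf-8"),
            # Partitioning by tenant is an assumption of this sketch.
            "PartitionKey": str(rule["tenantId"]),
        }
        for rule in rules
    ]
    # Send the entries in batches that respect the PutRecords limit.
    for start in range(0, len(entries), MAX_BATCH):
        put_records(StreamName=stream_name,
                    Records=entries[start:start + MAX_BATCH])
    return len(entries)
```

A production version would also inspect the FailedRecordCount field of each PutRecords response and retry the failed entries; that handling is omitted to keep the sketch short.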
In the box framed by the orange dotted line, Amazon Kinesis Data Analytics, the analysis engine of the CIoT platform, consumes telemetry data and aggregation rules from the data streams and uses the rules to aggregate the data. After the aggregation, the results are pushed into the data streams for aggregation results. A Lambda function validates the format of the results and detects abnormalities such as empty or out-of-range values. When it finds an error, the Lambda function invokes Amazon Simple Notification Service (Amazon SNS) to notify the analytics operators that there might be issues in the data, the rules, or their combination. Amazon Kinesis Data Firehose loads the telemetry data from Amazon Kinesis Data Streams and stores it in Amazon Simple Storage Service (Amazon S3) for future analytics (for example, analysis by year).
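The result check performed by the validating Lambda function might look roughly like the sketch below. The result fields, the VALID_RANGES table, and the function names are assumptions made for illustration; real limits would come from the supplier’s solution. The publish argument stands in for the publish method of a boto3 SNS client.

```python
import json
import math
from typing import Any, Callable, Dict, List

# Hypothetical plausible ranges per data type (percent and degrees Celsius).
VALID_RANGES = {"humidity": (0.0, 100.0), "temperature": (-50.0, 80.0)}


def find_problems(result: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in one result."""
    problems = []
    # Empty-value check on the fields this sketch assumes every result has.
    for field in ("tenantId", "sourceId", "streamName", "value"):
        if result.get(field) in (None, ""):
            problems.append(f"missing {field}")
    value = result.get("value")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        bounds = VALID_RANGES.get(result.get("streamName"))
        if math.isnan(value):
            problems.append("value is NaN")
        elif bounds and not bounds[0] <= value <= bounds[1]:
            problems.append(
                f"value {value} out of range [{bounds[0]}, {bounds[1]}]"
            )
    elif value not in (None, ""):
        problems.append("value is not numeric")
    return problems


def check_and_notify(result: Dict[str, Any], publish: Callable[..., Any],
                     topic_arn: str) -> bool:
    """Validate a result; notify operators through SNS if it looks wrong."""
    problems = find_problems(result)
    if problems:
        publish(
            TopicArn=topic_arn,
            Subject="Aggregation result check failed",
            Message=json.dumps({"result": result, "problems": problems}),
        )
    return not problems
```

Keeping find_problems free of AWS calls means the checks can be unit tested on their own, while check_and_notify only adds the notification side effect.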
Figure 2: Data analytics solution architecture on CIoT platform
Flexible data aggregation on the CIoT platform
When the rules are published to the data stream used for aggregation rules, Amazon Kinesis Data Analytics broadcasts them to all the downstream tasks, and the aggregation running on those tasks retrieves the rules locally and follows them to accumulate and compute the telemetry data. For example, the rule below defines the data aggregation method for a smart building solution. The Lambda function produces the rules and invokes the APIs to write them to the data streams. The attributes tenantId, sourceId, and streamName are used to group telemetry data: only telemetry data with the same tenantId, sourceId, and streamName values is put into the same group. A tenant is a client of the smart building solution, such as a hotel owner. The sourceId is the floor number in a certain hotel building, and streamName identifies the environment data type, such as humidity or temperature.
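To make the grouping concrete, here is a hypothetical rule and a plain-Python sketch of how telemetry could be grouped and matched to rules. The attribute names tenantId, sourceId, streamName, accumulatorAttribute, and aggregationFunction come from this post; the values, the windowSeconds attribute name, and the helper functions are assumptions for illustration and are not the rule format used on the platform.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

GROUP_ATTRIBUTES = ("tenantId", "sourceId", "streamName")

# Hypothetical rule: average humidity on one floor over a 60-second window.
example_rule = {
    "tenantId": "hotel-owner-1",
    "sourceId": "building-1-floor-1",
    "streamName": "humidity",
    "accumulatorAttribute": "value",
    "aggregationFunction": "avg",
    "windowSeconds": 60,  # assumed attribute name for the window size
}


def group_key(item: Dict[str, Any]) -> Tuple[Any, ...]:
    """Build the grouping key shared by rules and telemetry records."""
    return tuple(item[name] for name in GROUP_ATTRIBUTES)


def group_records(
    records: Iterable[Dict[str, Any]],
) -> Dict[Tuple[Any, ...], List[Dict[str, Any]]]:
    """Put records with the same tenantId, sourceId, and streamName together."""
    groups = defaultdict(list)
    for record in records:
        groups[group_key(record)].append(record)
    return dict(groups)


def rules_for(record: Dict[str, Any],
              rules: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Select the locally held rules that apply to one telemetry record."""
    return [rule for rule in rules if group_key(rule) == group_key(record)]
```

Because a rule and a record share the same key, each task can look up the applicable rules locally once the rules have been broadcast to it, which is the behavior the paragraph above describes.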
As shown in Figure 3, after grouping the telemetry data, Amazon Kinesis Data Analytics uses a time window to accumulate telemetry data. The size of the time window is defined in the rule. In this example, we use 60-second and 180-second tumbling windows. Amazon Kinesis Data Analytics also supports sliding windows. For each telemetry data group, Amazon Kinesis Data Analytics maintains two tumbling windows that separately accumulate data every 60 seconds and every 180 seconds. Once the timer for a window starts, Amazon Kinesis Data Analytics caches telemetry data until the timer expires. When the timer expires, Amazon Kinesis Data Analytics computes the cached data, and the window tumbles: the old data is cleared and the window starts caching new data. In this way, Amazon Kinesis Data Analytics collects the accumulatorAttribute values of the telemetry data within a certain time range and computes them with the function assigned in aggregationFunction, such as the average or the maximum of the values. With data accumulation and computing, Amazon Kinesis Data Analytics completes the data aggregation and publishes the results into the data streams for analytic result output.
As seen in the example in Figure 3:
The average humidity on the 1st floor of building #1 is output per minute. The maximum humidity of the 1st floor of building #1 is output every 3 minutes.
The average temperature of the 1st floor of building #1 is output per minute. The maximum temperature of the 1st floor of building #1 is output every 3 minutes.
The average temperature of the 18th floor of building #8 is output per minute. The maximum temperature of the 18th floor of building #8 is output every 3 minutes.
Figure 3: Telemetry data aggregation according to predefined rules in Amazon Kinesis Data Analytics
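The tumbling-window aggregation illustrated in Figure 3 can be simulated in plain Python as a rough sketch. It is not the Amazon Kinesis Data Analytics application itself. For determinism it assigns records to windows by a timestamp field (in seconds) instead of using timers, and the rule attribute windowSeconds, the record fields, and the function name are assumptions.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, Iterable, List

# Aggregation functions a rule may name (an assumed set for this sketch).
AGGREGATIONS = {"avg": mean, "max": max, "min": min, "sum": sum}
GROUP_ATTRIBUTES = ("tenantId", "sourceId", "streamName")


def tumbling_aggregate(records: Iterable[Dict[str, Any]],
                       rule: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Aggregate the records of the rule's group in fixed, non-overlapping windows."""
    size = rule["windowSeconds"]
    attribute = rule["accumulatorAttribute"]
    func = AGGREGATIONS[rule["aggregationFunction"]]

    windows = defaultdict(list)
    for record in records:
        # Skip telemetry that belongs to a different group than the rule.
        if any(record[name] != rule[name] for name in GROUP_ATTRIBUTES):
            continue
        # Every timestamp falls into exactly one window: [start, start + size).
        window_start = (record["timestamp"] // size) * size
        windows[window_start].append(record[attribute])

    return [
        {"windowStart": start, "windowEnd": start + size, "result": func(values)}
        for start, values in sorted(windows.items())
    ]
```

Running the same records through one rule with a 60-second window and the avg function and another rule with a 180-second window and the max function reproduces the two outputs per group shown in the figure: a per-minute average and a maximum every three minutes.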
With the data analytics solution introduced in this blog, there is no need to build a dedicated analytics function for each IoT solution on the CIoT platform. Instead, clients simply submit analytic rules that dynamically control data aggregation, and in doing so gain real-time insights specific to their business. IoT solution suppliers and CIoT platform owners no longer have to operate a large number of solution-specific data analytics modules, freeing them to focus on developing analytic rules that deliver deeper business insights.
About the author
Shi Yin is a senior IoT consultant at AWS Professional Services, based in California. Shi has worked with many enterprise customers to build IoT solutions and platforms with AWS IoT services, including Smart Home, Connected Vehicles, Commercial IoT, and Industrial IoT.