The Integrity of Building Analytics

Levi Epperson

Analytics for building systems, regardless of whether the services are cloud-based or on-site solutions, should be accompanied by constant observation of the data and the analytics themselves.


Building owners are increasingly aware of and interested in analytics for their building systems and equipment. Many vendors offer analytics, both cloud-based and on-site; each approach has merit, depending on the client's needs. But whether a solution is on-site or cloud-based, building owners should fully understand how each vendor maintains the integrity of the data and the analytics, so that issues affecting either can be addressed quickly; otherwise, expensive software produces no results, and consequently no value.

In our experience, both data and analytics can go awry for a number of reasons, so an important part of our operation as a company is to continuously monitor all facets of your data and analytics results, ensuring that quality data is stored and that analytics are running and producing consistent, accurate results. Issues can also be caused by the analytics provider, such as faulty mapping of the data or a processing failure of a specific analytic, making it even more important that data storage and analytics are consistently monitored.

Starting with the data, numerous issues can occur that will trigger an alert in one or more of the three data storage metrics that Ezenics tracks: last record, frequency, and data errors. Last record shows the last time data was stored, and thresholds are in place to alert when the period between records grows too long. Frequency measures how often we receive data for each data point; the ideal frequency is one record per data point per minute. Similarly, thresholds ensure the ideal frequency, with some tolerance, is being met. If either of these metrics exceeds its threshold, the quality of the analytics will decrease because of missing data and/or low resolution.
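To make the last record and frequency checks concrete, here is a minimal sketch in Python. The threshold values (`MAX_GAP`, `TOLERANCE`) are illustrative assumptions; the article does not publish Ezenics' actual alert thresholds.

```python
from datetime import datetime, timedelta

# Illustrative thresholds -- actual values would be tuned per deployment.
MAX_GAP = timedelta(minutes=15)        # alert if no record within this window
IDEAL_INTERVAL = timedelta(minutes=1)  # one record per data point per minute
TOLERANCE = 1.5                        # allow up to 1.5x the ideal interval

def check_last_record(last_timestamp: datetime, now: datetime) -> bool:
    """Return True if the gap since the last stored record is acceptable."""
    return (now - last_timestamp) <= MAX_GAP

def check_frequency(timestamps: list[datetime]) -> bool:
    """Return True if the average interval between records is within tolerance."""
    if len(timestamps) < 2:
        return False
    span = timestamps[-1] - timestamps[0]
    avg_interval = span / (len(timestamps) - 1)
    return avg_interval <= IDEAL_INTERVAL * TOLERANCE
```

Either check returning False would raise an alert for the affected data point.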

Data errors, the third data storage metric, indicate a variety of problems with the data. Ezenics uses many different error codes to help identify the source of an issue. For example, timeout errors can indicate a connection problem, and an invalid data type error can indicate that the expected and actual point types do not match, such as receiving a text value where a numerical value is expected. This error usually indicates a data mapping issue.

The processes and results from an actual client best illustrate why incoming data must be monitored continually. Client A has hundreds of retail locations across the United States, with several types of equipment (HVAC, lighting, refrigeration, and power meters) monitored in each location. For simplicity, only the HVAC equipment is discussed in this article.

Because Client A has thousands of HVAC assets to monitor, issues are easy to miss: their effect on the overall last record, frequency, and data error metrics can be masked by the large number of machines. To account for these seemingly minimal effects, a different strategy is used for data errors. A report was created to show the overall percentage of data errors at each of Client A's locations. The report lists the locations with the highest percentage of errors, which are then investigated one by one.
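The core of such a report is a per-location error percentage, sorted highest first. A minimal sketch, assuming records arrive as (location, is_error) pairs:

```python
from collections import Counter

def error_report(records: list[tuple[str, bool]], top_n: int = 5) -> list[tuple[str, float]]:
    """records: (location_id, is_error) pairs for stored data points.
    Returns the top_n locations sorted by error percentage, highest first."""
    totals, errors = Counter(), Counter()
    for loc, is_error in records:
        totals[loc] += 1
        if is_error:
            errors[loc] += 1
    pct = {loc: 100.0 * errors[loc] / totals[loc] for loc in totals}
    return sorted(pct.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Ranking by percentage rather than raw count keeps a small site with a severe problem from being drowned out by larger, healthier sites.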

In the investigation phase, the building management system (BMS) is a common starting place. The most frequent cause of data errors is a controller communication issue. These communication issues can affect an entire location, or only a few machines that have been temporarily disabled. It is also common to have noise on the line, which causes reliable data to be stored only intermittently: the HVAC equipment drops offline momentarily and then returns, producing data with data errors mixed in.

Another cause is an update to the device addressing or networking information. If the connection details of some or all of the devices change but the data collection process is not updated, data errors occur. A similar cause is an update to the function block of a controller. Function block changes can affect data storage in different ways: sometimes all stored data points on the controller go into error; other times only some do, while the remaining data points store values different from what they were storing previously. Commonly with Client A, a change of the latter type results in setpoints storing values well outside the expected range of a setpoint (a separate set of Ezenics algorithms identifies valid data that falls outside an expected range).
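The out-of-range check mentioned in passing could look something like the following sketch. The point names and range limits are hypothetical assumptions for illustration; real limits would depend on the point, the equipment, and the climate.

```python
# Hypothetical expected ranges (degrees Fahrenheit) for illustration only.
EXPECTED_RANGES = {
    "cooling_setpoint_F": (65.0, 85.0),
    "heating_setpoint_F": (55.0, 75.0),
}

def is_out_of_range(point: str, value: float) -> bool:
    """Flag values that parse as valid data but fall outside the
    expected range for the point -- e.g. a setpoint storing a raw
    register value after a function block change."""
    lo, hi = EXPECTED_RANGES[point]
    return not (lo <= value <= hi)
```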

Finally, another situation that can occur in a portfolio of hundreds of locations is machine replacement. Communication is key when machines are replaced, and when there are thousands of HVAC assets, this reporting method helps minimize any unexpected downtime.

Over a nine-month period, this reporting method caught 332 instances of high-error-percentage locations for Client A, and the investigation results for all 332 instances were shared with Client A. Active communication about the issues and their causes enables both sides to find, remedy, and prevent issues, so they occur less frequently in the future. The goal is to continue lowering the error percentage threshold, since solving issues leads to faster resolution and more reliable analytics.

Ezenics also oversees the analytics deployed to its clients' equipment. Several measures are regularly examined to ensure that analytics constantly process and provide consistent, accurate results for our clients. The first measure is the delay of the implemented algorithms. Because equipment data is always being stored along with timestamps, the algorithms must similarly always be processing the incoming data. Monitoring the algorithm delay allows for a quick response when the delay rises.
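One simple way to express this delay measure: compare the newest stored data timestamp against the newest timestamp the algorithm has processed. The alert threshold here is an illustrative assumption, not a published Ezenics value.

```python
from datetime import datetime, timedelta

# Illustrative threshold; acceptable delay depends on the algorithm.
MAX_DELAY = timedelta(minutes=30)

def algorithm_delay(latest_data: datetime, latest_processed: datetime) -> timedelta:
    """How far processing lags behind the newest stored data."""
    return latest_data - latest_processed

def delay_alert(latest_data: datetime, latest_processed: datetime) -> bool:
    """True if the algorithm has fallen unacceptably far behind."""
    return algorithm_delay(latest_data, latest_processed) > MAX_DELAY
```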

The second measure for analytics health is errors. Algorithm errors are comparable to storage errors in that many different error codes are used to identify the issue and indicate its root cause. The main causes of formula-processing issues are server or database issues, data storage issues, and issues with formula dependencies. Each of the three main causes has several sub-error codes that specify where the issue originated, enabling faster remediation.

The number of results generated by the algorithms is the third measure utilized in maintaining analytic integrity. We record the number of results generated by the algorithms for each location, factor in the client’s typical issue resolution pace, and then compare the current number of results to a running average of previous days. If there are significantly fewer results than expected or significantly more results, the location is investigated to resolve the issue or verify the additional results are valid.
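A minimal sketch of that comparison, assuming a simple running average and a symmetric deviation factor (both illustrative choices; the article does not specify how the expected range is computed):

```python
def result_count_anomaly(today: int, history: list[int], factor: float = 2.0) -> bool:
    """Flag a location when today's result count deviates from the
    running average of previous days by more than `factor` in either
    direction (significantly more or significantly fewer results)."""
    if not history:
        return False  # no baseline yet
    avg = sum(history) / len(history)
    if avg == 0:
        return today > 0
    return today > avg * factor or today < avg / factor
```

A flagged location is then investigated, either to resolve the underlying issue or to verify that the additional results are valid.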

Building owners must be aware of the many ways in which issues occur. Analytics for building systems, whether cloud-based or on-site, should be accompanied by constant observation of the data and the analytics themselves. Problems can arise at any time and in any piece of the technology, on both the client and provider sides. Whether the building owner has hundreds of locations across the country or a single office building, analytics of every shape and form must be maintained at a high level of operating integrity for optimal value to be delivered.