The accessibility and portability of the Internet have changed more than the way humans communicate. With the availability of inexpensive network cards and high-coverage networks, machines are gradually catching up to human usage of the Internet. This paradigm shift, from human to machine domination of Internet traffic, is termed the Internet of Things (IoT).
The rise of the Internet of Things is often characterized by the recent increase in “Smart” products, from “Smart” TVs to “Smart” refrigerators to “Smart” light bulbs. However, the term also covers more complicated machine-to-machine communication, such as the data exchanged between large-scale systems, transit being one example. Because there is now a sensor, or more likely an array of sensors, in almost every device, the Internet of Things can generate a vast amount of data at an incredible rate.
The near-limitless supply of data allows for new insights and better decisions with each additional device. While most companies are successful at capturing data produced by the Internet of Things, converting that data into meaningful, actionable information has proven to be a far greater challenge. Here we look at how companies can better manage and analyze data produced by the Internet of Things.
Main Challenge: Varieties of Devices
The five V’s (Volume, Velocity, Variety, Veracity, and Value) represent the challenges of dealing with big data. According to Bodkin, of the five V’s, the most important one to consider when dealing with the Internet of Things is variety.
The very nature of the Internet of Things is the combination of a range of devices producing and sharing various types of data. When considering such a range of devices, a fixed schema cannot be assumed. Even between two devices from the same manufacturer, the same format and the same data types cannot be assumed. Furthermore, the growing nature of the Internet of Things means that even the most extensive fixed schema could be rendered obsolete by the addition of another device.
This is the classic problem that NoSQL databases seek to solve. NoSQL databases provide a schemaless container that stores data regardless of structure or type. To support analytics of any sort, however, data will eventually need some structure. NoSQL provides the concept of “schema on read,” which delays the application of a schema until the data is being used by an application.
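As a minimal sketch of schema on read, assume a plain Python list standing in for a NoSQL store and a few made-up device payloads (the device names and fields are hypothetical). The raw records are stored untouched, and a temperature schema is applied only when an application reads them:

```python
import json

# Hypothetical raw payloads; each device reports in its own shape.
raw_store = [
    json.dumps({"device": "thermostat-1", "temp_c": 21.5}),
    json.dumps({"device": "thermostat-2",
                "temperature": {"value": 68.0, "unit": "F"}}),
    json.dumps({"device": "bulb-7", "on": True, "lumens": 800}),
]


def read_temperature_c(raw):
    """Apply a temperature schema at read time; return None if it does not fit."""
    doc = json.loads(raw)
    if "temp_c" in doc:
        return {"device": doc["device"], "temp_c": float(doc["temp_c"])}
    reading = doc.get("temperature")
    if isinstance(reading, dict) and reading.get("unit") == "F":
        # Normalize Fahrenheit readings to Celsius while reading.
        celsius = (reading["value"] - 32) * 5 / 9
        return {"device": doc["device"], "temp_c": round(celsius, 1)}
    return None  # record carries no temperature; skip it for this reader


# Only records that fit the reader's schema survive; the store is unchanged.
temperatures = [row for row in map(read_temperature_c, raw_store)
                if row is not None]
```

The light bulb record stays in the store even though this reader ignores it, so a different application could still use it later.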
This lazy schema has huge implications for both application design and database design. Applications can all share a common “data pool” or “data lake” and apply their own schema to the data, independent of the requirements of other applications. Perhaps even more importantly, it allows the schema to evolve with the data without remodeling.
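To illustrate the shared data pool, the sketch below assumes two hypothetical applications (an alerting service and a device inventory) reading the same raw documents. Every name and field is invented for the example. Each application projects only the fields it needs, and a newer device that adds a field does not force earlier records to be remodeled:

```python
# A shared "data lake": raw documents kept exactly as the devices sent them.
data_lake = [
    {"device": "fridge-1", "temp_c": 4.0, "door_open": False, "ts": 100},
    {"device": "fridge-1", "temp_c": 9.5, "door_open": True, "ts": 160},
    # A newer device adds a "firmware" field; older records are left alone.
    {"device": "fridge-2", "temp_c": 3.5, "door_open": False, "ts": 170,
     "firmware": "2.1"},
]


def alerting_view(doc):
    """Schema applied by the hypothetical alerting application."""
    return {"device": doc["device"], "too_warm": doc["temp_c"] > 8.0}


def inventory_view(doc):
    """Schema applied by the hypothetical inventory application."""
    return {"device": doc["device"],
            "firmware": doc.get("firmware", "unknown")}


# Both applications read the same pool but see different shapes.
alerts = [alerting_view(d) for d in data_lake]
inventory = {v["device"]: v["firmware"]
             for v in map(inventory_view, data_lake)}
```

Neither view knows about the other, which is the independence the paragraph above describes.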
Logical and Physical Changes on Modeling
Logical modeling no longer happens only in the database, but often at the application level. While NoSQL is capable of storing basic data types such as strings and numbers, it really starts to shine when storing highly specialized data such as graph data, location data, and complex objects. There are also hybrid models in which NoSQL databases allow for up-front schema definition (schema on write) while still supporting schema on read.
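A hybrid model could look roughly like the following sketch. It assumes a small set of required fields is checked when a document is written, while everything else stays free-form and is interpreted on read. The `REQUIRED_FIELDS` mapping, the `write` and `read_locations` helpers, and the sample documents are all hypothetical, not the API of any particular database:

```python
# Hypothetical up-front schema: only these fields are enforced on write.
REQUIRED_FIELDS = {"device": str, "ts": int}

store = []


def write(doc):
    """Schema on write: reject documents missing the required core fields."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(doc.get(name), expected_type):
            raise ValueError(
                f"field {name!r} must be of type {expected_type.__name__}")
    store.append(doc)


def read_locations():
    """Schema on read: interpret the optional, free-form location object."""
    return [
        (d["device"], d["location"]["lat"], d["location"]["lon"])
        for d in store
        if isinstance(d.get("location"), dict)
    ]


write({"device": "bus-12", "ts": 1000,
       "location": {"lat": 47.61, "lon": -122.33}})
write({"device": "meter-3", "ts": 1001, "kwh": 12.4})

try:
    write({"device": "bus-13"})  # no timestamp, so the write is refused
    rejected = False
except ValueError:
    rejected = True

locations = read_locations()
```

The core fields give every application a guaranteed minimum to rely on, while device-specific data such as the location object keeps the flexibility of schema on read.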
Modeling in the physical layer has also changed. The database layer is no longer powered by expensive, gargantuan, high-performance servers. With most NoSQL implementations opting for scalable clusters of commodity machines, storage is rather cheap. However, these benefits come at the cost of expensive index lookups, eventual consistency, and reconciliation demands, all of which must be considered when modeling for the Internet of Things.
Solid Foundation of Data Management Practice Needed
The Internet of Things is all about blending different types of data from different places. While the tools surrounding IoT have evolved to become increasingly flexible and adaptable, data modeling is still important. The ability to perform effective analytics stems from a solid foundation of data management practices.