Oct 3, 2024 / Engineering
The Trifecta of Successful Data Management
Author: Joel Christner
Businesses face a deluge of data – present in different formats, stored in many kinds of repositories, and spread across internal and external silos. The concept of data management is not new – rather, it has been relevant since the moment electronic information was first written to persistent storage (punch cards and cathode ray tubes, anyone?).
Today, data management refers to every function used to provide cradle-to-grave service for data. This broad topic covers multiple aspects, but it primarily focuses on understanding data location (both current and ideal), identifying data attributes and features (including content and authorship), determining appropriate data protection measures (such as backup, recovery, replication, and governance), and establishing access protocols (specifying who can access the data, when, and under what circumstances).
With data being the shoulders upon which analytics, machine learning, and artificial intelligence stand, it is clear that a solid data management architecture makes deploying these technologies (and extracting a return on your investment) much more productive. The challenge most businesses encounter is the highly fragmented nature of the data management market. This fragmentation often leads to a situation where a company discovers a specific tool that is a perfect choice for one particular need. However, by selecting this specialized solution, they inadvertently restrict their options for complementary tools in related areas. This limitation makes it harder to build a cohesive and integrated data management ecosystem.
Our goal at View is to accelerate our customers’ journey toward an AI-powered future by unlocking the value of their data assets and enabling their use in AI experiences. To support this, we have integrated core data management capabilities into the platform to help answer questions such as:
What data do I have?
Where does my data reside?
What data has attributes A, B, and C?
We’ve built what we’ve termed the trifecta of data management – that is, three pillars that provide a foundation that makes consumption of data easy.
The first is metadata. Through our patent-pending Universal Data Representation (or UDR) and our semantic search catalog (called Lexi), we are able to create a homogeneous representation of heterogeneous data from heterogeneous data sources. Once that data is ingested, you can use powerful query capabilities to find the data assets related to your data tasks and AI experiences.
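To make the idea concrete, here is a minimal sketch of what a uniform metadata record and an attribute query might look like. Everything in it is an assumption for illustration: `AssetRecord`, `find_assets`, the field names, and the sample data are hypothetical and are not the UDR schema or the Lexi API.

```python
from dataclasses import dataclass, field


@dataclass
class AssetRecord:
    """Hypothetical uniform metadata record; not the actual UDR schema."""
    key: str
    source: str        # where the asset resides
    content_type: str
    attributes: dict = field(default_factory=dict)


def find_assets(catalog, **required):
    """Return records whose attributes match every required key/value pair."""
    return [
        rec for rec in catalog
        if all(rec.attributes.get(k) == v for k, v in required.items())
    ]


# Assets from different kinds of sources, described in one common shape.
catalog = [
    AssetRecord("q3-report.pdf", "s3://finance", "application/pdf",
                {"owner": "alice", "department": "finance", "year": 2024}),
    AssetRecord("roadmap.md", "sharepoint://product", "text/markdown",
                {"owner": "bob", "department": "product", "year": 2024}),
    AssetRecord("budget.csv", "nfs://shared/finance", "text/csv",
                {"owner": "alice", "department": "finance", "year": 2023}),
]

# "What data has attributes A, B, and C?"
matches = find_assets(catalog, owner="alice", department="finance", year=2024)
print([rec.key for rec in matches])
```

Because every asset is described the same way regardless of where it lives, the same query answers both "what data do I have?" and "where does it reside?" by reading `key` and `source` from the matches.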
The second is graph. We’ve integrated emission of metadata to graph within our processing pipelines to unlock the value that comes from understanding the relationships among data assets, such as data types, attributes, metadata, ownership, and other useful properties.
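As a rough illustration of emitting metadata to a graph, the sketch below links each asset to a node per property and then finds assets that share a property. The record fields, the `emit_to_graph` and `related_assets` helpers, and the in-memory adjacency map are all assumptions made for this example; they do not describe View's pipeline or graph store.

```python
from collections import defaultdict

# Hypothetical metadata records; field names are illustrative only.
records = [
    {"key": "q3-report.pdf", "owner": "alice", "type": "pdf"},
    {"key": "budget.csv", "owner": "alice", "type": "csv"},
    {"key": "roadmap.md", "owner": "bob", "type": "markdown"},
]

# Undirected graph stored as an adjacency map: node -> set of neighbor nodes.
graph = defaultdict(set)


def emit_to_graph(record):
    """Link an asset node to one node per metadata property."""
    asset = ("asset", record["key"])
    for prop in ("owner", "type"):
        node = (prop, record[prop])
        graph[asset].add(node)
        graph[node].add(asset)


for record in records:
    emit_to_graph(record)


def related_assets(key):
    """Assets that share at least one property node with the given asset."""
    asset = ("asset", key)
    related = set()
    for prop_node in graph[asset]:
        related |= graph[prop_node]
    related.discard(asset)
    return sorted(name for _, name in related)


print(related_assets("q3-report.pdf"))
```

Here the two assets owned by the same person become neighbors through the shared owner node, which is the kind of relationship a flat metadata listing does not surface directly.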
The third is vector. The final stage of our processing pipeline is to take relevant data assets and generate embeddings according to specifications you provide – put simply, which model, what parameters and configuration, and which vector store. Once embeddings are generated, you can immediately begin chatting with your data, or use that data to power an AI experience that you’ve created using retrieval-augmented generation (RAG).
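The retrieval step behind RAG can be sketched in a few lines. This is a toy under stated assumptions: `embed` is a word-count stand-in for a real embedding model, `store` is a plain list standing in for a vector store, and the prompt format is invented for the example. None of it reflects View's actual models or configuration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


documents = [
    "quarterly revenue grew in the finance report",
    "the product roadmap lists planned features",
    "backup and recovery policy for storage systems",
]

# In-memory "vector store": (embedding, original text) pairs.
store = [(embed(doc), doc) for doc in documents]


def retrieve(question, k=1):
    """Return the k stored texts most similar to the question."""
    query = embed(question)
    ranked = sorted(store, key=lambda item: cosine(query, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]


question = "what is the backup policy"
context = retrieve(question)

# The retrieved text is placed in the prompt that would be sent to a language model.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

A real deployment would swap the toy `embed` for the chosen model and the list for the chosen vector store; the retrieve-then-prompt flow stays the same.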
Our integrated conversational AI experience, called View Assistant, takes full advantage of these repositories to provide a fast, accurate experience while interacting with and asking questions of your data. While no conversational AI experience can guarantee 100% accuracy, having the trifecta (metadata, graph, vector) available allows us to creatively search and refine results, and to keep improving accuracy and performance over time. To try what we talked about in this blog post, join our beta.