By: Christopher Stephenson and Dr. Russ Couturier
Back in January, many business analysts and data science pundits proclaimed that 2019 would be the year artificial intelligence went mainstream. So far, the indicators appear to support this assertion. But as more businesses begin to deploy and operationalize AI, its obstacles, limitations, and pitfalls are becoming more apparent.
For example, machine learning (ML), which is all the rage these days, suffers from fluid landscapes and ‘data amnesia’. ML requires a corpus of data as a starting point for building a model. But when it comes time to update or modify the model, a complete retraining is typically required. That means backtracking to the original starting point and negating any learning that has previously taken place. This is one of the main reasons building AI models can take weeks and sometimes months. Imagine if every time you improved your golf swing, you needed to relearn everything you previously knew about golf. As an aside, this might actually improve our golf scores, but we will leave that for a future blog post. Because the originating corpus represents only a single point in time, many applications need ongoing and sometimes frequent retraining to stay in line with current data and trends. Of course, you can always add computing power and crank up the GPUs to speed things up, but the costs quickly get out of control.
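To make the contrast concrete, here is a minimal Python sketch. It assumes a toy word-frequency "model" rather than a real ML pipeline, and the function names `full_retrain` and `incremental_update` are hypothetical, made up purely for illustration. It shows that a full retrain has to re-read the entire corpus, while an additive update touches only the new batch.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer; a deliberately crude assumption."""
    return text.lower().split()

def full_retrain(corpus):
    """Rebuild the model from scratch by re-reading every document."""
    model = Counter()
    docs_read = 0
    for doc in corpus:
        model.update(tokenize(doc))
        docs_read += 1
    return model, docs_read

def incremental_update(model, new_docs):
    """Fold only the new documents into an existing model, in place."""
    docs_read = 0
    for doc in new_docs:
        model.update(tokenize(doc))
        docs_read += 1
    return model, docs_read

original_corpus = ["the model learns from data", "data changes over time"]
new_batch = ["new data arrives daily"]

# The model as it stood before the new data arrived.
base_model, _ = full_retrain(original_corpus)

# Option 1: start over and re-read everything, old and new.
retrained, cost_full = full_retrain(original_corpus + new_batch)

# Option 2: keep what was learned and read only the new batch.
updated, cost_incr = incremental_update(Counter(base_model), new_batch)

print(cost_full, cost_incr)   # documents read by each approach
print(retrained == updated)   # whether both approaches yield the same model
```

In this toy case both routes end with the same counts, but the full retrain reads three documents and the update reads one. Simple counts merge cleanly; most real ML models do not, which is exactly the difficulty described above.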
An additional, less-discussed obstacle associated with traditional ML is something we call ‘photo-roll paralysis’. Some of you may be old enough to recall the days when cameras held physical rolls of film that allowed you to take a finite number of pictures (usually 24 or 36). This meant that you put a lot of thought into each picture, because you wanted to make sure every shot was worth it. In the digital world, we have the luxury of snapping as many pictures as we want, knowing we can pick the best ones later and ignore the rest. The current ML approach is analogous to the photo-roll scenario. Because of the time and effort it takes to build models using current methods, data science teams face a lot of pressure to get things right on the first try. Under this pressure, they often go to excessive lengths to identify the perfect question(s) and build the ideal model. Much time and energy are lost in the process, and some projects stall indefinitely. Thankfully, emerging technologies and approaches allow us to update models in an iterative, additive way, without the time and cost associated with the current ML traps.
Topos Labs is pioneering such an approach as part of a cognitive content intelligence platform called Gracie. Drawing on conventional databases, neural networks, and web-sourced feature vectors, Gracie builds on a newly minted global language ontology model (GLO), a mathematical representation of human language that serves as the foundational starting point for all new models. Think of GLO as a periodic table for classifying the multivariate relationships that exist in language. With GLO, Gracie can use a sequence of words or phrases to predict the next word or phrase. And as new landscapes emerge, human-augmented models are compared and merged with the GLO, increasing the accuracy of existing models while creating new ones that are embedded within the intelligence. The result: elastic, near-real-time model building that saves weeks of time and enables rapid, iterative model experimentation.
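The two ideas in this paragraph, predicting the next word from what came before and merging a new model into a foundational one, can be illustrated with a toy bigram counter in Python. This is a sketch under loud assumptions: it is not GLO and not how Gracie is implemented, since this post does not describe those internals. The class `BigramModel` and its methods are hypothetical names invented for the example.

```python
from collections import Counter, defaultdict

class BigramModel:
    """Toy next-word predictor: counts which word follows which."""

    def __init__(self):
        # Maps each word to a Counter of the words seen right after it.
        self.following = defaultdict(Counter)

    def train(self, text):
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.following[prev][nxt] += 1

    def merge(self, other):
        """Additively fold another model's counts into this one."""
        for prev, counts in other.following.items():
            self.following[prev].update(counts)

    def predict(self, word):
        """Return the most frequent follower of `word`, or None if unseen."""
        counts = self.following.get(word.lower())
        if not counts:
            return None
        return counts.most_common(1)[0][0]

# A stand-in "foundation" model (an assumption, far simpler than GLO).
foundation = BigramModel()
foundation.train("machine learning models need data")
foundation.train("machine learning is popular")

# A small model built separately for a new landscape.
domain = BigramModel()
domain.train("golf swing practice improves golf swing")

before = foundation.predict("golf")   # the foundation has never seen "golf"
foundation.merge(domain)              # no retraining on the original text
after = foundation.predict("golf")
print(before, after)
```

Before the merge the foundation model has no prediction for "golf"; after it, the model predicts "swing", and its earlier knowledge (for example, that "learning" follows "machine") is untouched. A production system would need far richer representations than word-pair counts, but the additive shape of the update is the point.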
Thankfully, the days of long build times are numbered. Data amnesia will soon be a distant memory, and photo-roll paralysis will be out of the picture for good.