Written by — Jeremias "Jerry" Shadbolt, ML Engineer
Introducing Recordly’s three-piece blog series on AI - an attempt to map the ever-growing jungle of AI, drive awareness, and perhaps offer valuable insight to those not yet deep in AI about what all the fuss is about. The series consists of three posts. The first lays the groundwork: what are we even talking about here, and why should I care? The second builds upon this by touching on the technical requirements for applying advanced analytics in your daily operations, and the third focuses on organization-wide AI adoption for strategic growth - a non-technical approach.
Recordly specializes in all three topics covered; should you be interested in either mapping your data/AI capabilities or even implementing your own solution, do not hesitate to contact us. We’re here to help.
Hopefully the first entry of this series - the why - left you feeling inspired. Let us now continue mapping the AI jungle by touching upon the how. We will discuss topics such as problem definition and execution, understanding data requirements, establishing strong leadership, and meaningful evaluation. Each of these topics is far too extensive to be explored thoroughly here, yet each is crucial to include, even if only by scratching the surface. This entry therefore aims to deliver the main points in a clear and understandable manner.
We’ll kick things off with an example - one of my personal favorites:
When Steve Jobs, the late co-founder and CEO of Apple, came up with the idea for the iPod, he started with the problem, not the solution. Jobs understood that people didn’t necessarily want a new gadget—they wanted a better way to listen to and carry their music. Before the iPod, digital music players existed, but they were clunky, difficult to use, and offered a poor user experience. The core insight was this: people wanted a simple, elegant way to have their entire music collection in their pocket. Jobs wasn’t interested in pushing the latest hardware or software; instead, he asked, “What is the problem we are trying to solve?”
“What is the problem we are trying to solve?”
The iPod emerged as a solution to that problem, driven by the user’s need for convenience, simplicity, and enjoyment. The iPod was revolutionary not because it had the best technical specifications (initially, it didn’t), but because it was designed to be intuitive, aesthetically pleasing, and deeply integrated with iTunes, which simplified the process of buying, organizing, and listening to music. A technology-first approach might have focused solely on building the most advanced MP3 player, packed with features that most users didn’t care about or found too complex. Instead, Jobs prioritized a seamless, delightful experience, making the device’s simplicity its standout feature.
Apple’s approach was to use cutting-edge technology only when it enhanced the user experience. For instance, they utilized a compact hard drive (from Toshiba) to store thousands of songs, and a simple, intuitive click wheel for navigation. These technological decisions were driven by the need to make the device elegant, efficient, and user-friendly, not by a desire to showcase advanced hardware for its own sake.
This is all closely related to the first step in any data/AI project: problem definition and execution. You can’t go around with a hammer looking for nails. Instead, when figuring out ways to improve your business operations, the approach should rather be ‘what are we trying to achieve’ than ‘where can we apply AI’. Whilst advanced analytics and ML can be, and definitely are, helpful in a lot of cases, there is no point in overcomplicating something that could be solved with a simpler and more maintainable solution.
This gets us to the execution part, where I’d like to propose we K.I.S.S. - Keep It Stupid Simple (Figure 1).
Figure 1. Cover of You are Solving the Wrong Problem: Retrain your brain to reframe the problem in the age of AI by Broemmer, D. [1]
This tendency to overcomplicate has been especially dominant with GenAI, where every application and business tries to build GenAI into their product even when it isn’t needed, or when the problem is much better solved with other methods. Take the ice cream sales example from the first entry: is the problem better solved with a complex RAG-based LLM chained to a time series model that generates ‘based on your historical sales and tomorrow’s weather forecast, you’ll sell three ice creams tomorrow’, or by a simple dashboard which shows both the historical data and the forecast for the next day? More often than not, a simple statistical model can not only outperform more complex models, but is also cheaper and faster to implement and easier to maintain and build upon - in other words, add complexity only as it’s needed. Another example could be customer classification based on activity: instead of building a complex deep neural network classifier, a simple statistical analysis could do the same job, faster and better.
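To make the ‘keep it stupid simple’ point concrete, here is a minimal sketch of the kind of baseline worth trying first: a seasonal-naive forecast that predicts tomorrow’s ice cream sales from the average of the same weekday in the sales history. The data and column names are invented for illustration; the point is that a few lines of code can set a surprisingly strong bar for any fancier model to beat.

```python
import pandas as pd

# Toy sales history: one row per day (values invented for illustration).
sales = pd.DataFrame({
    "date": pd.date_range("2024-06-01", periods=28, freq="D"),
    "units_sold": [12, 15, 30, 34, 18, 14, 13] * 4,
})
sales["weekday"] = sales["date"].dt.day_name()

# Seasonal-naive baseline: tomorrow's forecast is the average of the
# same weekday over the observed history.
tomorrow = sales["date"].max() + pd.Timedelta(days=1)
forecast = sales.loc[sales["weekday"] == tomorrow.day_name(), "units_sold"].mean()
print(f"Forecast for {tomorrow.date()} ({tomorrow.day_name()}): {forecast:.0f} units")
```

If a more complex model cannot clearly beat a baseline like this, the added complexity is probably not paying for itself.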
While this may sound overly pessimistic and give off some bah, humbug! vibes, it is related to the point made in the earlier entry: the focus should be on solving the business problem effectively, rather than on using the newest or most sophisticated technology available. While the advancements, especially in the GenAI sector, are astounding and mind-boggling, they, exactly like any other AI solution, only solve a well-defined, specific problem suited to them.
Our exceptional Data/ML Engineer Toivanen, R. and Data Architect Heino, M. further explain the basics of GenAI here. They even go slightly deeper in this entry.
Keeping it stupid simple is not only about getting rivalling results faster, but also about infrastructure, time-to-market, and maintainability. Our earlier blog by ML Engineer Kivinen, J. touches upon and explains these topics.
Once the problem has been defined and an initial proposition for a solution has been made, the next step is the most crucial one: finding out whether the data capabilities match the business requirements. As mentioned in the previous entry, AI is just a tool to automate making conclusions from data; Garbage In - Garbage Out. From my experience, there are two main reasons why even the most valiant of initiatives fail: not enough data, or not the right data. A third and equally important reason relates to the earlier point about scope and definition: trying to solve too much at once and not being clear on the why.
Even though the topic of data has been discussed on many occasions and is probably getting repetitive, it is still too important to leave out. So bear with me as I take my turn in trying to explain the importance of data (Figure 2).
Figure 2. Bearing with me. Generated with ChatGPT by the author.
To further understand the importance of data, let us first define what data actually is. The term data is the plural form of datum, which refers to a single piece of information. In a broader sense, data encompasses any set of values, observations, or measurements collected through research, experimentation, or daily activities. These values can be numbers, text, images, videos, or any form of information that can be captured, stored, and processed. In other words, data are representations of various events that can ultimately be expressed as numbers.
Data can be categorized into many different types.
However many forms of data exist, there is a common factor: when fed to any model, whether GenAI or linear regression, the data are in, or are turned into, numerical format. Video, text, image, everything. The data needs to be represented numerically so that these models can do their job, as they are based on maths. So, for example, your favorite cat video, which is essentially a sequence of image frames, can be represented as a 4-dimensional tensor, or a multi-dimensional list of lists of numbers. Similarly, audio is typically represented as a time series of numerical values corresponding to sound wave amplitudes. Funky and not at all complicated, am I right?
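As a minimal sketch of what this looks like in practice (using NumPy, with made-up shapes and random values standing in for real content), a short video clip and an audio snippet both end up as plain arrays of numbers:

```python
import numpy as np

# A 2-second, 30 fps colour video clip at 64x64 resolution becomes a
# 4-dimensional tensor: (frames, height, width, colour channels).
video = np.random.randint(0, 256, size=(60, 64, 64, 3), dtype=np.uint8)

# One second of audio sampled at 16 kHz becomes a 1-dimensional time
# series of amplitudes.
audio = np.random.uniform(-1.0, 1.0, size=16_000)

print(video.shape)  # (60, 64, 64, 3)
print(audio.shape)  # (16000,)
```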
Now here’s the real kicker: the data, or independent variables, have to explain some variance of the dependent variable. In simple terms, when forecasting ice cream sales, we assume certain factors — like the day of the week and the weather — play a role in influencing people's decisions to buy ice cream. For instance, hot and sunny days or weekends (the independent variables) might encourage more ice cream purchases (the dependent variable). In machine learning, these variables are more commonly known as features, and they help the model make predictions.
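Here is a minimal sketch of the same idea in code, assuming scikit-learn and a handful of invented observations: temperature and a weekend flag are the features, units sold is the dependent variable, and a plain linear regression learns how the features relate to sales.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented observations: temperature (°C) and a weekend indicator are
# the independent variables (features); units sold is the dependent
# variable (target).
X = np.array([
    [18, 0], [25, 0], [30, 1], [22, 1], [15, 0], [28, 1], [20, 0],
])
y = np.array([40, 95, 180, 120, 25, 160, 60])

model = LinearRegression().fit(X, y)
print(model.predict([[27, 1]]))  # predicted sales for a hot weekend day
```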
However, in real-world data, there is often a lot of noise—random, irrelevant information that doesn’t help in predicting the outcome. Machine learning aims to distinguish this useful information (signal) from the noise. In a more complex example, a model might use hundreds of features, from social media activity to economic indicators, to identify patterns that are not immediately obvious. The model's job is to find these hidden patterns and focus on what truly impacts the predictions. This is kind of like preparing for an exam and focusing on the parts that really matter.
Well, how do we know we focused on the parts that really matter? Through out-of-sample evaluation, or in other words, a pop quiz of sorts. You begin with the allotted course material - the training data. You then validate what you’ve learned with exercises - the validation data. Then a final test is done with a pop quiz from the teacher, which consists of questions you haven’t seen beforehand but are related to the given material - the test data. Once you’ve done well enough on the quiz (i.e., we’re satisfied with the initial results on the test set), we can move on to testing in a real-life setting: production, or the actual exam. Should we do well here too, we can conclude that the model focused on the right things and that our training data accurately reflects production data. This is all closely tied to evaluation, which will be discussed shortly.
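In practice, the course material / exercises / pop quiz split often looks something like the sketch below (assuming scikit-learn and a made-up dataset; the 70/15/15 proportions are just one common choice, not a rule):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up dataset: 1000 examples, 5 features each, binary labels.
X = np.random.rand(1000, 5)
y = np.random.randint(0, 2, size=1000)

# Course material (training), exercises (validation), pop quiz (test):
# roughly a 70 / 15 / 15 split.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 700 150 150
```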
To conclude the points thus far: in order to kick off your data-driven project, you need a clearly defined problem, a solution kept as simple as the problem allows, and enough of the right data - data that actually carries signal about the outcome you care about.
Next up is leadership and evaluation.
Leaders with hands-on experience in their field possess a unique advantage: a deep understanding of the day-to-day realities, technical nuances, and operational challenges that their team faces. Such leaders can make more informed decisions. They can anticipate challenges, understand what is feasible, and identify realistic solutions because they know the specifics of the work. If the one leading your AI project only has experience in pressing an AI magic button that does inexplicable AI magic, I’m willing to bet they do not bring as much value and guidance to an AI project as someone who has actually built such magic buttons before. This practical knowledge reduces the risk of impractical or misguided strategies that could arise from a lack of understanding of the field, thus increasing the probability of success.
Experienced leaders can quickly diagnose issues because they have likely encountered similar problems before. Their background allows them to draw from a repository of solutions and adapt past strategies to new challenges. They can also offer meaningful support and guidance to their team, helping to resolve issues more efficiently.
Furthermore, their strategic vision is grounded in reality; hands-on experience provides a foundation for creating strategies that are both visionary and practical. Leaders can set ambitious goals while understanding the practical steps required to achieve them - this balance of vision and practicality helps align the team's efforts with achievable objectives, reducing the risk of misalignment between strategy and execution.
For instance, at Recordly, our advisors and architects bring real-world experience to the table — every one of them has technical understanding and experience which they apply in their day-to-day work in advising clients, planning solutions, and overseeing operations. They, similar to our other technical experts, ensure that the proposed solution makes sense and drives real value. While it would be great to just sell the latest and shiniest new technologies and drive tons of revenue by overpromising and overcomplicating, that’s just not how we do things. We are not only capable, but also motivated to ensure the solution makes sense and is the best fit for your specific needs. Even if it makes the project a little smaller.
Figure 3 by LWF Academy [2].
In the context of the requirements of AI solutions, a leader with experience in technical aspects is better positioned to define the problem clearly, set realistic expectations, guide the project effectively, and communicate both problems and results in a clear manner. The prerequisite here, however, is that the leader understands the problem to be solved. That is where you, the domain expert, come in: while advanced analytics and AI are generally applicable almost anywhere, we can’t help you unless we understand you. Luckily, we’ve got experience in clear communication and workshopping, which helps us do exactly that.
Last, but definitely not least, are evaluation metrics: you can’t improve what you can’t measure. In order to successfully kickstart an AI project, it is necessary to define success criteria and metrics which quantify that success. Whether this is the accuracy of the model or some third-party business metric, some numeric representation of the results is crucial.
Whilst the quantitative results of a solution are important, so is the success of the project as a whole. As said earlier, you can’t go around with a hammer looking for nails - even if you’ve got the best and shiniest hammer, if you can’t use it meaningfully, it’s practically useless. Sure, you can show it off to your buddies, but without a real use case, that’s usually all you can do. Hence, while it is important to quantitatively assess the hammer’s performance, you also need to take into account the work the hammer was used to achieve. The project requires success criteria of its own. Does the shed the hammer was used to build drive value? Is it being used and enjoyed? Or is it rotting away in your yard, not doing anything good for anyone?
As an example of assessing the solution (not the project, the solution), take medical imaging. How does an accuracy of 95% in detecting cancer from images sound? Too good to be true, I’m hoping. Whilst it is crucial to maintain a high level of accuracy in a setting as critical as medicine, it is also important to understand the underlying reason behind the metric.
Accuracy is simply the percentage of correct predictions (true positives and true negatives) out of all predictions made. However, in cases of highly imbalanced datasets — like cancer detection, where the majority of scans might not show cancer — accuracy can be misleading. For example, if a dataset consists of 95% healthy patients and 5% cancer patients, a model that always predicts "no cancer" would still achieve 95% accuracy, even though it miserably fails to identify cancerous cases. This is why robust evaluations are important, as they focus on how well the model detects true positives (cancer cases) without being misled by a dominant class. By thoroughly evaluating a solution with the problem in mind, we can avoid catastrophic mistakes in a production setting.
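The imbalance problem is easy to demonstrate. Below is a minimal sketch (using scikit-learn metrics and an invented 95/5 split) in which a model that always predicts “no cancer” scores 95% accuracy while catching zero actual cases:

```python
from sklearn.metrics import accuracy_score, recall_score

# Invented screening results: 95 healthy patients (0) and 5 cancer patients (1).
y_true = [0] * 95 + [1] * 5

# A useless "model" that always predicts "no cancer".
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))  # 0.95 -- looks great on paper
print(recall_score(y_true, y_pred))    # 0.0  -- misses every single cancer case
```

This is why metrics such as recall, precision, or class-weighted scores are typically reported alongside accuracy whenever the classes are imbalanced.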
In the case of customer segmentation, there may not be clear labels or ways to identify how well we’ve segmented the customers; we can’t directly ask them whether they like cats or dogs, even if that’s the conclusion we’ve arrived at. We need to measure success in some other way, which brings us to third-party metrics, or proxy evaluations. In order to determine whether our segmentation is correct or not, we can, for example, run an A/B test where we target advertisements at an unsegmented audience and at the segmented audience; we can then draw supporting or contradicting conclusions about our segmentation accuracy based on an external metric such as sales. This handily assesses both the technical implementation and the project itself - the hammer and the shed.
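A minimal sketch of such a proxy evaluation, assuming SciPy and entirely invented conversion counts: a chi-square test on a 2x2 table of conversions tells us whether the segmented targeting behaved measurably differently from the unsegmented one.

```python
from scipy.stats import chi2_contingency

# Invented A/B test results: [converted, did not convert] for ads shown
# to the segmented audience vs. an unsegmented audience. Sales act as
# the external, proxy metric for segmentation quality.
segmented = [120, 880]
unsegmented = [90, 910]

chi2, p_value, dof, expected = chi2_contingency([segmented, unsegmented])
print(p_value)  # a small p-value suggests the segmentation changed behaviour
```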
Regarding A/B testing, an even more advanced and robust method exists, called multi-armed bandit testing. That we’ll save as a discussion for another time - perhaps the next Recordly afterwork?
Once robust evaluation is established for both the solution and the project, it is also crucial to use said evaluation. This goes beyond answering the initial question of ‘how did we do?’ and asks another question: ‘why did we perform the way we did, and what can we do about it?’. Robust evaluation allows us to inspect the what and the why through quantitative and qualitative analysis. An AI project is not necessarily a failure if it reveals that there is no significant pattern or predictive signal in the data. Instead, it can be seen as a valuable outcome in its own right.
Understanding that your current data doesn’t contain the necessary signal allows you to refine or even redefine your problem. It might indicate that you need to reconsider the variables you’re measuring, or that the context you’re examining isn’t where the action is happening. For instance, if you’re analyzing customer data and find no predictive power in the demographic variables you selected, it may suggest that behavioral data (e.g., browsing patterns or purchase history) could be more relevant.
Further, finding no signal helps you understand which sources of data are uninformative, allowing you to redirect resources. It acts as feedback for your data strategy, informing you that additional or different types of data are necessary to answer the business question. You can now focus on collecting new data that are more likely to contain the signal you’re looking for.
One of the biggest risks in an AI project is basing decisions on assumptions about the data that turn out to be false. If your analysis reveals a lack of signal, it challenges preconceived notions and forces stakeholders to reconsider what they believe drives the outcome. This process is often overlooked but can be highly impactful, as it prevents the company from acting on misguided insights.
Even if you don’t find a signal, the project contributes to the organization’s understanding of its own data. It adds a new piece to the puzzle and improves your overall knowledge base. This can be helpful for future projects, especially if your findings help narrow down where the signal is likely to be.
In cases where there is no discernible signal, pushing forward with an AI solution anyway will lead to unreliable predictions and costly decisions. Knowing that the current data are inadequate prevents you from deploying flawed models, avoiding potential business risks.
To summarize this entry, a successful AI project requires knowledgeable leadership, a well-defined problem, enough good data, and robust evaluation metrics. In the next blog, we’ll discuss organization-wide AI adoption for strategic growth.
If I’ve managed to spark your curiosity, and you’d like to further discuss or explore use cases specific to your business, please do not hesitate to contact us - we’re here to help you take your business to the next level.
[1] Broemmer, D. (2023). You are Solving the Wrong Problem: Retrain your brain to reframe the problem in the age of AI. TBG Publishing. https://www.researchgate.net/figure/The-artificial-intelligence-landscape-A-Venn-diagram-providing-a-holistic-view-of-the_fig2_378676909
[2] LWF Academy. (n.d.). Leadership Skills In Sport. LWF Academy. https://lwfacademy.com/wp-content/uploads/2020/09/liderazgo-cursos-lwfacademy.jpg