2 Important Data Management Principles for Supply Chain from a Top Data Scientist
Irit Sella, PhD, Bjorn Vang Jensen, and Daniel Nachum spoke at Windward’s latest webinar: “Optimize Your Logistics Operations with Robust Data Management.” Bringing together a data science team lead, a supply chain expert, and a product expert produced an insightful conversation. Click here to watch the webinar.
Logistics companies focus heavily on implementing AI and other technologies to boost growth and cut costs. But their data is often an obstacle that makes this difficult.
The experts provided valuable insights into best practices for data governance, the importance of high-quality and fresh data, and cost-efficient data management strategies to create thriving supply chain operations.
This blog post focuses on two recurring principles from the webinar, both emphasized by Irit, Windward’s Data Science Team Lead and an expert who builds Gen-AI based products for Windward. Stay tuned for webinar snippets from all the participants on our social media channels, coming soon…
Every Convo That Starts with AI Ends with Data
It seems that “AI” and “generative AI” (gen AI) turned into buzzwords in the supply chain/logistics ecosystem following ChatGPT’s explosive arrival on the scene. The race to brand every new solution as “AI” and include generic AI solutions in the conversation has muddied the waters a bit.
When people use the acronym “AI,” they are typically referring to a smart algorithm, according to Irit. This could be a machine learning model, a deep learning model, or another type of model.
When fed data, the model provides some type of advantage: analysis or insights, something interesting and non-trivial that is more than just data points. This does NOT minimize the importance of data. Just as a house cannot stand for long with a weak foundation, a supply chain AI solution with dirty data will quickly prove to be useless.
Irit notes that AI solutions use data as their foundation, often in large amounts, possibly from several sources and of several types. “The types of algorithms and processing are important, but the fundamental and crucial element is the data and its quality. There’s a famous saying, ‘garbage in, garbage out’: no matter how sophisticated your algorithm is, if you give it garbage data, you’ll get garbage output,” said Irit.
As in every other field, data quality strongly affects how effective AI is for freight logistics. “The data that underlies the models must be as complete, coherent and clean as possible to extract complete, coherent and correct conclusions from it,” according to Irit.
Want to learn about Windward’s data differentiators and philosophies? Check out our newly updated Executive Brief taking you inside our “engine room.”
Data ALWAYS Comes Dirty
Daniel Nachum, Windward’s Product Marketing Manager for the supply chain, brought up an example of dirty data during the webinar.
A shipment was intended for Egypt. The carrier originally indicated the freight was headed towards the port of El-Dekheila in Egypt, but then showed the port of Piraeus, which is almost 800 kilometers away, as the port of destination (PoD) for eight consecutive days. Then El-Dekheila, where the shipment eventually arrived, was again named as the PoD.
Are inconsistent port names caused by dirty data?
“One of the most important sentences I heard in my career was from my manager at Windward. It was, ‘Data always comes dirty.’ There’s no bypassing that,” said Irit.
There are good data sources that provide mostly clean data, which is definitely something to aim for when you decide on a data provider, but some “messy data” will come your way, especially because the domain is inherently messy.
In the specific example above, the carrier website is the source of the data, and when there’s a mess there, which is not uncommon, it will reach the system.
Irit offered an important note: it can be extremely difficult to differentiate between real changes in PoDs and data errors that appear to be PoD changes. “When we encounter difficult cases, we incorporate expert feedback, a human-in-the-loop, to make the determination. It’s part of our job to recognize when a human is needed in the loop, and to give that added value, so that the data that reaches our customers is as clean as possible.”
An added benefit: after obtaining extensive experience, these experts-in-the-loop often learn and observe things that can be added to improve Windward’s automated processes. This improves our AI models.
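To make the El-Dekheila/Piraeus pattern concrete, here is a minimal Python sketch of how a PoD that changes and then changes back could be surfaced for expert review. It is an illustration only, not Windward's actual method: the function name flag_pod_reversions and the daily series are hypothetical, and only the eight days of Piraeus come from the example above (the three days before and four days after are invented). The sketch does not decide whether a change is real or a data error; it only routes the suspicious window to a human.

```python
from itertools import groupby


def flag_pod_reversions(daily_pods):
    """Find stretches where the reported PoD differs from the port
    named both immediately before and immediately after the stretch.

    Returns a list of (first_day_index, last_day_index, interim_port).
    Hypothetical helper for illustration; not a Windward API.
    """
    # Collapse the daily series into runs: (port, first_day_index, length).
    runs = []
    day = 0
    for port, group in groupby(daily_pods):
        length = len(list(group))
        runs.append((port, day, length))
        day += length

    flagged = []
    # Slide over consecutive triples of runs looking for A -> B -> A.
    for prev, cur, nxt in zip(runs, runs[1:], runs[2:]):
        if prev[0] == nxt[0] and cur[0] != prev[0]:
            flagged.append((cur[1], cur[1] + cur[2] - 1, cur[0]))
    return flagged


# Assumed series: the 3 and 4 day counts are made up for the example.
reports = ["El-Dekheila"] * 3 + ["Piraeus"] * 8 + ["El-Dekheila"] * 4
review_queue = flag_pod_reversions(reports)
print(review_queue)  # [(3, 10, 'Piraeus')]
```

A rule this simple will also flag genuine reroutes that are later reversed, which is exactly why the flagged window goes to an expert rather than being corrected automatically.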
Want more webinar wisdom? Check it out here!