The Center for Data Innovation spoke with Dagan Xavier, co-founder and chief product officer of Label Insight, a Chicago-based company that provides product data for consumer packaged goods. Xavier discussed how better data can empower consumers to make better choices.
This interview has been edited.
Daniel Castro: What was your motivation for founding Label Insight?
Dagan Xavier: I started Label Insight to help my dad find the foods he needed for his health condition at the time. When I realized that the insight we had generated could help many others with similar needs, that’s when I felt we had a business idea. I was motivated to be first to market and first to disrupt. Someone was going to do it, and I wanted it to be me. So when we’d be laughed out of boardrooms or told that no one cares about attributes and ingredients, it just made us more motivated to succeed and prove them wrong.
Castro: How does Label Insight use data to empower consumers to make better decisions?
Xavier: We aim to answer all the questions that could be asked of a single product by designing our data capture and transformation process with this in mind: Let the data tell us the answers instead of asking the data questions. In order to do this effectively, we’ve had to capture all the written information on packages and deconstruct it. By deconstructing this data, we can process it down to its most independent and effective level. For example, this means looking at every individual ingredient, every parenthesis, every nutrient, and every unit of measurement. By creating this base layer of data, we are able to organize, categorize, and tag it. This is essentially the process of creating building blocks that can then be used to create millions of custom configurations, which enables consumers to answer any question about a single product.
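As a rough illustration of what deconstructing an ingredient statement "down to every parenthesis" could look like, the sketch below splits a label's ingredient text into top-level ingredients and their parenthesized sub-ingredients. It is an editorial example, not Label Insight's process; the function names `split_ingredients` and `deconstruct` and the sample label text are assumptions made for illustration.

```python
def split_ingredients(text):
    """Split on commas that sit outside parentheses (hypothetical helper)."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            # A top-level comma ends the current ingredient.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return [p for p in parts if p]


def deconstruct(item):
    """Return (name, sub_ingredients) for one ingredient, recursing into parentheses."""
    item = item.rstrip(". ")
    if "(" in item and item.endswith(")"):
        name, inner = item.split("(", 1)
        subs = [deconstruct(s) for s in split_ingredients(inner[:-1])]
        return name.strip().lower(), subs
    return item.strip().lower(), []


# Made-up label text, for illustration only.
label = "Enriched flour (wheat flour, niacin), sugar, salt."
parsed = [deconstruct(i) for i in split_ingredients(label)]
```

Each ingredient and sub-ingredient becomes its own record, which is the kind of base layer that can later be tagged and categorized.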
Castro: How have changing consumer shopping trends impacted how brands and retailers use product data?
Xavier: Examples of some of the latest attributes we’ve created include: “contains mint ingredients” in personal care; “contains scallion ingredients,” “Bulgarian style,” and “rope form (sausages)” in food and beverage; “contains ozone” in household cleaners; and “contains antler” in pet food.
Many of these examples may sound strange, but they are based on how we as consumers are searching on various e-commerce platforms. We’re able to aggregate all the search terms that consumers are using and link them to the product category and the attribute. So when you know 24,000 things about every product and marry that up with what consumers are searching for in each product category, you can come up with some interesting, informative, and impactful insights!
Obviously, brands are looking to better understand these insights and how their portfolio and competitors are tracking against them. The hunger for data and insights is at an all-time high, and that is purely based on the impact that COVID-19 has had on e-commerce.
We’re seeing a lot more emphasis placed on product innovation, shortening the product development cycle, and ensuring that new product development is focused on the growth drivers. This is also expanding into retail with private label brands (PLB) making a big push. Look at Target’s latest PLB or the fact that Amazon now has its own PLB. It’s getting competitive out there, and brands are looking for any edge they can get on the digital shelf.
Castro: How do you ensure consistent data quality?
Xavier: It may sound counterintuitive, but the key to scalable data quality is more and more data. Our process of deconstructing data down to its rawest form is what creates such a large base layer of data for us to then leverage.
We can use the deconstructed data to profile products based on nutrients, ingredients, category, and even brand. At that point it becomes mathematical: any data element that falls more than a standard deviation from the norm for its profile is flagged and queued up for inspection. If it’s wrong, we’re able to find it and fix it. If it’s right, then we have more data to fuel and teach our models.
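The statistical check described here can be sketched in a few lines. This is an editorial illustration under stated assumptions, not Label Insight's implementation: the function name `flag_outliers`, the one-standard-deviation default, and the sodium figures are all hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=1.0):
    """Return keys whose value lies more than `threshold` standard deviations
    from the mean of the profile (hypothetical helper)."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)
    if sigma == 0:
        # All values identical: nothing stands out.
        return []
    return sorted(k for k, v in values.items() if abs(v - mu) > threshold * sigma)


# Made-up sodium values (mg per serving) for products in one category.
sodium_mg = {"A": 480, "B": 500, "C": 520, "D": 510, "E": 4900}
flagged = flag_outliers(sodium_mg)  # product "E" is queued for inspection
```

A flagged value is only a candidate for review: as the answer notes, it may turn out to be correct, in which case it simply adds to the profile.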
When you have more than 24,000 attributes on every product, you can use those attributes against each other to flag potential issues. For example, a product may carry a vegan certification but also contain a meat ingredient, or carry a “nothing artificial” claim but contain an artificial preservative. Creating a repository of conflicting attributes helps us scale our data quality efforts across our database to ensure our customers are leveraging the most accurate data in the market.
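A repository of conflicting attributes can be pictured as a list of attribute pairs that should never appear together on one product. The sketch below is an editorial illustration only; the attribute names, the `CONFLICT_RULES` list, and the `find_conflicts` function are assumptions, not Label Insight's schema.

```python
# Hypothetical pairs of attributes that contradict each other.
CONFLICT_RULES = [
    ("vegan_certified", "contains_meat_ingredient"),
    ("nothing_artificial_claim", "contains_artificial_preservative"),
]


def find_conflicts(attributes, rules=CONFLICT_RULES):
    """Return every rule whose two attributes are both present on the product."""
    return [(a, b) for a, b in rules if a in attributes and b in attributes]


# Made-up product record, for illustration only.
product = {"vegan_certified", "contains_meat_ingredient", "gluten_free"}
conflicts = find_conflicts(product)
```

Because each rule is just data, new contradictions can be added to the list without changing the checking logic.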
Castro: What challenges do you hope to tackle in the future as you expand to cover more products and brands?
Xavier: For me, it’s about creating exceptional value for our customers. It’s important that we help not only deliver insights but that we help our customers save time and money. So to do that, our products need to provide end-to-end solutions to our customers’ core problems.
I think product innovation and the product development lifecycle are ripe for disruption. Helping brands identify the signals, validate the signals, test and sandbox the formulation to meet the signals, set up and syndicate their data to retail, and ensure maximum SEO—to me, that is the entire workflow. And to truly bring value, we’ve got to nail that workflow.
If we can do that, then we don’t just help brands sell more products, we also help consumers discover more of the products that they want. That takes me back to the reason why I started this company. Any time I can connect an idea to the original use case, I believe it’s a good idea because who doesn’t want to help people make better choices?