The Center for Data Innovation spoke with Dave Woollard, Senior Vice President of Technology Strategy and Innovation at Standard AI, a company based in San Francisco that specializes in autonomous checkout technology for retail stores. Woollard discussed the challenges of retrofitting stores for autonomous checkout and the future of synthetic data.
Becca Trate: Standard AI helps brick-and-mortar retailers offer autonomous checkout. What is autonomous checkout, and how does it work?
Dave Woollard: When you visit a store, it’s a common scenario to pick up various items, each labeled with a barcode, such as a UPC (Universal Product Code) or an EAN (European Article Number). These barcodes serve as identifiers that inform the store about the products you’ve selected. Traditionally, during checkout, a cashier scans these items, entering each barcode into the point-of-sale (POS) system, which then generates your receipt. This method has been in use for over 50 years.
Autonomous checkout removes the cashier and the need for manual scanning from this process, without changing much else. Our goal is to modernize the checkout experience by integrating computer vision and machine learning. The shopping experience remains familiar: you shop in the store as you always have. However, instead of presenting your items to a cashier, enduring potential waiting times in lines, or even scanning each item yourself at a self-checkout station, our systems are already aware of the contents of your cart. All you have to do is simply leave the store.
Trate: How does computer vision technology identify and track the products that people select from the shelves?
Woollard: We offer several systems designed to identify products in shoppers’ hands and on shelves, ultimately determining what someone has picked up or put back. We achieve this through various methods. Firstly, we are primarily a mapping company that specializes in creating maps of retail environments. This enables us to pinpoint the locations of specific items within a store, helping us understand where shoppers are as they move around.
As a computer vision company, especially one deploying cameras in public spaces, we prioritized privacy from the outset of our design process. We track individuals in retail environments without resorting to techniques like facial recognition. Instead, we rely on movement to distinguish one person from another.
When we determine what someone has in their possession, whether it’s placed in their bag or taken off the shelf, we rely on a combination of signals. Firstly, we use positional data since we know both the shopper’s location and the item’s placement. Secondly, we incorporate visual cues. We maintain extensive catalogs of product imagery, which we develop in-house. This visual data plays a crucial role in our process of identifying the products shoppers have taken.
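As an illustration only, the combination of positional and visual signals described above could be sketched as a weighted score. This is not Standard AI's method: the `Candidate` class, the exponential distance decay, the 0.5 meter scale, and the equal weighting are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import exp, hypot
from typing import List, Tuple


@dataclass
class Candidate:
    """Hypothetical catalog item that might be the one a shopper picked up."""
    sku: str
    shelf_xy: Tuple[float, float]  # item location on the store map, in meters
    visual_score: float            # image-match score in [0, 1] (assumed input)


def positional_score(hand_xy, shelf_xy, scale_m=0.5):
    """Score in (0, 1] that decays as the hand gets farther from the shelf slot."""
    distance = hypot(hand_xy[0] - shelf_xy[0], hand_xy[1] - shelf_xy[1])
    return exp(-distance / scale_m)


def rank_candidates(hand_xy, candidates: List[Candidate], visual_weight=0.5):
    """Return (sku, combined score) pairs, best match first."""
    scored = []
    for c in candidates:
        pos = positional_score(hand_xy, c.shelf_xy)
        # Weighted blend of the two signals; the weighting is an assumption.
        combined = visual_weight * c.visual_score + (1 - visual_weight) * pos
        scored.append((c.sku, combined))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two visually similar products; the hand is next to the first one's shelf slot.
candidates = [
    Candidate("cola-330ml", (2.0, 1.0), visual_score=0.55),
    Candidate("cola-zero-330ml", (2.0, 3.0), visual_score=0.60),
]
ranking = rank_candidates(hand_xy=(2.0, 1.1), candidates=candidates)
best_sku = ranking[0][0]
```

In this made-up case the visual scores alone slightly favor the wrong product, and the positional signal resolves the ambiguity in favor of the item the shopper was actually standing next to.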
Trate: How can this computer vision technology benefit both consumers and retailers?
Woollard: The benefit for consumers is quite straightforward—nobody enjoys waiting in line. Self-checkout kiosks have been around for a while, but they have often been a topic of controversy because, in essence, they shifted the work onto shoppers themselves. So, the primary and most direct benefit to shoppers is the elimination of long lines.
However, there are several secondary advantages that both shoppers and retailers can gain from incorporating computer vision technologies in their stores. For instance, our underlying systems and analytics generate various insights for retailers, including identifying when shelves are out of stock. A better shopping experience is one where you find what you need when you enter the store. We like to think of this as improving the overall retail environment because a store that is efficiently stocked provides a superior shopping experience. It’s also advantageous for retailers since they don’t miss out on potential sales due to stockouts.
Ultimately, it’s these types of insights that contribute to the efficient operation of a store. Retailers can better utilize their existing workforce by addressing issues like stocking errors, misplaced items, and stockouts in a more targeted manner, rather than relying solely on manual inventory checks or end-of-day counts.
Trate: What are some of the challenges of retrofitting a retail store for autonomous checkout?
Woollard: There are a number of challenges in dealing with real-world environments. First, we have to account for how consumers will actually use the space, which is often overlooked when these environments are built in a lab setting. For example, taller customers may interfere with lighting and cameras built into the ceiling. Second, especially in the retail context, stores are complex from a supply chain and logistics perspective. When we began this project, we made certain assumptions about lead times for new product introductions and the signals we would receive from stores when they planned to make changes to their layouts and other aspects. What we came to realize is that even our most reliable partners, who were well-versed in logistics and operations, often didn’t have advance knowledge of changes within their own stores.
Additionally, we noticed that our technology, which focuses on identifying products through visual cues, was relatively unique in the supply chain landscape. Many other technologies used by retailers and the retailers themselves primarily focus on changes to the product’s barcode. Consequently, we are vulnerable to alterations in packaging, such as special holiday editions. For example, we have partners in Japan, and during the Tokyo Olympics, we encountered a unique challenge as everything from sports drinks to potato chips suddenly featured a “Tokyo 2020 edition.” This underscores the importance of closely monitoring visual product changes and packaging evolutions.
The industry as a whole has yet to fully recognize that downstream consumers of packaging changes require timely triggers and a better understanding of these developments. I believe that this awareness will grow over time.
Trate: Can you explain Standard Sim, and the use of synthetic data in Standard AI’s model?
Woollard: Standard Sim is a simulated dataset that we use to train our computer vision technology. A simulation environment is most valuable in situations where obtaining real-world data is not feasible. For instance, when a store exists only as a blueprint and hasn’t been constructed yet, a simulated environment becomes indispensable. It allows us to train our model without needing a physical replica of the space and products.
Furthermore, there are scenarios in which we seek to test our algorithms in new and diverse environments. In such cases, we rely on simulated data to expand our dataset. While simulated data isn’t our core product, it serves as a valuable tool.
During the development of Standard Sim, we recognized that others might find this dataset useful as well. We believe that as the ecosystem continues to evolve, this product can seamlessly integrate into discussions among companies and marketers regarding packaging changes. By understanding what printed changes might look like through simulations, we can create digital twins of products even before they are physically introduced into a retail environment. Retail simulation enhances our model training, enabling us to progress more rapidly than the supply chain can deliver physical goods.
Looking at the broader landscape, many companies are generating synthetic datasets of real-world environments, often referred to as “digital twins,” particularly when replicating physical spaces digitally. This technique and data type have diverse applications, ranging from autonomous vehicles and industrial scenarios to the retail sector in which we operate. Synthetic data is gaining prominence as a valuable alternative to real-world data collection in environments where that collection is challenging or impossible, for reasons such as privacy.