By Nigel Duffy, CTO at Sentient
Whether you’re looking online for clothes, shoes, art, real estate, or jewelry, it’s often difficult to find exactly what you’re looking for. We believe there are three reasons for this:
1. Many shoppers can’t express their visual desires in words.
2. Most search and discovery tools can’t accurately translate their visual catalog into words.
3. When it comes to visual content, shoppers often don’t know what they want until they see it.
We built Sentient Aware™ to address these problems for ecommerce. Essentially, Sentient Aware takes the form of a visual dialogue with the consumer. Every interaction in the dialogue is designed to elicit information about the user’s goals or intent. As in a game of “Animal, Vegetable or Mineral,” the answers to earlier questions influence the questions asked in later rounds. Once we infer that a user is looking for blue collared shirts, for example, the next most informative question might be about the user’s preferred cuff style.
In a traditional search interaction, users are required to understand the search engine, which means they often play a game of “query refinement” whereby they try to cajole the search engine into understanding their intent by repeatedly trying variants of the same query. By contrast, Sentient Aware takes responsibility for understanding the user and asks questions to iteratively improve its understanding of the user’s goals.
Of course, this is challenging for visual content. It can be difficult, even for humans, to phrase such questions in English or in any natural language. Users’ visual intent is often nuanced and extremely personal. So rather than use natural language, we use the images directly. Questions are framed as a choice, i.e., “Which of these items do you like most?” These questions are answered by tapping the most preferred item. This simple interaction avoids the need to reduce complex visual products and nuanced intent to language. It allows users to interact directly with the products, not with a user interface. The information content of these questions (when properly framed) can be extremely high, and a small number of such interactions enables a deep understanding of users’ intent.
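One way to make such a choice question carry a lot of information is to show a visually varied panel: whichever item the user taps then rules out large regions of the catalog. The sketch below is an illustration of that idea only, not Sentient's actual selection logic; the `diverse_panel` helper and its arguments are hypothetical, and it uses farthest-point sampling over product embedding vectors as one simple way to pick a dissimilar set.

```python
import numpy as np

def diverse_panel(embeddings, k, seed=0):
    """Pick k mutually dissimilar items via farthest-point sampling.

    A toy stand-in for framing a high-information "which do you like
    most?" question: the more varied the panel, the more a single tap
    tells us. `embeddings` is an (n, d) array of product vectors.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    chosen = [int(rng.integers(n))]          # arbitrary first item
    # Distance from every item to the nearest already-chosen item.
    dists = np.linalg.norm(embeddings - embeddings[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))          # farthest from current panel
        chosen.append(nxt)
        dists = np.minimum(
            dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen
```

Farthest-point sampling is just one plausible heuristic here; an information-theoretic criterion would pick the panel whose answer is expected to shrink the model's uncertainty the most.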
The Sentient Aware system has two key components. First, we train a deep convolutional neural network to embed products into a high-dimensional space. Applying this neural network to each catalog image yields a representation for the product. This representation captures highly granular but meaningful information about each product. When applied to shoes, for example, it captures key differentiating features such as heel height, calf length, the style of laces, the overall shape of the shoe, and the varying textures across the shoe. In addition, these representations capture many nuanced features that defy textual description.
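The practical payoff of such an embedding is that visual similarity becomes geometric proximity: products that look alike sit near one another in the space. As a minimal illustration (with toy vectors standing in for real CNN outputs, and a hypothetical `nearest_products` helper), a nearest-neighbor lookup by cosine similarity retrieves the catalog items closest to a query:

```python
import numpy as np

def nearest_products(query_vec, catalog, k=3):
    """Return indices of the k catalog items most similar to query_vec
    by cosine similarity.

    `catalog` is an (n, d) array of product vectors; in the real system
    each row would be a trained CNN's output for one catalog image —
    here any vectors serve as toy stand-ins.
    """
    catalog = np.asarray(catalog, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    q = q / np.linalg.norm(q)
    c = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    sims = c @ q                     # cosine similarity to each item
    return list(np.argsort(-sims)[:k])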
Next, we use an extremely efficient learning algorithm to rapidly learn users’ goals. This algorithm builds on the deep learned representation, rapidly identifying the aspects of that representation that are most relevant to the user’s search goals. It also identifies cases where it is uncertain and asks follow-up questions to refine its understanding. Thus, in a small number of taps or clicks, the system can acquire a deep understanding of the user’s preferences. In fact, an effective search often takes fewer taps than would be required to type “high heel” or “blue shirt.” Note the contrast with traditional applications of machine learning to “understanding” users: traditionally, a user’s entire search history is used to build a model of that user’s long-term preferences; more succinctly, such models aim to understand what the user might want tomorrow.
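A much-simplified sketch of this kind of interactive preference learning, assuming a linear utility model over the embedding (the post does not describe the actual algorithm, so every name and update rule below is hypothetical): the system asks about the pair of items whose predicted utilities are closest (the comparison it is least certain about), and nudges its estimate whenever the user's tap contradicts its current prediction.

```python
import numpy as np

def most_uncertain_pair(w, items):
    """Return the index pair whose predicted utilities are closest —
    the comparison the current model is least sure about."""
    u = items @ w
    best, gap = None, np.inf
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            g = abs(u[i] - u[j])
            if g < gap:
                best, gap = (i, j), g
    return best

def update(w, preferred, other, lr=1.0):
    """Perceptron-style pairwise update: nudge w so the tapped item
    scores higher than the item passed over."""
    if (preferred - other) @ w <= 0:     # model predicted wrongly
        w = w + lr * (preferred - other)
    return w
```

A real system would also avoid repeating comparisons it has already asked and would weight updates by confidence, but even this toy loop converges toward the user's current preference after a handful of taps.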
By contrast Sentient Aware uses machine learning to understand what the user wants right now.
To find out more about Sentient Aware visit: sentient.ai/aware