Adjust the attribute sliders to see which celebrities come out as the best match! You can get a random starting point with the “Randomize Sliders” button. Then try modifying some of the sliders to see how the matches change. Note that the match is calculated using Manhattan distance over the entire set of slider values, so the best match may disagree with some individual sliders. For instance, a person may not be bald even if you specified baldness, because the other attributes match very well. Sometimes you may also need to change several sliders before the result moves to the next nearest match among the 1550 faces in the dataset. If you do not immediately recognize who the match is, click on the “Bio” link!
What’s going on?
The Pseudo-Task Augmentation (PTA) method takes the findings of multitask learning as its starting point. Just as training on multiple related tasks helps training on each individual task, in PTA single-task training benefits from training with multiple output decoders. Each decoder projects the output of the earlier layers to the output layer in a different manner, so the gradients the decoders generate are different, much like the gradients from different tasks in multitask learning. Multiple decoders can be used in multitask training as well (i.e., several decoders for each task), and they provide further improvement, suggesting that the multitask and pseudo-task effects are complementary. In Paper 2, PTA was trained to predict the 40 facial attributes (i.e., 40 tasks) in the CelebA dataset. It achieved state-of-the-art results, reducing error from 8.00% to 7.94%, which shows that PTA can yield improvements even in strong current vision models.
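To make the multiple-decoder idea concrete, here is a minimal sketch in plain Python. It is not the implementation from the paper: it assumes a single binary attribute, a tiny shared `tanh` encoder, and linear logistic decoders, and every name in it (`pta_loss_and_grads`, `encoder`, `decoders`) is hypothetical. It only illustrates that each decoder sends a different gradient back into the shared features.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def pta_loss_and_grads(encoder, decoders, x, y):
    """Illustrative only: shared encoder, several decoders, one binary label.

    encoder  -- list of weight rows for the shared layer (assumed tanh units)
    decoders -- list of weight vectors, one per pseudo-task decoder
    x, y     -- one input vector and its 0/1 label
    Returns the mean loss over decoders and, for each decoder, the gradient
    of its loss with respect to the shared features h.
    """
    # Shared features computed once and reused by every decoder.
    h = [math.tanh(dot(row, x)) for row in encoder]
    losses, grads = [], []
    for w in decoders:
        p = sigmoid(dot(w, h))
        # Binary cross-entropy for this decoder's prediction.
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
        # For logistic loss, d(loss)/d(h) = (p - y) * w.
        grads.append([(p - y) * wi for wi in w])
    return sum(losses) / len(losses), grads


encoder = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
decoders = [[1.0, -1.0], [0.2, 0.7]]  # two decoders for the same task
loss, grads = pta_loss_and_grads(encoder, decoders, [1.0, 2.0, 3.0], 1)
print(loss, grads)
```

Both decoders see the same shared features and the same label, yet their gradients with respect to those features differ because their weights differ. In this simplified picture, that difference is what plays the role of the extra tasks in multitask learning.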
This demo allows you to explore the behavior of the PTA model by turning the task around. You can specify the value of each facial attribute using a set of sliders. To keep the demo focused, the sliders cover a subset of 14 of the 40 attributes. The demo dataset consists of Wikipedia images instead of CelebA, which also demonstrates the generalization ability of the model. The demo finds the faces in the dataset whose attribute values, precomputed by the PTA model, best match the slider values. Matching is thus based on a semantic understanding of the face images, and it is possible to make connections between faces that would otherwise be difficult to see. It therefore contrasts with many other face matching applications, most of which are based on learning arbitrary image features.
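The matching step can be sketched in a few lines of Python. This is an illustration rather than the demo's actual code: the face names, the three attributes, and the 0-to-1 value scale are all made up, and `best_matches` is a hypothetical helper. It ranks faces by Manhattan distance between the slider values and each face's precomputed attribute values.

```python
def manhattan(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_matches(sliders, faces, k=3):
    """Return the k face names closest to the slider values.

    faces maps a name to its precomputed attribute vector, in the same
    attribute order as sliders (an assumption of this sketch).
    """
    ranked = sorted(faces, key=lambda name: manhattan(sliders, faces[name]))
    return ranked[:k]


# Made-up attribute order: [bald, smiling, eyeglasses], values in 0..1.
faces = {
    "face_a": [0.9, 0.1, 0.8],
    "face_b": [0.1, 0.9, 0.7],
    "face_c": [0.2, 0.8, 0.1],
}
sliders = [1.0, 0.9, 0.7]  # asks for a bald, smiling face with eyeglasses

print(best_matches(sliders, faces, k=2))  # ['face_b', 'face_a']
```

Here the top match, `face_b`, is not bald even though the bald slider is at its maximum: its smiling and eyeglasses values match so closely that its total distance (0.9) is lower than that of the bald `face_a` (about 1.0).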