The second day of F8 focused on the long-term investments we’re making in AI and AR/VR. In the opening keynote, Chief Technology Officer Mike Schroepfer talked about the AI tools we’re using to address a range of challenges across our products — and why he’s optimistic about what comes next.
Schroepfer was followed by Manohar Paluri and Joaquin Quiñonero Candela from Facebook AI; Margaret Stewart from Product Design; and Lade Obamehinti, Lindsay Young, and Ronald Mallet from AR/VR.
AI powers a wide range of products at Facebook. In recent years, this has included our work to proactively detect content that violates our policies. To help us catch more of this problematic content, we’re working to make sure our AI systems can understand content with as little supervision as possible. And we’ve made important strides, but these are early efforts and there is still a long way to go. Advances in natural language processing (NLP) have helped us create a digital common language for translation, so we can catch harmful content across more languages. And a new approach to object recognition called Panoptic FPN has helped our AI-powered systems understand context from the backgrounds of photos. Training models that combine visual and audio signals further improves results.
Our work around natural language processing is important, but many techniques work best for the most common languages. We need a way to support the many languages where there aren’t enough samples to train on. Self-supervised learning can help as we’re able to train models for new languages without having humans label additional datasets for those new languages. This lets us better understand relevant content — including policy violations — without translating each sentence. These techniques help make sure all our classifiers are catching problematic content in more languages than we previously could.
AI is instrumental as we work to keep our platform safe — but we know it comes with risks. Namely, it can reflect and amplify bias. To address this, we’re building best practices for fairness — to ensure AI protects people and does not discriminate against them — into every step of product development.
When AI models are trained by humans on datasets involving people, there is an inherent representational risk. If the datasets contain limitations, flaws or other issues, the resulting models may perform differently for different people. To manage that risk, we developed a new process for inclusive AI. This process provides guidelines to help researchers and programmers design datasets, measure product performance, and test new systems through the lens of inclusivity. For vision, the dimensions we consider include skin tone, age and gender presentation; for voice, they include dialect, age and gender. The inclusive AI process is now in use across many product teams at Facebook and baked into the development of new features.
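As a rough illustration of what measuring product performance across such dimensions could look like, the sketch below computes accuracy separately for each group and reports the largest gap between groups. It is a minimal example under stated assumptions, not the inclusive AI process itself: the function names, the group labels, and the sample records are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, predicted, actual) records.

    Hypothetical helper: the record layout is an assumption made
    for this illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)


# Made-up evaluation results for a voice command model,
# grouped by a hypothetical dialect label.
records = [
    ("dialect_a", "yes", "yes"),
    ("dialect_a", "no", "no"),
    ("dialect_b", "yes", "no"),
    ("dialect_b", "no", "no"),
]

per_group = accuracy_by_group(records)
gap = largest_gap(per_group)
print(per_group, gap)
```

A large gap between groups is a signal that the training or evaluation data may under-represent some people and deserves a closer look.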
For a deeper dive on all of today’s AI updates and advancements, check out the posts on our AI blog.
One of the areas where we’re using the inclusive AI process is augmented reality (AR). Spark AR engineers use it to ensure their software delivers quality AR effects for everyone. For instance, some of the effects are triggered by a hand gesture, so the training data included various skin tones under a variety of lighting conditions to ensure the system would recognize a hand in front of the camera. Oculus engineers are also using this process for voice commands in virtual reality (VR), using representative data across dialects, ages and genders.
As we work to ensure our technology does not exclude people, we’re also working to make sure it helps bring people together. And with VR, we see a future where people can interact, and come together, regardless of physical distance. But to really achieve this, people need to feel completely present in VR. That means we need truly lifelike avatars, with gestures, facial expressions, and tone of voice that add nuance to our conversations.
We’ve shown groundbreaking realism in our Codec Avatars faces, which let people interact in real time in VR. But genuine communication requires the full body. That’s why we’re developing fully adaptive, physics-based models that reproduce a 3D avatar with data from a limited number of sensors. We’re using a layered approach that replicates human anatomy and can automatically adapt to match any individual’s appearance and unique motion. We design these models from the inside out, developing a virtual skeleton, then layering on the muscular structure, skin and clothing. The result is avatars that are realistic, right down to muscle movement and the draping of clothes. We still have a long way to go before this research results in a product, but we are encouraged by the results so far.
As with AR, we want to make VR inclusive and safe for everyone. We’ve built preventive systems, like a code of conduct for everyone who uses or builds for our headsets, that foster a respectful culture and respectful interactions. And we’ve built reactive systems, including tools for reporting or blocking users who are violating guidelines.
When we released our own social VR apps, including Spaces, Venues and Rooms, we incorporated safety into the core design of the experience. An orientation video introduces people to some of the features designed to make them feel more comfortable interacting with a large group of people while in VR. For instance, safety bubble is a feature that prevents people or objects from coming closer than you’d like. If one avatar enters another’s safety bubble, both avatars become invisible to each other. We also have live moderators on hand to help ensure good behavior and review reports of inappropriate behavior.
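The safety bubble rule described above can be sketched in a few lines. This is only an illustration of the stated behavior (if one avatar enters another’s bubble, both become invisible to each other), not the actual implementation: the Avatar class, the spherical bubble shape, the per-person radius, and the function names are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Avatar:
    name: str
    x: float
    y: float
    z: float
    bubble_radius: float  # hypothetical per-person setting, in meters


def distance(a, b):
    """Straight-line distance between two avatars."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


def mutually_hidden(a, b):
    """True if either avatar is inside the other's bubble.

    The rule is symmetric: entering one bubble hides both avatars
    from each other.
    """
    d = distance(a, b)
    return d < a.bubble_radius or d < b.bubble_radius


def visible_to(viewer, others):
    """Names of the avatars the viewer can currently see."""
    return [o.name for o in others if not mutually_hidden(viewer, o)]


# Made-up positions: bob stands inside alice's bubble, carol is far away.
alice = Avatar("alice", 0.0, 0.0, 0.0, bubble_radius=1.0)
bob = Avatar("bob", 0.5, 0.0, 0.0, bubble_radius=0.3)
carol = Avatar("carol", 3.0, 0.0, 0.0, bubble_radius=1.0)

print(visible_to(alice, [bob, carol]))
print(visible_to(bob, [alice, carol]))
```

Because the check is symmetric, bob cannot see alice even though alice is outside bob’s own, smaller bubble.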
Watch the full keynote here.
The tools and processes we shared today are all part of how we are preparing for what comes next. For us, this work is about bringing voice and opportunity to people all around the world and helping people stay connected to one another. To read more about yesterday’s announcements, read our Day 1 roundup. For more details on today’s news, see our Developer blog, AI blog, Engineering blog, Oculus blog, Instagram Press Center, and Newsroom blog. You can also watch all the F8 keynotes on the Facebook for Developers page.