Facebook wants machines to see the world through our eyes
Over the past two years, Facebook AI Research (FAIR) has worked with 13 universities around the world to assemble the largest first-person video dataset ever created, specifically to train image recognition models in deep learning. AIs trained on the dataset will be better able to control robots that interact with people or interpret images from smart glasses. “Machines will only be able to help us in our daily lives if they really understand the world through our eyes,” says Kristen Grauman of FAIR, who leads the project.
Such technology could assist people who need support around the house, or guide them through tasks they are learning to do. “The video in this dataset is much closer to how humans observe the world,” says Michael Ryoo, a computer vision researcher at Google Brain and Stony Brook University in New York, who is not involved in Ego4D.
But the potential abuses are clear and worrying. The research is funded by Facebook, a social media giant that was recently accused in the Senate of putting profits over people’s well-being, a sentiment corroborated by MIT Technology Review’s own surveys.
The business model of Facebook and other Big Tech companies is to extract as much data as possible from people’s online behavior and sell it to advertisers. The AI described in the project could extend that reach to people’s everyday offline behavior, revealing the objects around a person’s house, the activities they enjoyed, who they spent time with, and even where their gaze lingered: an unprecedented degree of personal information.
“There’s privacy work that needs to be done as you take that out of the frontier research world and make it into something that’s a product,” Grauman explains. “That work could even be inspired by this project.”
Ego4D is a radical change. The previous largest first-person video dataset consists of 100 hours of footage of people in the kitchen. The Ego4D dataset consists of 3,025 hours of video recorded by 855 people in 73 different locations in nine countries (US, UK, India, Japan, Italy, Singapore, Saudi Arabia, Colombia and Rwanda).
The participants were of different ages and backgrounds; some were recruited for their visually interesting professions, such as bakers, mechanics, carpenters, and landscapers.
Previous datasets typically consist of semi-scripted video clips only a few seconds long. For Ego4D, participants wore head-mounted cameras for up to 10 hours at a time and captured first-person video of unscripted daily activities, including walking along a street, reading, doing laundry, shopping, playing with pets, playing board games, and interacting with other people. Some footage also includes audio, data about where participants’ eyes were focused, and multiple perspectives on the same scene. It is the first dataset of its kind, Ryoo says.