The New Gig Economy: How Thousands of People Are Training Tesla's Optimus by Filming Chores
A new workforce is emerging to train the next generation of humanoid robots, and all it requires is a smartphone, a head-mounted camera, and a willingness to film yourself cooking, cleaning, and gardening. Companies developing robots like Tesla's Optimus are facing a critical bottleneck: they need vast amounts of real-world video data showing how humans actually perform everyday tasks. To solve this problem, startups have begun recruiting thousands of remote contractors worldwide to supply what's known as "egocentric data" or "human data," creating an entirely new type of gig work.
The scale of this effort is staggering. Micro1, a Palo Alto-based data collection company, has assembled approximately 4,000 "robotics generalists" across 71 countries who collectively submit more than 160,000 hours of video each month. Yet even this massive volume falls far short of what's needed. According to Arian Sadeghi, vice president of robotics data at Micro1, the industry requires "probably billions of hours" of footage to train general-purpose robots effectively.
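To put that gap in perspective, here is a rough back-of-the-envelope sketch. The contractor count and monthly volume come from the figures above. The one-billion-hour target is an assumption that takes the low end of "probably billions of hours," and the function names are purely illustrative.

```python
# Figures reported in the article.
CONTRACTORS = 4_000
HOURS_PER_MONTH = 160_000

# Assumption: one billion hours, the low end of "probably billions of hours".
TARGET_HOURS = 1_000_000_000


def hours_per_contractor(total_hours: float, contractors: int) -> float:
    """Average monthly hours submitted per contractor."""
    return total_hours / contractors


def years_to_reach(target_hours: float, hours_per_month: float) -> float:
    """Years needed to collect target_hours at a constant monthly rate."""
    return target_hours / hours_per_month / 12


per_contractor = hours_per_contractor(HOURS_PER_MONTH, CONTRACTORS)
years = years_to_reach(TARGET_HOURS, HOURS_PER_MONTH)

print(f"Average per contractor: {per_contractor:.0f} hours/month")
print(f"Years to reach one billion hours at this pace: {years:.0f}")
```

At a constant pace, this single workforce would need roughly 521 years to reach even one billion hours, which helps explain why the industry is scaling collection so aggressively. The estimate ignores growth in the contractor pool and any footage rejected in review.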
Why Can't Robots Learn Without All This Video Data?
Unlike large language models such as ChatGPT, which were trained on hundreds of billions of words freely available across the internet, robot developers lack a ready-made library of training material. Robots need highly specific, contextual data showing how humans interact with physical objects in real environments. A broomstick in an Indian kitchen looks different from one in an American kitchen, and robots must learn these regional variations to function effectively in different homes and workplaces.
The challenge is particularly acute because household environments are inherently unpredictable. Furniture moves, appliances vary, and humans constantly shift positions. "What's really missing is a human-like intuition of forces, friction, and uncertainty that people acquire throughout their lifetime," explained Rutav Shah, a robotics researcher at the University of Texas at Austin. "Making robots generally useful for everyday household tasks like cooking, cleaning, that is going to be the last mile of automation."
Recent breakthroughs in AI have made this data collection approach viable. Three years ago, the large language models that power ChatGPT gave rise to new algorithms capable of translating visual cues into physical actions. This breakthrough allowed robots to move beyond simple programmed repetition and begin perceiving and navigating the world around them.
How to Become a Robot Training Contractor
- Equipment Needed: Workers receive head-mounted camera gear, typically a GoPro, Meta glasses, or smartphone, to record first-person video footage of their daily tasks
- Task List: Contractors film themselves performing household chores including cooking, cleaning, gardening, pet care, and other mundane activities that robots will eventually need to replicate
- Time Commitment: Workers are expected to submit at least 10 hours of video each week, with the flexibility to work from home across 71 countries worldwide
- Compensation Range: Hourly wages vary significantly by region, from $5 in countries like Vietnam and India to as much as $20 in the United States, reflecting local labor costs and consumer purchasing power
- Data Quality: Only about 50 percent of submitted footage is usable after review, so contractors may need to film additional hours to meet acceptance standards
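The figures in the list above imply some simple arithmetic about effort and cost. The sketch below is illustrative only. It assumes the weekly minimum applies to accepted footage and that contractors are paid for every hour they film, neither of which the article confirms. The function names are hypothetical.

```python
def filming_hours_needed(accepted_hours: float, usable_fraction: float) -> float:
    """Hours to film so that accepted_hours survive review."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return accepted_hours / usable_fraction


def cost_per_usable_hour(hourly_wage: float, usable_fraction: float) -> float:
    """Wage cost per hour of usable footage, assuming all filmed hours are paid."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return hourly_wage / usable_fraction


USABLE_FRACTION = 0.5  # "about 50 percent" of footage is usable
WEEKLY_MINIMUM = 10    # hours of video per week

print(filming_hours_needed(WEEKLY_MINIMUM, USABLE_FRACTION))  # hours to film
print(cost_per_usable_hour(5, USABLE_FRACTION))   # low end of the wage range
print(cost_per_usable_hour(20, USABLE_FRACTION))  # high end of the wage range
```

Under these assumptions, a contractor would film about 20 hours to land 10 accepted hours, and each usable hour would cost a company between $10 and $40 in wages.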
The work itself is straightforward but requires attention to detail. Contractors film themselves performing tasks from a first-person perspective, and companies like Micro1 then annotate the videos so robots can learn to differentiate objects, distances, and physical movements. Arian Sadeghi noted that the company encourages contractors to experiment creatively: "If you think you want a robot to do this for you, go ahead and record it."
A Multibillion-Dollar Market Emerging From Household Chores
This new data collection industry is growing rapidly. Market research firms estimate that the data collection and labeling industry will expand at an average rate of 30 percent annually, driven primarily by growth in Asia, and is expected to reach at least $10 billion by 2030. The economics are compelling for robot makers: instead of purchasing expensive hardware or relying on less effective software simulations, companies can pay modest hourly wages and cover equipment costs to generate training data at scale.
Different regions are adopting different strategies. Tesla has been training its Optimus humanoid robot in its own facilities in Fremont, California, with plans to expand to Austin, Texas. Meanwhile, China has announced plans for at least 60 robot training centers across the country, supported by state investment in high-tech industries. Japan and South Korea have established data collection centers in Southeast Asia to capitalize on lower labor costs.
The United States and Europe, by contrast, tend to favor simulation training championed by Nvidia, which designs the world's most advanced computer chips. Yet even Nvidia has acknowledged the value of real-world video data. In a February report, the company found that incorporating more than 20,000 hours of first-person videos into robot training improved the success rate of tasks like rolling T-shirts, sorting playing cards, unscrewing bottle caps, and using a syringe by more than 50 percent.
"If you rely on just one way of data collection, it's probably not the best approach," said Marco Wang, a Shanghai-based analyst for Interact Analysis, a technology research firm. "In the future, it will be a mixture of different approaches."
Ravi Rajalingam, founder of the data annotation company Objectways, has observed that geographic location matters significantly for robot deployment. "The India kitchen is very different from the US kitchen. A broomstick in India is very different from a broomstick in US. So variety is important, but it depends where you are going to place your robots first," he explained. This geographic variation is why companies are collecting data from workers across the world, even though American data commands premium prices.
How Long Will This Gig Economy Last?
The boom in human data collection may not be permanent. Puneet Jindal, who co-founded the data annotation company Labellerr AI, believes that for the next three years, prioritizing human data is a "no-brainer" for robot developers. However, he cautioned that technological advances could eventually make this work obsolete. Improved simulation training or AI systems capable of converting YouTube videos into first-person perspective could become viable substitutes for human-collected data.
The uncertainty extends to the labs themselves. "Even robotics labs are feeling like they don't know what data will be needed 12 months from now," Jindal noted. This unpredictability means that workers entering this field should view it as a temporary opportunity rather than a long-term career path.
For now, however, the demand remains voracious. As Tesla Optimus and competing humanoid robots move closer to commercial deployment, the need for billions of hours of training video will only intensify. The people filming their daily chores from home are not just earning supplemental income; they are quite literally teaching the robots that may one day handle household tasks themselves.