You are where you e-mail: Global migration trends discovered in email data

U.S. emigration unveiled: analysing millions of emails made possible the first consistent estimate of emigration from the USA. The curves show those who sent most of their emails from the U.S. between September 2009 and June 2010 but consistently wrote the majority of their messages from abroad between July 2010 and June 2011. © MPI for Demographic Research

For the first time, comparable migration data is available for almost every country in the world. Until now, records were incompatible between nations, and breakdowns by gender and age were nonexistent. Emilio Zagheni from the Max Planck Institute for Demographic Research (MPIDR) in Rostock, Germany, has compiled a rich migration database from the global flow of millions of e-mails.

“Where estimates of demographic flows exist, they are often outdated and largely inconsistent,” says MPIDR researcher Emilio Zagheni. Official records are difficult to use for various reasons. Emigrants tend not to register after they move to a new country or do so very late. There is also no clear agreement between nations on how to actually define a migrant.

Official migration data is outdated and inconsistent

“Global internet data does not have these drawbacks,” says Zagheni. “You are where you email.” Together with Ingmar Weber from Yahoo! Research, he traced emails sent from Yahoo! accounts around the world to infer the residence of their senders. Every device that sends email can be located, at least at the country level, by its IP address, an internationally standardized code. Zagheni and Weber analysed the countries derived from IP addresses for a set of messages sent by 43 million anonymous Yahoo! account holders between September 2009 and June 2011.
In addition to the date and geographical origin of each message, they compiled the self-reported birthday and gender of the sender. When a person permanently started sending e-mail from a new location, it was assumed that he or she had changed residence. In this way they were able to calculate rates of migration from and to almost every country in the world. Only anonymized data was used, so identifying individuals was impossible, and no information about the recipients, subject, or content of a message was accessed. The findings have now been published in the ACM Web Science Conference Proceedings.
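The inference rule described above can be illustrated with a minimal Python sketch. The two observation periods are taken from the figure caption; everything else is an assumption made for illustration, including the function names `majority_country` and `infer_move`, the strict "more than half of the messages" threshold, and the input format of `(date, country)` pairs. It is not the authors' actual procedure.

```python
from collections import Counter
from datetime import date

# Observation periods as given in the figure caption.
PERIOD_1 = (date(2009, 9, 1), date(2010, 6, 30))
PERIOD_2 = (date(2010, 7, 1), date(2011, 6, 30))


def majority_country(messages, start, end):
    """Return the country that accounts for more than half of the
    messages sent in [start, end], or None if there is no such country.

    `messages` is a list of (sent_on, country_code) pairs (hypothetical format).
    """
    counts = Counter(
        country for sent_on, country in messages if start <= sent_on <= end
    )
    if not counts:
        return None
    country, n = counts.most_common(1)[0]
    # Assumption: "most of their emails" is read as a strict majority.
    return country if n > sum(counts.values()) / 2 else None


def infer_move(messages):
    """Return (origin, destination) if the majority country changes
    between the two periods, otherwise None."""
    origin = majority_country(messages, *PERIOD_1)
    destination = majority_country(messages, *PERIOD_2)
    if origin and destination and origin != destination:
        return (origin, destination)
    return None


# Illustrative, made-up message log for one anonymous account.
example = [
    (date(2009, 10, 1), "US"),
    (date(2010, 1, 5), "US"),
    (date(2010, 3, 1), "MX"),
    (date(2010, 8, 1), "DE"),
    (date(2011, 1, 1), "DE"),
]
move = infer_move(example)
```

Counting such inferred moves per origin country and dividing by the number of account holders observed there would give a raw, unadjusted emigration rate.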

The results are not only a proof of concept. They also reveal characteristics of international migration never seen before. For the USA, Zagheni and Weber were able to produce the first ever curve of emigration by age and sex. “In the U.S. many statistics are collected about people who move into the country, but there is no system that keeps track of people who move out,” says Emilio Zagheni.

The potential of the email statistics goes far beyond calculating gross country profiles. For instance, the researchers also looked into Mexico-US cross-border mobility. The data reveals how strongly the two countries are demographically integrated: most people who moved from Mexico to the United States either spent time in the USA before emigrating north, or went back to visit Mexico soon after moving to the United States. Those in their 30s have the highest rate of mobility across the Mexico-US border, while the least mobile are those aged 50 and older.

Only the tip of the iceberg

The strength of Zagheni’s and Weber’s migration data comes not only from the vast number of emails available, but also from a mathematical model they set up to adjust for a typical shortcoming of email statistics: those who send email are not representative of the entire population. Some groups, like the elderly, use email less or not at all and are thus underrepresented. The researchers calculated adjustment factors for such groups by gauging their email data against migration numbers from European countries, where official data is fairly reliable.
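The idea of calibrating against reliable official data can be sketched as follows. This is a deliberately simple ratio correction, assumed here for illustration only; the article does not describe the researchers' model in enough detail to reproduce it. The function names, age groups, and all numbers are hypothetical.

```python
def adjustment_factors(email_rates, official_rates):
    """Per-group ratio of official to email-based migration rate,
    computed for a calibration country with reliable official data."""
    return {
        group: official_rates[group] / rate
        for group, rate in email_rates.items()
        if rate > 0 and group in official_rates
    }


def adjust(email_rates, factors):
    """Apply calibration factors to email-based rates of another country.
    Groups without a factor are left unchanged (factor 1.0)."""
    return {
        group: rate * factors.get(group, 1.0)
        for group, rate in email_rates.items()
    }


# Made-up calibration data for one country with reliable statistics.
calibration_email = {"20-29": 0.02, "60+": 0.001}
calibration_official = {"20-29": 0.01, "60+": 0.002}
factors = adjustment_factors(calibration_email, calibration_official)

# Made-up email-based rates for a country without reliable statistics.
adjusted = adjust({"20-29": 0.04, "60+": 0.003}, factors)
```

In this toy example the factor for the oldest group is greater than one, which scales up the raw email-based rate for a group that is underrepresented among email users.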

“What we have addressed so far is only the tip of the iceberg,” says Emilio Zagheni. With further fine-tuning of the adjustment factors and by mining more digital data such as Twitter messages, more difficult questions could be tackled. For instance, one could track short- and long-term mobility patterns before and after a crisis like that of the Japanese Fukushima reactors. Unquestionably, digital records give demographers the chance to gain a more accurate picture of population dynamics in regions where so far they can only guess, says Zagheni. “This research has the most potential in developing countries, where the Internet spreads much faster than registration programs develop.”
Contact
Emilio Zagheni
Max Planck Institute for Demographic Research
Phone: +49 381 2081-227
Fax: +49 381 2081-527
Email: zagheni@demogr.mpg.de
Silvia Leek
Max Planck Institute for Demographic Research
Phone: +49 381 2081-143
Fax: +49 381 2081-443
Email: leek@demogr.mpg.de
Original publication
Emilio Zagheni, Ingmar Weber
You are where you E-mail: Using E-mail Data to Estimate International Migration Rates

ACM Web Science Conference Proceedings, June 25, 2012

