Researchers have developed an artificial intelligence tool that can analyze sequences of life events, such as health history, education, and employment, to predict outcomes ranging from a person’s personality to their mortality.
The tool is built on transformer models, the same kind that drive large language models like ChatGPT. Known as life2vec, it is trained on a data set drawn from the entire population of Denmark, about 6 million people. The Danish government provided the researchers with access to this exclusive data set.
life2vec can forecast future events, including the lifespan of individuals, with high accuracy. However, the research team emphasizes that the tool should serve as a foundation for future research rather than as a means of making predictions about real people. According to Tina Eliassi-Rad, professor of computer science at Northeastern University, “It is a prediction model based on a specific data set of a specific population.”
Eliassi-Rad, who brought her AI ethics expertise to the project, sees life2vec as a powerful tool that can provide valuable insights into society’s policies and regulations. By involving social scientists in the development process, the research team aims to ensure that this AI tool maintains a human-centered approach and doesn’t lose sight of the people behind the massive data set it has been trained on.
What makes the model unique and valuable is its potential to reflect the world as human beings experience it. Sune Lehmann, one of the authors of the paper recently published in Nature Computational Science, says the model offers a much more comprehensive view of human life than many other models. The research is also accompanied by a Research Briefing in the same journal issue.
At the core of life2vec is a massive data set maintained by Statistics Denmark, which contains a detailed registry of every Danish citizen. The data covers a wide range of life events, from health factors and education to income. The researchers turned this data into long sequences of recurring life events and used those sequences to train their model.
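To give a rough sense of what turning life events into a sequence could look like, here is a minimal Python sketch. It is not the researchers' code: the records, the field names (`when`, `kind`, `value`), and the helper functions are all made up for illustration, and the real registry data and encoding are far richer. The idea is simply to order one person's events by date and map each event to a token, much as words in a sentence are mapped to tokens for a language model.

```python
from datetime import date

# Hypothetical, made-up records for one person; not real registry data.
events = [
    {"when": date(2012, 9, 1), "kind": "education", "value": "bachelor_start"},
    {"when": date(2010, 3, 14), "kind": "health", "value": "gp_visit"},
    {"when": date(2015, 6, 30), "kind": "income", "value": "bracket_3"},
    {"when": date(2013, 1, 20), "kind": "health", "value": "gp_visit"},
]


def to_tokens(events):
    """Sort events chronologically and turn each one into a 'kind:value' token."""
    ordered = sorted(events, key=lambda e: e["when"])
    return [f'{e["kind"]}:{e["value"]}' for e in ordered]


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


tokens = to_tokens(events)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]  # integer sequence a transformer could consume

print(tokens)
print(ids)
```

Recurring events, such as the two doctor visits here, map to the same token id, which is what lets a sequence model pick up on repeated patterns across a life.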