
Machine learning helps find other Earths

Astronomers used an algorithm on stars with known exoplanets to identify 44 systems with potential Earth-like planets, 8 of which they considered highly probable.


Image Credit: "Artists's impression of an Earth-like planet" from Pho3niX is licensed under CC BY-SA 4.0

Astronomers want to find planets similar to Earth in size, composition, and temperature, known as Earth-like planets, but the search is difficult. Small, rocky planets are hard to detect because current planet-hunting methods are biased towards gas giants. Also, for a planet to have a temperature similar to Earth's, it has to orbit its host star at a distance comparable to Earth's distance from the Sun, which means it takes approximately a year to go around its star. This creates another problem for astronomers: searching for an Earth-like planet around just one star would involve dedicating a telescope to monitoring it constantly for more than a year.
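The link between orbital distance and orbital period follows from Kepler's third law. The sketch below is only an illustration of that relationship, not part of the study; the function name and its parameters are chosen here for the example, and it assumes the planet's mass is negligible compared with the star's.

```python
import math

def orbital_period_years(semi_major_axis_au: float, stellar_mass_solar: float = 1.0) -> float:
    """Kepler's third law in solar units: P^2 = a^3 / M.

    a is in astronomical units, M in solar masses, P in years.
    Assumes the planet's mass is negligible next to the star's.
    """
    return math.sqrt(semi_major_axis_au ** 3 / stellar_mass_solar)

# A planet at Earth's distance from a Sun-like star
print(orbital_period_years(1.0))

# The same orbital distance around a star with half the Sun's mass
print(orbital_period_years(1.0, stellar_mass_solar=0.5))
```

A planet at 1 astronomical unit from a Sun-like star takes one year per orbit, so confirming it by watching for repeated signals requires more than a year of observation.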

To save telescope time, scientists need a way to identify stars that are good candidates for thorough searches before dedicating resources to observing them. One team of astronomers investigated whether observable properties of planetary systems could indicate the presence of Earth-like planets. They found that the arrangement of known planets in the system, together with the mass, radius, and orbital distance of the planet closest to the star, could be used to predict whether an Earth-like planet is present.

Then, the team tested how well machine learning could handle this task. They started by creating a sample set of planetary systems with and without Earth-like planets. Astronomers have found only around 5,000 stars with an orbiting exoplanet, a sample too small to train machine learning programs. So, the team generated 3 sets of planetary systems using a computational framework that simulates how planets form, called the Bern model.

The Bern model starts with 20 clumps of dust that are 600 meters, or about 2,000 feet, across. These clumps kickstart the accumulation of gas and dust into full-sized planets, which form over 20 million years. Then the planetary system evolves over 10 billion years to an end state, called the synthetic planetary system, that the astronomers include in their dataset. They used this model to create 24,365 systems with stars the size of the Sun, 14,559 systems with stars ½ the size of the Sun, and 14,958 systems with stars ⅕ the size of the Sun. They also split each of these groups into 2 sub-groups: one with an Earth-like planet and one without.

With these larger datasets, the team then tested whether a machine learning technique called a Random Forest model could sort the planetary systems into those that likely had an Earth-like planet and those that did not. In the team's Random Forest, every output is either true or false, and different parts of the program, called trees, each make decisions based on a different subsection of the training dataset. The forest then combines the trees' decisions into a single answer. The team decided that if a planetary system likely had 1 or more Earth-like planets, then the Random Forest should consider that “true.” The researchers tested their algorithm for accuracy using a metric known as a precision score.
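The idea of many trees voting on a true-or-false answer can be sketched in a few lines. This is a toy illustration, not the team's model: each "tree" here is reduced to a single yes/no split (a decision stump) to keep the code short, the feature names are hypothetical, and the eight data rows and their labels are invented with no physical meaning.

```python
import random
from collections import Counter

def train_stump(rows, labels):
    """Find the single (feature, threshold, polarity) split with the fewest errors."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            for above_is_true in (True, False):
                errors = sum(
                    ((r[f] >= threshold) == above_is_true) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, above_is_true)
    return best[1:]  # drop the error count

def stump_predict(stump, row):
    f, threshold, above_is_true = stump
    return (row[f] >= threshold) == above_is_true

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    """Majority vote over all stumps: True means 'likely hosts an Earth-like planet'."""
    votes = Counter(stump_predict(s, row) for s in forest)
    return votes[True] > votes[False]

# Invented toy data: [innermost_planet_distance_au, n_detectable_planets]
rows = [[0.05, 1], [0.1, 2], [0.2, 3], [0.3, 4],
        [0.9, 1], [1.2, 2], [2.0, 3], [3.0, 1]]
labels = [True, True, True, True, False, False, False, False]

forest = train_forest(rows, labels)
print(forest_predict(forest, [0.3, 2]))
print(forest_predict(forest, [5.0, 1]))
```

Because each stump sees a different random resampling of the data, no single stump decides the outcome; the forest's answer is whichever label most stumps vote for.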

They set up the Random Forest to base its decision on specific factors in each synthetic planetary system. These factors included the arrangement of planets astronomers could feasibly find if they looked at a similar real-life system, how many of those planets were in the system, how many planets bigger than 100 times the Earth’s mass were in the system, and the size and orbital distance of the planet nearest the star. The team used 80% of the synthetic planetary systems as training data and reserved the remaining 20% for an initial test of the completed algorithm.
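An 80/20 split like the one described can be sketched as follows. The helper name, the fixed random seed, and the stand-in list of 1,000 system IDs are assumptions made for this example; the study's actual splitting procedure is not described here.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=42):
    """Shuffle a copy of the items, then hold out a fraction for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in IDs for synthetic planetary systems (not real data)
systems = list(range(1000))
train, test = train_test_split(systems)
print(len(train), len(test))  # 800 200
```

Shuffling before splitting keeps the held-out systems representative of the whole set, and the model never sees them during training, so its test score reflects how it handles unfamiliar systems.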

The team found that their Random Forest model predicted where an Earth-like planet likely existed with a precision score of 0.99, meaning that 99% of the systems it flagged as hosting an Earth-like planet actually had one. Following this success, they tested their model on real data for 1,567 stars in a similar size range that are known to have at least 1 planet orbiting them. Of these, 44 passed their algorithm’s threshold for having an Earth-like planet. The team suggested most of the systems in this subset would remain stable if an Earth-like planet were present.
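Precision is the fraction of systems flagged "true" that really are true. It does not count systems the model missed, which a separate metric called recall measures. The sketch below uses invented labels purely to show the arithmetic; the function names are chosen for this example.

```python
def precision_score(y_true, y_pred):
    """True positives divided by everything predicted positive."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall_score(y_true, y_pred):
    """True positives divided by everything actually positive."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0

# Invented labels: does each system really host an Earth-like planet?
y_true = [True, True, True, True, False, False, False, False]
# Invented predictions: the model flags three systems and misses one
y_pred = [True, True, True, False, False, False, False, False]

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
print(precision)  # 1.0
print(recall)     # 0.75
```

High precision suits the goal of saving telescope time: nearly every star the model recommends is worth observing, even if some good candidates go unflagged.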

The team concluded that their model could identify candidate stars for Earth-like planets, but with caveats. One was that their training data was still limited, as generating synthetic planetary systems takes a long time and is expensive. However, the bigger caveat was that they assumed the Bern model accurately simulated planetary formation. They suggested researchers rigorously test its validity for future theoretical work.

Study Information

Original study: Earth-like planet predictor: A machine learning approach

Study was published on: April 9, 2025

Study author(s): Jeanne Davoult, Romain Eltschinger, Yann Alibert

The study was done at: German Aerospace Center (Germany), Universität Bern (Switzerland)

The study was funded by: Swiss National Science Foundation

Raw data availability: Real planet data can be found at the Catalogue of Exoplanets

Featured image credit: "Artists's impression of an Earth-like planet" from Pho3niX is licensed under CC BY-SA 4.0

This summary was edited by: Nathan Gock