The question “Are we alone?” has fascinated humanity for centuries, and we have been looking for alien life in the Solar System since NASA’s Viking 2 mission to Mars in 1976. But how can we detect life if we fundamentally don’t know what life is? And how different would life elsewhere in the Universe be from the life that we do know?
There are many ways scientists are searching for ET life. These include listening for radio signals from advanced civilizations in deep space, looking for subtle differences in the atmospheric composition of planets around other stars, and directly trying to measure life in soil and ice samples collected by spacecraft in our own Solar System. This last category allows scientists to bring their most advanced chemical analytical instrumentation directly to bear on ET samples. They may even bring samples back to Earth, where they can be carefully scrutinized.
Since we still have no example of alien life, scientists are left with a conceptual paradox: did Earth-life make some arbitrary choices during evolution that got locked in early on, so that life could be constructed otherwise, or should we expect all life everywhere to be constrained to work exactly the way it does on Earth? How can we know whether the detection of a particular type of molecule indicates that it was produced by ET life? And how can we tell whether signals detected in ancient terrestrial samples come from the original living organisms preserved in those samples or from contamination by the contemporary organisms that presently pervade our planet?
A new study by a joint Japan/US-based team, led by researchers at the Earth-Life Science Institute at the Tokyo Institute of Technology, attempts to tackle these questions. They used “machine learning” to help determine if mixtures of organic molecules could be from life or just non-living chemical reactions.
Mass spectrometry (MS) is a principal technique that scientists will rely on in these spacecraft-based searches for ET life. Mass spectrometry can simultaneously measure multitudes of compounds present in samples, and thus a sort of “fingerprint” of the composition of the sample can be obtained. Nevertheless, interpreting those fingerprints may be tricky.
Using ultra-high-resolution mass spectrometry (a technique called Fourier-transform ion cyclotron resonance mass spectrometry, or FT-ICR MS, shown above), the team measured the “fingerprints” of a variety of complex organic mixtures. They compared known non-living mixtures of organic molecules prepared in the lab with 4.5-billion-year-old samples from meteorites, lab-grown microorganisms including some novel ones, and raw crude oil. Each of these samples contained thousands or millions of discrete molecular compounds, which provided a large set of MS spectra that could be compared and classified. The analysis of these samples “taught” the computer which characteristics to look for that might be associated with life and non-life.
Usually, the accuracy of individual molecule fingerprints matters a great deal, because scientists are trying to work out which molecule the mass spec read-out is showing. Instead of identifying molecules one by one, the researchers aggregated their data and looked for patterns using statistics. In this way, individual data peaks were blurred together to better resolve overall trends. They found that these complex organic mixtures presented very different “fingerprints.”
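To give a feel for what “blurring peaks together” can mean in practice, here is a minimal Python sketch. It is not the team’s actual procedure, which the article does not spell out. It assumes a spectrum is a list of (mass-to-charge, intensity) pairs and that aggregation means summing intensities inside fixed-width windows; the function name `bin_spectrum`, the 50-unit window, and the peak values are all made up for illustration.

```python
from collections import defaultdict


def bin_spectrum(peaks, bin_width=50.0):
    """Sum peak intensities into coarse m/z windows and normalise to fractions.

    peaks: iterable of (mz, intensity) pairs. This input format is an
    assumption made for the sketch, not something taken from the study.
    """
    totals = defaultdict(float)
    for mz, intensity in peaks:
        # Every peak falling inside the same window is merged into one value,
        # so the identity of individual molecules is deliberately lost.
        totals[int(mz // bin_width)] += intensity
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Express each window as a share of the total signal.
    return {window: value / grand_total for window, value in sorted(totals.items())}


# Invented peaks, for illustration only.
example_peaks = [(101.02, 5.0), (120.55, 15.0), (160.10, 20.0), (310.90, 10.0)]
profile = bin_spectrum(example_peaks)
```

The result is a short profile of how the signal is spread across mass ranges, which is the kind of coarse summary that can be compared between samples even when no single molecule has been identified.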
Next, the researchers fed these data into a machine learning algorithm, using a random assortment of samples for the initial “training” and many others for testing. They were surprised to find that the algorithms classified the samples as living or non-living with ~95% accuracy. Importantly, they did so after simplifying the raw data considerably, making it plausible that lower-precision (and therefore lower-wattage) instruments, such as those that would need to be used on spacecraft, could obtain data of sufficient resolution to enable the same classification accuracy.
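The train-then-test idea can be sketched in a few lines of Python. The article does not say which algorithm the team used, so this example swaps in one of the simplest possible classifiers, a nearest-centroid rule, and runs it on synthetic three-number “profiles” that are invented here and deliberately easy to separate. The labels, base profiles, noise level, and 30/10 split are all assumptions, and the accuracy it computes says nothing about the study’s real result.

```python
import math
import random


def make_sample(base, rng, noise=0.05):
    """Return a noisy copy of a base profile (synthetic data, not real spectra)."""
    return [x + rng.uniform(-noise, noise) for x in base]


def centroid(rows):
    """Average a list of equal-length profiles, column by column."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def classify(profile, centroids):
    """Assign the label whose training centroid is closest to the profile."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical base profiles for the two classes.
bases = {"biological": [0.6, 0.3, 0.1], "abiotic": [0.2, 0.3, 0.5]}
data = [(make_sample(base, rng), label)
        for label, base in bases.items() for _ in range(20)]

# Random assortment: 30 samples for training, 10 held back for testing.
rng.shuffle(data)
train, test = data[:30], data[30:]

# "Training" here just means averaging the profiles seen for each label.
centroids = {
    label: centroid([p for p, l in train if l == label])
    for label in bases
}

# Accuracy is measured only on samples the classifier never saw in training.
correct = sum(classify(p, centroids) == label for p, label in test)
accuracy = correct / len(test)
```

The point of holding back test samples is that the reported accuracy reflects how well the learned pattern generalizes to new samples, rather than how well the computer memorized the ones it was shown.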
Why does this work? The scientists on the team are actually not entirely sure, but they have a hypothesis. Living processes are parts of systems that regenerate themselves, while abiological processes have no internal process controlling this. So, molecules are produced differently by life and non-life. This approach is exciting because it opens up new ways of describing the chemistry of life. The team plans to follow up with further studies to understand exactly what aspects of this type of data analysis allow for such successful classification.