Automated hypothesis generation: an AI role in science

When I was getting my PhD in Ann Arbor during the 1980s, just staying up to date with the literature relevant to my own thesis project was a constant challenge. There was a paper magazine back then called Current Contents (CC). CC contained just that: the tables of contents for all of the relevant journals (in Life Sciences). It was a critical resource because there was no other way, even then, to keep tabs on the collective scientific output.

Keeping tabs was not just for general knowledge about the field, or even for properly giving credit to others. Rather, it was critical to hypothesis creation. Asking the right question (at the right time) is what determines scientific success in many cases. But you can’t ask the right question without knowing whether it has already been asked. And really, you can’t ask the right question without a full understanding of the current state of scientific knowledge.

At the time, it was the habit in many high-impact papers to make the last figure a cartoon schematic that represented the author’s view of where the field stood at the moment of the paper’s acceptance into the journal. In my field of molecular neuroscience, this was often a series of shapes and arrows representing key biomolecules and pathways. It was often amusing to go from one paper to the very next one that a particular group put out and see that some of the arrows had mysteriously reversed direction from the cartoon in the previous paper. This was presumably because the paper’s results, along with other results, had changed the thinking of the author.

In any case, that cartoon figure was always a clue to what the next hypothesis to be tested would be for a particular research group. So in a sense, you could predict the trajectory of scientific inquiry from that cartoon figure at the end of a paper.

That was the 1980s. Our scientific knowledge base has expanded exponentially since then. One of the current counterparts to Current Contents is called Faculty of 1000 (F1000). It’s online, of course. The idea is that leaders in the field curate the papers that you should read based on your profile. It’s a great idea, I guess, although with science being as competitive as it is, I doubt that the elect would hand over a paper’s brilliant and undiscovered insight to the unwashed if it really might supercharge some scientific inquiry. However, as a scientist, you have many other choices. Google Scholar comes to mind: it’s comprehensive, and I’m pretty sure it uses AI extensively to tailor its results. So it is machine-driven instead of human-driven (as in the case of F1000).

However, the cartoon figure at the end of papers has become pretty obsolete (although it does still make appearances). That’s because pretty much all of science, and certainly the life sciences, has become incredibly complex. In my field, you can’t make a cartoon big enough to represent all the relevant biomolecules and pathways, and the arrows have become incredibly intertwined because of the multiplicity of feedback loops and cross-talk links.

So not only is it difficult for the clever reader to glean the next hypothesis (even when there is a cartoon); it’s impossible for the author to do the same.

This has pushed much of science away from the Popperian paradigm of hypothesis testing and toward exploratory research. In such science, I might read the data stream from some set of sensors, correlate that data with some other external variable (like seasonality) and publish a correlation that is intriguing. Correlation, of course, is not causation; we all know that.
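
For readers who want to see what that kind of exploratory analysis looks like in practice, here is a minimal Python sketch. Everything in it is an assumption for illustration: the sensor readings are synthetic, the seasonal index is just a cosine with a 12-month period, and `pearson` is a hand-rolled helper, not anyone’s actual pipeline.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: 24 monthly readings from an imaginary sensor.
months = range(24)

# Assumed external variable: a simple seasonal index with a 12-month period.
seasonal_index = [math.cos(2 * math.pi * m / 12) for m in months]

# Synthetic readings: a baseline, a seasonal component, and a small
# deterministic wobble standing in for measurement noise.
sensor_readings = [
    10.0 + 3.0 * s + 0.5 * ((m * 7) % 5 - 2)
    for m, s in zip(months, seasonal_index)
]

r = pearson(sensor_readings, seasonal_index)
print(f"r = {r:.2f}")
```

A high r here says only that the readings track the season. It says nothing about what drives them, and that is exactly the gap a hypothesis-driven experiment is meant to close.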

And yet, science has the tools to do excellent hypothesis-based research. In neuroscience, optogenetic methods allow us to turn neural circuits on and off to understand their effects upon behavior. In molecular biology, CRISPR does the same for genetic circuits and networks.

The problem is not executing the research. It’s asking the right question. For biology, generating a hypothesis that is consistent with all of the current knowledge in a scientific discipline is challenging for human scientific superstars and downright impossible for your typical graduate student coming up with a thesis project. I believe that the same is true for any area of science where the volume of knowledge and relevant data has expanded exponentially.

But all is not lost. I think this is a perfect domain for AI as it exists today. Keeping tabs on many disparate but relevant data points and then coming up with a next move? That’s how AIs beat humans at chess right now. So… AI working with human scientists might be a very fruitful collaboration going forward. And it may yet save hypothesis-based research.

The importance of Big Data….

Michael White’s excellent piece is here. His view is that Big Data science is different from the Popperian hypothesis-based science we are all used to, and I tend to agree. I also worry that the term ‘Big Data’ is in serious danger of being oversold. That happened once upon a time to another hot new discipline: Artificial Intelligence… and the results were not pretty.

On the danger of being oversold: NeuroX

I’ll simply note this morning that today’s NYT op-ed piece on interpersonal neuroscience is the latest case of the NeuroX fad: applying a neuroscience framework to any and all social issues. This sort of journalistic conceit is becoming noxious, I think. Neuroscience as a field is still young, lacks a full theoretical underpinning and is simply not ready to be the explainer-in-chief for all human/animal social phenomena.

The last time a nascent discipline got overexposed like this was, I think, when Artificial Intelligence was oversold. It wasn’t good at all for that field.

Opportunities in Business Intelligence

I’ll be joining Ron Brachman (VP-Worldwide Research operations at Yahoo) and Ernst Volgenau (Chairman SRA International) to talk about the current applications, latest trends and business opportunities in business intelligence for marketing, risk management, fraud prevention and homeland security at Maggiano’s in Tysons Corner on September 12 from 7:30 to 9:45 a.m.

You can register here.

Jim

Washington DC "tis the season"

A strange aspect of living in Washington is how, come the holiday season, the weather actually turns winter-like (as if on cue) and how there seems to be an endless stream of holiday parties that pile up, one after another, like storm waves crashing on a beach. So it is this year, living alongside the Potomac River.

I gave a talk yesterday on artificial intelligence, a topic that I get asked about quite a bit, but one that I’ve always felt somewhat removed from, given my own work in neuroscience. And yet, the field of AI has made great strides since Marvin Minsky, although my take is that we’re not anywhere near the “strong AI” that was sold last century. One new way to think about AI, I suppose, may be to go beyond the Turing Test towards some deeper understanding of our own consciousness and what it means to be self-aware.

And so we return to the topic of studying consciousness and the mind, which is the raison d’être of the Krasnow Institute for Advanced Study.

Happy Holidays,
Jim