Source Extraction : Or, How To Hunt For Hydrogen

Optical astronomy has well-developed ways to turn its raw data into useful catalogues of sources : galaxies, stars, meteors, Elon Musk's bloody Starlink satellites, etc. But in radio astronomy the situation is not so advanced. There are automatic procedures we can use, but generally only as a supplement to the old-fashioned approach of just looking at the data. I like to call them "semi automatic", but that's because it's very hard to make a good joke about radio astronomy data processing techniques and that's the best I could come up with.

Sometimes people express skepticism about the visual approach, especially one referee who called it "convoluted". It isn't. Nor is it as arbitrary as one might think : as described on the cataloguing page, we can quantify how successful it is in a rigorous (albeit imperfect) way.

On these pages I describe how visual and automatic methods typically work, especially with a view to promoting visual extraction, which is neither as tedious nor as irrational as people sometimes think. For beginners, I also give an overview of what we mean when we assess our catalogues : how exactly do we quantify what we can't find ? And I describe the easily-misunderstood process of stacking, which can increase our sensitivity compared to the raw data... but with a high price that people (myself included) often underestimate.

Understanding Catalogues : What do we really mean by the "accuracy" of a catalogue ? What does "sensitivity" really mean ? These terms sound simple, but they're really not. In this page I attempt to explain some of the philosophical subtleties, as well as the biases that affect how we construct catalogues from real data.

Data Structure : Radio astronomy is awesome ! But what makes it so awesome ? After all, our radio cameras might typically only have the equivalent of 7 pixels... but each of those can record thousands of images at once ! On this page I explain the basics of the 3D data cubes our observations churn out and how we go about interpreting the data.

Visual Techniques : Once we've got a data cube, we need to catalogue what's in it. We can and should do this the old-fashioned way, by eye. This is not just a matter of arbitrarily deciding, "well I think that looks like a galaxy, so let's put that in a catalogue". Here I explain exactly how this is done and why I sometimes trust this method more than automated search algorithms.

Automatic Techniques : Not that we should avoid automated source extraction by any means ! I just think it's a mistake to rely on it exclusively. This page describes my own automatic algorithm, which I came up with in the small hours of a morning sometime while working at Cardiff University, and which seems to do a decent job.

Stacking Galaxies : Combining observations to increase sensitivity is a veritable minefield of easily-misunderstood subtleties. I won't claim to fully understand it myself, but on this page I attempt to address some of the more counter-intuitive pitfalls.

Stacking Diffuse Gas : If we're very careful, we can use stacking not just to look for the gas at the positions of known galaxies, but even to find the most extended, diffuse gas without any optical counterparts. Here I describe some of the techniques I've experimented with to try and do this, noting especially the ones that seemed like a good idea at the time but turned out to be utterly fruitless.
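
To give a flavour of why stacking is tempting in the first place, here is a minimal sketch in Python. It is purely illustrative and not the procedure described on the stacking pages : it assumes each spectrum contains nothing but independent Gaussian noise of the same level, and the names (`stack_spectra`, `n_spectra`, `n_channels`, `sigma`) are made up for the example. Under those idealised assumptions, averaging N spectra should reduce the noise by roughly a factor of the square root of N.

```python
import numpy as np

def stack_spectra(spectra):
    """Average a set of spectra (one per row) channel by channel."""
    return np.mean(spectra, axis=0)

# Assumed toy setup: 100 spectra of 2000 channels each, containing
# only independent Gaussian noise with an rms of 1 (arbitrary units).
rng = np.random.default_rng(42)
n_spectra, n_channels, sigma = 100, 2000, 1.0
spectra = rng.normal(0.0, sigma, size=(n_spectra, n_channels))

stacked = stack_spectra(spectra)

single_rms = spectra[0].std()   # noise level of one input spectrum
stacked_rms = stacked.std()     # noise level after stacking
expected_rms = sigma / np.sqrt(n_spectra)

print(f"rms of a single spectrum : {single_rms:.3f}")
print(f"rms of the stack         : {stacked_rms:.3f}")
print(f"sigma / sqrt(N)          : {expected_rms:.3f}")
```

Real data rarely behaves this nicely, which is rather the point of the stacking pages : this sketch only shows the idealised gain in sensitivity, not the price that comes with it.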