It is impossible to define rigorously what is definitely a real source and what definitely isn't. The only way to truly verify a detection is to observe it again and see whether the signal is recovered. Even so, we can quantify things well enough to be useful for algorithmic approaches.
In AGES, our automatic techniques rely mainly on GLADoS. We mostly use it as a supplementary technique to increase completeness, but it can also be used independently. As with other source-finding algorithms, its reliability never approaches 100% except for the brightest sources, so it by no means eliminates the need for human inspection. Despite a lot of effort to maximise its reliability under different conditions, it must still be set up quite carefully to work well.
GLADoS works by taking advantage of the multiple polarisations in which we observe the data. The HI signal itself isn't polarised, so a real signal should appear in both polarisations. Noise, however, is uncorrelated between polarisations, so if a spike is present in one polarisation but not the other, it's likely spurious. This doesn't help eliminate bright artificial signals (like airport radar and overhead satellites), but it does reduce the number of random noise variations that get mistaken for galaxies.
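The polarisation check can be illustrated with a minimal Python sketch. This is not the GLADoS code itself: the function name, the two threshold values, and the use of a plain average are all assumptions made for illustration. It takes two spectra, one per polarisation, with values already expressed in units of the noise, finds channels that are bright in the average, and keeps only those that are also visible in each polarisation separately.

```python
def check_polarisations(pol_a, pol_b, combined_threshold=4.0, single_threshold=2.0):
    """Split bright channels into confirmed and rejected candidates.

    pol_a, pol_b: sequences of brightness values in noise units, one per channel.
    Both thresholds are illustrative values, not the ones GLADoS uses.
    Returns (confirmed, rejected) as two lists of channel indices.
    """
    if len(pol_a) != len(pol_b):
        raise ValueError("both polarisations must have the same number of channels")

    confirmed = []
    rejected = []
    for channel, (a, b) in enumerate(zip(pol_a, pol_b)):
        # A candidate is any channel that is bright in the averaged spectrum.
        if (a + b) / 2.0 < combined_threshold:
            continue
        # An unpolarised signal should be visible in both polarisations.
        if a >= single_threshold and b >= single_threshold:
            confirmed.append(channel)
        else:
            rejected.append(channel)
    return confirmed, rejected


# Channel 1 is bright in both polarisations; channel 2 only in the first.
pol_a = [0.5, 5.0, 9.0, 0.2]
pol_b = [0.3, 4.5, -0.5, 0.1]
confirmed, rejected = check_polarisations(pol_a, pol_b)
print(confirmed, rejected)
```

In this toy example the spike in channel 2 passes the threshold in the averaged spectrum only because it is very strong in one polarisation, which is the kind of candidate the polarisation check is meant to discard.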
GLADoS operates as follows:
Running GLADoS takes anywhere from a few minutes to a few hours, depending on the data set. Although the procedure sounds complicated, it has been demonstrated to be highly effective. One of the key advantages is that we operate on cubes where the brightness has been converted from absolute flux to signal-to-noise ratio, which means only minor adjustments to the search parameters are needed from cube to cube. Since it operates only on source brightness and extent, it makes only the most minimal of assumptions about what constitutes a source.
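To see why working in signal-to-noise units makes the search parameters portable, consider the following Python sketch. It is an illustration under stated assumptions, not the conversion AGES actually uses: the function name is hypothetical, the spectrum is assumed to be baseline-subtracted, and the noise is estimated from the median absolute deviation (scaled by 1.4826 to approximate a Gaussian standard deviation), which is simply one common robust choice.

```python
from statistics import median


def to_signal_to_noise(spectrum):
    """Divide a baseline-subtracted spectrum by a robust estimate of its noise.

    The noise estimate (1.4826 * median absolute deviation) is an assumption
    made for this sketch; the source text does not say how the noise is measured.
    """
    centre = median(spectrum)
    mad = median([abs(value - centre) for value in spectrum])
    sigma = 1.4826 * mad
    if sigma == 0:
        raise ValueError("cannot estimate the noise from a constant spectrum")
    return [value / sigma for value in spectrum]


# The same spectrum recorded at two different flux scales.
faint = [1.0, -1.0, 2.0, -2.0, 0.0, 20.0]
bright = [10.0 * value for value in faint]

snr_faint = to_signal_to_noise(faint)
snr_bright = to_signal_to_noise(bright)
print(snr_faint[5], snr_bright[5])
```

Both versions give the same signal-to-noise values to within rounding, so a single threshold can be reused for cubes with different noise levels instead of being retuned in flux units each time.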