Sunday 26 November 2017

Don't believe the database!

I spent a little while investigating GBIF to follow up on Stuart Roberts' comment on the NFBR Facebook page that relatively few countries regularly upload to GBIF. To find some context for comparing UK data with northern Europe, I looked at Eristalis cryptarum, which in the UK is confined to three or four hectads (10 km squares) in southern Dartmoor. It was formerly present in the New Forest and at Studland, and there is a scattering of very old records from south-west England (Map 1).

Map 1. Distribution of Eristalis cryptarum according to data held and scrutinised by the Hoverfly Recording Scheme. Black circles = 2000 onwards; Grey = 1980 to 1999; Open = pre-1980.
Turning to GBIF, it is clear that E. cryptarum is widely distributed and largely Boreal or sub-Boreal across the Palaearctic (Map 2). I assume that the majority of records are correctly identified but of course there can be no certainty! Anybody working in another country might make the same assumption but would they be right? I'm afraid not! And where is the obvious problem? The UK, of course!
Map 2. Global distribution of Eristalis cryptarum according to GBIF as on 26 November 2017.
"What on earth does he mean?" I hear you ask. "After all, don't the UK records come from the Hoverfly Recording Scheme?" Unfortunately they don't; all sorts of data are put on GBIF, verified and unverified. So what is happening in the UK? A blow-up of the map (Map 3) tells the story very clearly.
Map 3. Distribution map for Eristalis cryptarum in the British Isles according to GBIF as on 26 November 2017.
I cannot comment on the Irish records; I am not aware of E. cryptarum in Ireland but would be happy to be corrected if there are genuine records. What I do know is that there are no records from Scotland, nor are there records from East Anglia. Both of the records mapped there are clearly erroneous, so where do they come from? I've not managed to work out where the East Anglian record comes from, but the Scottish one is amongst recent records compiled by Buglife! Where is the quality control before putting data onto the national and international forum? My guess is that both records are misinterpretations of colloquial names - both E. cryptarum and Sericomyia silentis share the same name: Bog Hoverfly.

This is of course an object lesson in why the rigid structure of Latin names exists - why try to usurp it with names that cause confusion? Meanwhile, beware outlying records that just don't look right! On which theme, I wonder about the record from the Pyrenees but would not dismiss it because there is the possibility that the right environmental conditions obtain at some altitude there. So, for UK distribution I would avoid any compilation of data from sources other than the HRS - we have problems but do normally manage to sort out the obvious glitches.

3 comments:

  1. NBN are our GBIF node. I note that both these erroneous UK records have come from NBN holdings. The Scottish one is from Millhall Bing, Stirling, from a few years ago. There is a recorder/determiner name attached to it and it comes from a Buglife dataset. I suspect the reason this is here is, as you say, because it shares a vernacular name with Sericomyia silentis.

  2. Not driven directly by records in the same way, but this is the distribution map from Fauna Europaea. It also includes Ireland and Spain. I'm not entirely clear how it is agreed, but it is interestingly incongruent with the GBIF records. https://fauna-eu.org/cdm_dataportal/taxon/12948891-a368-4fd8-b611-b16cbcb37369

  3. It is always possible that there is wider coverage, but I suspect that this is at a particular altitudinal band. BUT that could mean wide distribution at a country scale.
