Tuesday, 9 August 2016

Presenting records via Facebook

Use of the UK Hoverflies Facebook group as a way of getting photos checked has grown quite remarkably. It is a great way of facilitating interactions between observers and the Recording Scheme and is now generating a phenomenal number of records: there were over 4,000 gathered in July from web-based sources, with the majority coming from this group. Obviously such large numbers mean that data comes in all sorts of formats and there is quite a big job turning FB posts into a spreadsheet, so I thought some guidance on record/photo submission might be helpful.
My spreadsheet for web-based data has the following headings:
Column header
Species name
I record full identifications and partial identifications in two different spreadsheets. The first comprises those records that can be taken fully to species or to an aggregate name where there has been a split and the Resident Team refer to the aggregated name. For example, if we cannot put a firm name to a specimen that would once have been recorded as Xanthogramma pedissequum I record this as Xanthogramma pedissequum s.l.

If we can only get to Genus or Tribe, then these go on a separate sheet.
Date
This is arranged as dd/mm/yyyy, e.g. 31/07/2016; if we cannot get a full date then it is recorded as either 'July 2016 or '2016. The latter two options are far from ideal but can be used.
Grid reference
Ideally a four or six-figure OS grid reference. A lot of grid reference finders take the resolution to the 1 metre level i.e. 10-figure level which in most cases is probably false accuracy if we move around to get shots and records. Nevertheless, if that is what the recorder has used then it is logged.
Location name
If a location name has not been supplied then I look the site up on Streetmap and find a logical name if I can.
Recorder
This is the name of the person who makes the post. Where I know that people use various aliases I include these too (some people have up to four aliases).
Determiner
If a photo is published then I use my own name unless I feel it appropriate to defer to somebody else - either Ian Andrews or Joan Childs, or occasionally Gerard Pennards. Just occasionally I use the recorder's name when he/she has posted an awkwardly angled shot of an obvious species and almost certainly knows what they saw.

I run a separate spreadsheet for records that come in without a photo to back them up and in these cases the determiner is the name of the recorder unless there are good reasons to use another name.
Sex/stage
Essentially male/female/larva, but occasionally it is not possible to determine gender and in these cases the annotation is 'adult'. I use 'adult' for pairs in cop and make a note in the notes section that this was a pair in cop. Likewise, where there is a nuptial stack of Eristalis nemorum I note the number of adults in the abundance column.
I run separate lines for males and females, except where there are mating pairs or nuptial stacks.
Abundance
This is my best estimate of how many individuals of the reported gender were seen or photographed.
URL
This is the URL for the Facebook post or for the iSpot or Flickr post. If several posts depict animals from the same location on the same day then I add additional URLs separated by &.
Notes
Here I log observations by recorders and items such as flower visit records. I do not log the leaves an animal was on. Where a flower was visited I make clear that this was a potential pollination event by saying 'at x' or 'visiting y'. This column is also useful for behavioural notes. Where I am unsure of the plant associated with the hoverfly I usually put 'at ???', 'at ?Sweet Pea' or 'at umbellifer'.
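
For anyone who keeps their own records in a similar layout, the columns described above could be modelled roughly as follows. This is only a sketch: the field names, the WebRecord class and the example.org URLs are hypothetical, and the date helper assumes that the leading apostrophe in 'July 2016 is simply the spreadsheet trick for forcing a cell to be stored as text.

```python
from dataclasses import dataclass, field
from datetime import datetime


def date_precision(value: str) -> str:
    """Return 'day', 'month' or 'year' for the three date styles described."""
    # Assumption: a leading apostrophe only forces the cell to be text.
    text = value.strip().lstrip("'")
    # %B matches English month names in Python's default (C) locale.
    for fmt, label in (("%d/%m/%Y", "day"), ("%B %Y", "month"), ("%Y", "year")):
        try:
            datetime.strptime(text, fmt)
            return label
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")


@dataclass
class WebRecord:
    """One spreadsheet line; field names are illustrative, not the real headers."""
    species: str
    date: str
    grid_ref: str
    location: str
    recorder: str
    determiner: str
    sex_stage: str = "adult"
    abundance: int = 1
    urls: list[str] = field(default_factory=list)
    notes: str = ""

    def as_row(self) -> list[str]:
        """Return the cells in column order, joining several URLs with ' & '."""
        return [
            self.species, self.date, self.grid_ref, self.location,
            self.recorder, self.determiner, self.sex_stage,
            str(self.abundance), " & ".join(self.urls), self.notes,
        ]


record = WebRecord(
    species="Eristalis tenax", date="31/07/2016", grid_ref="TQ2868",
    location="Example site", recorder="A. Recorder", determiner="A. Determiner",
    sex_stage="male",
    urls=["https://example.org/post/1", "https://example.org/post/2"],
    notes="at hogweed",
)
print("\t".join(record.as_row()))
```

Joining the cells with tabs gives a line that can be pasted straight into a spreadsheet row.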

What is the best format for a post?

I usually copy and paste details from the FB post, so there are things that can be done to help.

A really nice example of presentation might be along the lines of:

I had an excellent day in the garden with lots of hoverflies. Here are my photos:
Episyrphus balteatus (2 photos, same specimen); Eristalis arbustorum at hogweed; Eristalis pertinax; Eristalis tenax (2 photos of separate animals); Platycheirus albimanus male and female, both visiting vipers bugloss
Not photographed:
Syritta pipiens 1 male; Volucella zonaria 1 female at buddleia
Observation details:
Queen Elizabeth II Reservoir, West Molesey

The above example is fine - I can cut and paste the basic site details straight into the spreadsheet. I usually then copy the species list into Word and run find and replace to turn it into a list with tabs to separate numbers and observations - the above would then look like this:

Episyrphus balteatus		2 photos, same specimen
Eristalis arbustorum		at hogweed
Eristalis pertinax
Eristalis tenax		2 photos of separate animals
Platycheirus albimanus	male	visiting vipers bugloss
Platycheirus albimanus	female	visiting vipers bugloss

I can then cut and paste each column into the spreadsheet.
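
The find-and-replace step could also be scripted. The sketch below is not how I actually do it (I use Word); it is a minimal illustration that assumes entries are separated by semicolons and that each begins with a full two-word scientific name. The function name species_list_to_rows is hypothetical.

```python
import re


def species_list_to_rows(text: str) -> list[tuple[str, str]]:
    """Split a semicolon-separated species list into (name, note) pairs."""
    rows = []
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        # Assumption: a capitalised genus followed by a lower-case species name.
        match = re.match(r"([A-Z][a-z]+ [a-z]+)\s*(.*)", item)
        if match is None:
            rows.append((item, ""))  # leave anything unexpected for manual checking
            continue
        name, note = match.groups()
        rows.append((name, note.strip(" ()")))  # drop surrounding brackets
    return rows


post = ("Episyrphus balteatus (2 photos, same specimen); "
        "Eristalis arbustorum at hogweed; Eristalis pertinax")
for name, note in species_list_to_rows(post):
    print(f"{name}\t{note}")
```

An entry such as 'male and female, both visiting vipers bugloss' would still arrive as a single note, so splitting it into separate male and female lines remains a manual job.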

Some hints on what not to do

Species names
It helps not to abbreviate the generic name. If it is abbreviated I simply have to type it out in full, so it is not a huge problem, but it does add to the job.

Grid references:
One of the biggest problems we have is grid references. Stuart has found that the advent of GPS has exacerbated rather than resolved this problem. Somewhere between 5 and 10% of all records have some sort of grid reference problem.
The worst I have to deal with are latitude/longitude data, which people submit in all sorts of permutations - they can take a fair while to sort out and to make sure that the correct OS reference has been determined.

The other regular problems are:

· people leaving off the 100km letters (e.g. TQ)

· people using lower case - I turn these into upper case to make sure that the spreadsheet is uniform and simple to read. It is a small niggle but does add to the time involved in creating a complete record.

· people separating the grid reference using commas, lines etc., e.g. TQ28/68 or TQ28,67 - I have to remove these. I regularly run a find and replace to clean all of these sorts of glitches from the data.

· people who place the location out to sea, either accidentally or deliberately. We do get the occasional marine record from oil rigs and ships, but these are usually noted in the post.
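
The first three problems are mechanical and could be caught with a few lines of code. The following is a rough sketch rather than the process actually used for the dataset; it assumes British National Grid references with two letters and an even number of digits, and the function names are hypothetical. It does not check that the letters are a valid 100km square, nor whether the point lies out to sea.

```python
import re


def clean_grid_ref(raw: str) -> str:
    """Upper-case a grid reference and strip spaces, commas, slashes and dashes."""
    ref = re.sub(r"[\s,/|\-]", "", raw).upper()
    # Two letters for the 100km square, then 2 to 10 digits in pairs.
    if not re.fullmatch(r"[A-Z]{2}(\d\d){1,5}", ref):
        raise ValueError(f"needs checking by hand: {raw!r}")
    return ref


def resolution_m(ref: str) -> int:
    """Return the side of the grid square in metres (1000 for a four-figure ref)."""
    digits = len(ref) - 2
    return 10 ** (5 - digits // 2)


for raw in ("tq28/68", "TQ28,67", "TQ 123 456"):
    ref = clean_grid_ref(raw)
    print(raw, "->", ref, f"({resolution_m(ref)} m)")
```

A reference with the letters left off, such as 2868, raises an error rather than being guessed at, since the 100km square cannot be recovered from the digits alone.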

I should also say that it does help to separate records from different sites onto different posts - I am a 'bear of small brain' and having a post with records from several sites does cause confusion and transcription errors.

What do the resulting data look like? 

Example section of photographic spreadsheet for 2016.



  1. Wow, so much effort, well done. A useful reference for anyone new to biological recording of any taxonomic group.

    Not sure about 'determiner' - I'd prefer 'verifier' or 'verified by'. The former suggests that the submitter had no idea in any case. The latter, which I think is used in iRecord, allows for correction within the verification process but can also imply 'confirmed by'. It would be interesting to know what the submitter suggested (if anything) and what the final result was so we would know what the confusion species for learners were but I am not for a moment suggesting expanding the spreadsheet and increasing the workload.

    There will be records that I will put straight into iRecord (via SEWBRECORD) when the ID is certain, e.g. Marmalade hoverfly. In which case, all the required fields should be present, except maybe flower details. That reduces my effort time, and hopefully Roger's as well. If I have posted on Facebook for identification or confirmation, I will also post to iRecord in most cases if a good ID is obtained and it is new for me for that site or square for that month of that year (i.e. I don't post the same regular visitor to my garden every day). But these records may not be added till the winter or when I have time to do so.

  2. The issue of 'determiner' is not as straightforward as you think, Paul. We get huge numbers of posts with a simple 'ID please' or 'ID wanted' message. Often we have to go back and chase for the associated data. In addition, there are other posts where some have a possible ID and others don't. So, suggesting that all the recorders have ID'd their shots is not correct. In addition, if the determiner is different on every occasion that becomes an administrative nightmare - using one name gives an administrative simplicity that is essential. If I have to make the process more complicated then I will simply pack it in and stop running the dataset.
    Bear in mind also that I might not always agree with dets already made - so I am not verifying but am making a different determination. I should also add that quite frequently Ian or Joan might say 'maybe x' or 'maybe y' - neither is a firm det. So, I take a decision and record what I believe it to be. Sometimes I might be wrong, but be that on my head. If I think it can be recorded as species x or y then I do so; if I am unsure it gets recorded at family, tribe or generic level as appropriate. As I also pointed out, I recognise that I am not always sure of a det and I do defer to others, especially to Joan on Xanthogramma and to Gerard Pennards on tricky species that I don't know well.
    Developing a block of data with a single determiner has its strengths as long as the determiner is reasonably reliable. These days, maybe 50% or more of the HRS data has been determined by me - which I hope strengthens the reliability of the data.
    Turning to iRecord - that is still a great deal slower than dealing with spreadsheets. Last year it took about 100 hours to deal with iRecord 'verification'. This year it will be a bit less I suspect, but by comparison I can 'verify' a spreadsheet of 1,000 records in about 15 minutes as opposed to doing the same on iRecord in maybe 10 hours. Seeing a spreadsheet is often a much better way of getting a fast overview of the ability of a recorder and their reliability. If one sees a list with some species such as Syrphus ribesii listed with both males and females, but no other Syrphus then one must start to get concerned - looks like the author is doing picture book science (possibly using Chinnery) - I have lots of tricks to work out who is actually reliable and who is not!
    The concept of 'verification' as used by iRecord is also highly questionable - I have had occasions when I have 'verified' a post lacking a photo because, on balance, the date and location look right, only to then see a photo posted on another site that is clearly not the species concerned, in a post extolling the strengths of iRecord for confirming identifications!