Wednesday, 2 August 2017

Making sense of data

Over the past two years I have taken a leaf out of John Bridges' book and have attempted to get out recording every day. It is a tough challenge, so I reckon John does pretty well getting out as often as he does. Some months I do better than others, but in July I managed to do something every day, partly because I was walking to the hospital every day for the first three weeks.

My routine has been to record everything I see, no matter how common it is. If I enter a new 1km square, or a new recording site unit, then a new list commences. This, I hope, is not dissimilar to some of the recording by Facebook group members such as Kevin Bandage. Thus, I have attempted to create a complete record for the month that might be used for comparison with the data emerging from the Facebook group. In strictly scientific terms the two approaches are sufficiently different that one cannot draw firm conclusions from the comparison, but the data do paint some important pictures. Thus, in Tables 1 & 2, I present my own data and the data extracted from the UK Hoverflies Facebook page, iSpot and Flickr for July.

Table 1. Records at generic level generated by photographic recording in July 2017. The data include a combination of full and partial records comprising a total of 5091 records.

Table 2. Records generated by RM in July 2017, comprising a total of 1758 lines of data.

I have included basic counts of numbers of species and gross numbers of records aggregated at generic level. The only point of departure between what I record and what is recorded from photographic data is that I don't record female Sphaerophoria. In common with the photographic dataset, I created separate lines for males and females, except where the numbers reached such proportions that it was not possible to count them.
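For anyone wanting to produce the same sort of summary from their own spreadsheet, here is a minimal sketch of how records might be aggregated to generic level and the species counted. It assumes each line of data is a (name, sex) pair and that a partial record is entered as a genus followed by "sp."; the records shown are made-up examples, not lines from either dataset, and the function names are my own invention.

```python
from collections import Counter

# Hypothetical example lines of data (name, sex); NOT taken from the real datasets.
records = [
    ("Episyrphus balteatus", "male"),
    ("Episyrphus balteatus", "female"),
    ("Cheilosia pagana", "female"),
    ("Cheilosia illustrata", "male"),
    ("Eumerus sp.", "female"),  # partial record: identified to genus only
]


def genus_of(name):
    """Return the genus, taken as the first word of the recorded name."""
    return name.split()[0]


def summarise(records):
    """Return (records per genus, number of distinct fully identified species)."""
    genus_counts = Counter(genus_of(name) for name, _sex in records)
    # Assumption: names ending in 'sp.' are partial records and are not
    # counted towards the species total.
    species = {name for name, _sex in records if not name.endswith("sp.")}
    return genus_counts, len(species)


genus_counts, n_species = summarise(records)
print(genus_counts.most_common())  # genera ordered by number of records
print(n_species)                   # number of species with a full ID
```

Keeping males and females as separate lines, as described above, means a species seen as both sexes contributes two records but only one species.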

It should be noted that a substantial number of Facebook members maintain their own spreadsheets that are submitted periodically. I have not attempted to do anything with these data as this is simply a very rough analysis. More detailed analysis is needed but will require a lot more work.

The results are pretty informative.
  • A total of 4231 full records and 860 partial records were generated by photographic recorders. My data yielded 1757 lines with a full ID and one partial ID (a female Eumerus). Bear in mind that most of the really assiduous recorders contributing to the HRS rarely generate more than 1,000 records in the course of a year. My efforts show what can be done, but they are based on a level of effort that cannot be sustained by most recorders and would not have been sustained by me without the enforced period of hospital visiting.
  • It is clear that using a large pool of recorders is an extremely effective way of securing records from a wide range of species, including a significant number of relatively uncommon animals that a single recorder, no matter how diligent, is unlikely to see on a regular basis. Thus the species list for the photographic dataset stands at 101 species, whereas my own list was considerably shorter (70 species).
  • The same obtains at generic level, with 50 genera reported by photographic recording as opposed to 35 by my own efforts.
  • Geographical coverage within the photographic dataset is country-wide, whereas my own data cover fewer than 5 hectads at locations in Northamptonshire and south London.
  • The ranked frequencies of the genera as represented in the two datasets (Table 3) are substantially different, as illustrated by the genus Cheilosia, which lies second in the ranking in my dataset but only eighth in the photographic dataset. Other genera that enjoy a more prominent role in my dataset include Paragus and Pipizella. All three of these genera are difficult or impossible to identify from photographs and yet are extremely abundant when recorded systematically.
  • The abundance of some genera such as Platycheirus in the photographic dataset suggests that there may be a weakness in my search techniques for these genera, although I am at a loss to understand why that may be so - not only do I make visual searches, I also sweep suitable vegetation wherever possible (hence the strong representation of Paragus in my data).
  • On this note, I suspect the answer to some of the differences in frequencies probably lies in regional variation in species' abundance. For example, I have been amazed by the numbers of Volucella inanis and V. zonaria in south London this year. Conversely, SE England is always very weak for Leucozona glaucia and L. laternaria; hence their poor showing in my data.
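The rank comparison discussed above could be put together along the following lines. This is only a rough sketch: the counts are illustrative placeholders rather than the real figures behind the tables, and the function names are hypothetical.

```python
def rank_genera(counts):
    """Rank genera by record count (1 = most records); ties broken alphabetically."""
    ordered = sorted(counts, key=lambda genus: (-counts[genus], genus))
    return {genus: position for position, genus in enumerate(ordered, start=1)}


def compare_ranks(counts_a, counts_b):
    """Return (genus, rank in A, rank in B) rows; None where a genus is absent."""
    ranks_a = rank_genera(counts_a)
    ranks_b = rank_genera(counts_b)
    return [
        (genus, ranks_a.get(genus), ranks_b.get(genus))
        for genus in sorted(set(ranks_a) | set(ranks_b))
    ]


# Illustrative placeholder counts only; NOT the real July 2017 figures.
own = {"Episyrphus": 300, "Cheilosia": 200, "Paragus": 90, "Platycheirus": 20}
photo = {"Episyrphus": 900, "Platycheirus": 400, "Volucella": 350, "Cheilosia": 100}

for genus, own_rank, photo_rank in compare_ranks(own, photo):
    print(genus, own_rank, photo_rank)
```

A genus that shows None in one column is simply missing from that dataset, which is itself worth noting for genera that cannot be identified from photographs.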

Table 3. Comparative positions of individual genera when organised in rank order within the two datasets.

Thus, what can we say about the data? Well, both systems have their strengths and weaknesses. I suspect that what we really need is a network of recorders who adopt similar techniques to those I employ if we are to establish the sort of contextual data we need to make full use of the photographic dataset and to understand the trends that might be conveyed in both datasets.
