Friday, 2 December 2016

A new paradigm in parataxonomy?




Parataxonomy is an established approach to some aspects of biological recording. Its most high-profile proponent was Dan Janzen, who employed local people in the rainforest of Costa Rica to collect and sort biological specimens after a considerable amount of training (1,000 hours). The intention was that Costa Rican parataxonomists would cover a wide range of organisms and the level of training was considerable. It was probably far more than any modern graduate would have attained! I suspect that not all subsequent use of parataxonomists has invested quite as well in their training, and therefore the results will have been different. Nevertheless, parataxonomy remains a useful, if constrained, tool in developing an understanding of the diversity of plant and animal life.

For a long while, I have felt that at least some of what happens in UK biological recording is, in effect, parataxonomy. There are relatively few professional taxonomists, and only a small number of non-vocational specialists who would naturally fall into the class of ‘taxonomist’. There are a good many more (me included), who are sufficiently competent to arrive at reliable determinations within certain taxonomic areas. A few exceptional individuals may also spot the splits that need to be made, but many will simply follow the guidance of others. Are we taxonomists or parataxonomists?

There then follows a longer tail of people who take an active interest in some aspects of plant and animal life. Some may attain respectable competency in one or several disciplines, but choose not to take their interests any further. Here, the specialist may vet their records and query a small number, but in general their records go straight into the system.

Finally, there is a further cohort that is gaining skills or whose interests are peripheral to the discipline. In today's world that must include the growing army of people who, armed with a camera, record the wildlife of Britain and post their finds on iSpot, iRecord and a wealth of Facebook pages where a name may be provided by a specialist. Bearing in mind the general agreement in the literature that parataxonomy is a good thing (within agreed parameters), it seems to me that we must embrace this new paradigm.

It is very clear from the UK Hoverflies Facebook page that many latent parataxonomists are emerging as they acquire experience and skills. Thus, I would liken the Facebook pages to the training that was given to Costa Rican parataxonomists by Dan Janzen. Being open-source learning, everybody progresses at their own pace, but in the course of a couple of years the most active participants will have acquired considerable skills. Quite a few of the early recruits are starting to spread their wings and are now helping with IDs. That is a very positive move forward. Bearing in mind the investment in training made in Costa Rica, one must wonder how long it will be before there is a self-sustaining training element on the various Facebook pages.

Thursday, 1 December 2016

Where do recording schemes fit into the Biological Recording map?

Over the last couple of weeks various posts on the NFBR Facebook page have got me thinking about the way biological recording is going and whether this really fits with the roles of recording schemes. Where do the schemes fit in today's developing system? I put together a simplified flow diagram to try to tease out this issue (figure 1). In this structure, recording schemes start to look a bit misplaced - there are at least two other ways of doing at least some of what the schemes are doing: the automated system of iRecord and the LERCs, which I have classed as data compilers. The very disparate nature of recording schemes means that whilst some, such as the HRS, might be regarded as data compilers, others are simple data-gathering exercises that still rely on another body to put their work into the public domain.
Simplified data flow chart (revised 02 December 2016)

As a follow-up, I have tried to look in detail at what the Hoverfly Recording Scheme does and how this fits into the data flow model (Figure 2). What struck me about this analysis is the sheer volume of activities that are involved in running a recording scheme!



Saturday, 26 November 2016

Upping our game in biological recording


As a follow on to my last post, I note that there was a further post in the NFBR thread that started my thoughts. This new post suggested that there should be engagement between NFBR/NBN and Defra and its agencies to determine how biological recording could 'up its game' in order to attract more funding from them. That has me very worried because I fear that critical evidence is being overlooked.

A little while ago, there was a major outcry because Natural England had decided to cut short its commitment to fund LRCs and would only be making funds available to increase the centralisation of data that is completely freely available. In the meantime, we have seen a vast swathe of natural history curators at regional and national museums made redundant. Funding for long-established monitoring schemes such as the Rothamsted Insect Survey and the National Moth Recording Scheme is also diminishing.

In total, this paints a very clear picture: Government is not committed to supporting biological recording, even if there is genuine good will within Defra and the Country Agencies. Instead, there is an increasing belief that Government data requirements can be provided from volunteers and that the infrastructure that facilitates that voluntary effort is un-necessary. I'm afraid the increasing publicity surrounding 'Citizen Science' generates the idea that there is an untapped wealth of technical capacity to meet data needs. I don't see an untapped source; rather, I see a highly active network of volunteers who are already giving very freely and in places are stretched to the limit.

I would therefore be very wary of talking about 'upping our game'. That suggests that we are not doing enough and that Government is right to think that technical capacity can be replaced by volunteers. I'm not sure it is a fair reflection of what is going on at the moment. The UK has perhaps the finest network of biological recording in the World. True, the Dutch, Germans and Scandinavians do a pretty fantastic job too, but I firmly believe that the UK is in the vanguard. 'Upping our game' can only be translated into – how do we get a quart out of a pint pot? Or, to use modern parlance – how do we achieve efficiency gains and improve productivity?

When it comes to biological recording it is pretty clear where the bottlenecks are:
  • Weaknesses in the infrastructure of professional appointments where technical skills are learned and honed.
  • An increasing demand for data and for data verification using an existing small cohort of specialists.
  • Increasing need for administrative capacity within recording schemes and societies. For example, the majority of specialist societies struggle to recruit key posts such as the secretary or the treasurer.
  • Shortages in people willing to step up to the role of County Recorder when the incumbent steps down.

So, how do we resolve these problems? Well, it is effectively incumbent on the existing organisers, shakers and movers to put in extra effort to try to generate our replacements. And, perhaps, we also need more people to step up to the jobs. Join the local natural history society or Dipterists Forum, BWARS etc, and then take on some of the administrative jobs that make things happen. Or, perhaps, take on the job of local field meetings secretary, County Recorder etc.

I think there are also some very simple things that could be done to assist that long-term investment:

  • A central insurance system for people who are prepared to run field meetings. I had considered setting up a local 'Active Naturalists' group but then I would need insurance to do this.
  • A simple grant system for capital outlay that might assist training programmes – something akin to OPAL.
  • A mechanism to support the publication of new keys and field guides. True, we do have the AIDGAP series, but even this will probably struggle to fund expensive colour productions.
These are just a few thoughts - I am sure there could be more. But, for me, I think we need to be a lot more vocal about what we are already doing and what Government and its agencies are already getting for very limited investment. I think we also need to be a great deal more vocal about what we need to deliver existing aspirations, rather than asking to be hit with more demands.

Increasing biological recording - why and how?


There is an interesting thread on the NFBR Facebook page concerning funding for biological recording. https://www.facebook.com/groups/NatForumBioRecording/permalink/789891461152258/

Rather than place a response on that thread, I thought it might be more helpful to do a longer think-piece that can be easily accessed in the future.

Having run a recording scheme that gets no funding and self-funds (Stuart Ball and I cover costs and subsidise training events), I would be very wary of making any formal ties with funding streams. The more money you take from Government, the more indebted you are to it and as a consequence the more it feels it can dictate what schemes do. Target-setting is inevitable and then we simply become an unpaid arm of Government.

The direction of biological recording


I am decidedly uncomfortable with the way biological recording is going. There is now a huge administrative infrastructure and a comparatively large number of professionals running around looking at ways of increasing biological recording and making use of those records. That is all well and good, but in many ways it has reached the point where organisers of the bigger schemes have to start to spend their time as administrators of recording and not as specialists whose passion was their scheme.

Matt Smaith makes a very good point about scale. It is fine for societies with tens of thousands of members and a paid workforce to run projects that depend upon co-ordinators and administrators, but the further one goes down the tree (or should I say into the furthest branches), the more schemes are dependent upon a very small core of people. BWARS straddles the divide because it is a subscription society. The HRS is of analogous size in terms of the volumes of records it generates, but is totally voluntary. Would we want to become a formal society? NO – it is a nightmare recruiting the necessary officers and just adds to administration. So, I for one will vote to keep independence and to minimise the volumes of paper we churn out.

How, therefore, can we increase biological recording?


I suppose the first question must be: do we need or want to increase recording? My answer is yes we do, because society is becoming increasingly sceptical of scientific analysis. We only need to see the way climate science has been vilified – it is probably believed by less than 50% of the populace and yet the scientific community is thoroughly convinced. Changes in population size and distribution of animals and plants ought to be a fundamental concern to society because they are indicators of the health of Earth's regulatory system. But, the messages they convey are likely to be unwelcome and will be challenged all of the way. So, the data have to be as robust as possible. That means that we need more and better data.

Web-based products to improve recording skills are certainly one answer, but there are places where this is not viable. Likewise, web-based feedback is another. We have seen the phenomenal impact of Facebook; not only on the HRS but also on many other Recording Schemes. Digital media have an obvious place in the mix, but I don't think they are the total answer. What is definitely needed in many areas of species identification is new keys that fit the modern requirement for high levels of illustration and simplification of difficult concepts.

Production of new keys and field guides is an obvious area where a formal society is needed. In the case of Diptera, where would we have been without BENHS who were there and able to attract the grant-aid that made Alan Stubbs' first edition of British Hoverflies possible? The combination of British Hoverflies and British Soldierflies was arguably the trigger for much of modern interest in Diptera as a popular subject rather than a fringe specialism.

So, I wonder if the answer to improving biological recording is actually to find ways of supporting the publication of new generation keys? Stuart and I are working on a new guide to Diptera, but as yet we don't have a publisher. We hope that somebody will come forward, but if they don't then the project will languish in the wings.

In this respect, one has to think back to OPAL and the small-scale funding that it provided. Dipterists Forum and the HRS benefitted greatly from this funding stream: we bought a set of 13 microscopes and a trinocular microscope and camera that have been used on innumerable occasions to run courses. The scale of its impact is demonstrated by the range of venues that we have visited using this equipment: from Lerwick and Kirkwall, to Glasgow, Gateshead, Bangor (N. Wales), Exeter, Studland, Norwich, Cardiff & Bristol. The list is considerably longer than this and I guess we have provided training for around 500 people. It does not stop there, because various other DF members use the equipment to run their own courses.

Quite how many serious Dipterists have been generated as a result is difficult to say, but even if we have only generated two per year, that adds up to a considerable number after the 8 to 10 years' effort that we have made. Looking back, I can think of the Chair/Chair-designate that have become the backbone of Dipterists Forum because of the introductory courses run by the Forum in the past 23 years. It does work, but it is a very slow process!

My feeling is therefore that we need to look at the critical infrastructure that will be used by volunteers. What we need to do is to develop the next generation and imbue them with a similar ethos to ours so that when we decide to retire (or pop our clogs) there is somebody who will take over the baton and do the same again; thus forming a virtuous circle.

Wednesday, 23 November 2016

Verifying iRecord



I have finally completed the iRecord verification process for data arriving between April and November 2016. There are about 220 records remaining to be verified once there have been changes to the species dictionary.

Last year I tried to make sense of the data but failed to separate out the records supported by photographs from those that were not. This year I learned from my mistake and ran separate logs so it is now possible to see what is going on! Table 1 presents the basic data on the time spent, numbers of records verified and the types of problems encountered. The significant figure is that around 10% of the photographic records have required some level of adjustment, with about 6% outright misidentification. The biggest howlers were a sawfly logged as Platycheirus granditarsis and what appeared to be a water trap full of thrips attached to another record that seemed to be OK. There was also a ladybird posted as Volucella zonaria!

Table 1. iRecord basic data


As a first stage in analysis, I have split the tabulation of mistakes into a series of units, mostly around Tribe, but in places simply a lumping for convenience! The ID given by the recorder is in the left-hand column and the correct ID is in the top row.

Table 2. iRecord corrections for Bacchini for data updated in 2016

Table 3. iRecord corrections for Cheilosini

Table 4. iRecord corrections for Eristalini

Table 5. iRecord corrections for Merodontini, Chrysogastrini, Paragini and Pipizini

Table 6. iRecord corrections for Syrphini

Table 7. iRecord corrections for Volucellini

Table 8. iRecord corrections for Xylotini
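
The layout used in these tables (the recorder's ID down the left-hand side, the corrected ID across the top) can be sketched in a few lines of Python. This is only a minimal illustration of how such a cross-tabulation might be built. The function name `corrections_table` and the placeholder taxon names are my own assumptions and are not drawn from the iRecord export or from the figures in the tables.

```python
from collections import Counter

def corrections_table(pairs):
    """Cross-tabulate (submitted ID, corrected ID) pairs.

    Returns a list of rows: a header row of corrected IDs, then one row
    per submitted ID holding the number of records moved to each column.
    """
    pairs = list(pairs)
    counts = Counter(pairs)
    submitted = sorted({s for s, _ in pairs})
    corrected = sorted({c for _, c in pairs})
    rows = [["submitted \\ corrected"] + corrected]
    for s in submitted:
        # Counter returns 0 for combinations that never occurred.
        rows.append([s] + [counts[(s, c)] for c in corrected])
    return rows

# Hypothetical example pairs (placeholders, not real verification data).
example_pairs = [
    ("Taxon A", "Taxon B"),
    ("Taxon A", "Taxon B"),
    ("Taxon A", "Taxon C"),
    ("Taxon D", "Taxon C"),
]

for row in corrections_table(example_pairs):
    print("\t".join(str(cell) for cell in row))
```

Each row total then shows how often a given submitted name had to be changed, and each column total shows which species were most often hiding under another name.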


These data show two discrete sets of problems:

  • basic mis-identifications, which make up about half of the problems; and
  • over-confidence in attributing names where it really is not possible to take photographs to species.

Relatively little can be made of the data that are not supported by photographs as all one can do is to use a bit of inspired judgement! A few interesting problems were encountered, such as records from Northern Ireland by relative novices that were unlikely. A record of Rhingia rostrata in particular exercised me as it was some distance further north than its northern limits in England. This coupled with my belief that R. rostrata does not occur in Ireland led to that species' exclusion. Another was a record of Eristalis cryptarum that later turned out to be Sericomyia silentis. Apparently the contrived name 'bog hoverfly' applies to both species, so we must watch for more of these glitches.

I will perhaps do a little bit more analysis in due course, but the figures tell their own story!

Saturday, 19 November 2016

Gender representation in hoverfly records

In theory, the numbers of males and females in a population are relatively evenly matched; or are they? I've seen nothing to suggest that the numbers of hoverfly larvae that reach adulthood are in any way skewed one way or the other. But, in the course of fieldwork, one sometimes gets the feeling that one is only seeing one gender. I therefore wondered which hoverfly species were represented by disproportionately large numbers of males or females in the photographic records. Obviously a few, such as Sphaerophoria scripta and Syrphus ribesii, will only be represented by males or females respectively because we can only put names to one gender. I suspect that in a few genera, such as Melanostoma, the data are skewed because we have more difficulty making a firm diagnosis where photos are from awkward angles. But there are plenty of examples where a firm diagnosis is not dictated by gender or the angle for either male or female characters.

From a biological perspective, if one gender is disproportionately represented, then it seems probable that the other is not seen because it is doing something that keeps it away from the camera's prying eye. To test this idea I ran a pivot table and then extracted the data for the 50 most frequently seen species in 2016. The results were interesting (figure 1), not least because they helped me detect a number of transcription errors!

To simplify the statistics, I have worked on the basis that a 60/40 split in the genders is indicative of some sort of behavioural separation, such that we don't see one or other gender. What emerges is that in 70% (35) of the examples evaluated (disregarding Syrphus ribesii and Eupeodes luniger), males are outnumbered by females in the dataset and that in 40% of cases they are outnumbered by at least 60:40. Conversely, in just 14% of cases (7), males outnumber females by the same ratio (disregarding Sphaerophoria scripta).
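
For anybody who wants to try the same test on their own records, the 60/40 rule can be sketched as below. This is a rough illustration rather than the pivot table I actually used. The function names, the `threshold_pct` parameter and the example counts are all assumptions made for the sketch, and the counts are invented rather than taken from the HRS dataset.

```python
from collections import Counter

def classify_split(males, females, threshold_pct=60):
    """Label a species by which sex, if either, reaches threshold_pct % of its records."""
    total = males + females
    if total == 0:
        return "no data"
    # Integer arithmetic avoids floating-point surprises exactly at the threshold.
    if males * 100 >= threshold_pct * total:
        return "male-dominated"
    if females * 100 >= threshold_pct * total:
        return "female-dominated"
    return "balanced"

def summarise(counts, threshold_pct=60):
    """Count how many species fall into each class.

    counts maps a species name to a (males, females) tuple.
    """
    return Counter(classify_split(m, f, threshold_pct) for m, f in counts.values())

# Hypothetical counts, not taken from the HRS dataset.
example_counts = {
    "Species A": (70, 30),
    "Species B": (40, 60),
    "Species C": (55, 45),
}

summary = summarise(example_counts)
for label, n in sorted(summary.items()):
    print(f"{label}: {n} of {len(example_counts)} species")
```

Species that can only be named from one gender would need to be removed from the input first, as I did for the figures quoted above.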

My starting thesis is therefore that the smaller the proportion of males represented in the dataset, the more likely it is that they are doing something that makes them invisible to the recorder. Logically, this suggests that they have mate-finding behaviours that we don't see. Conversely, in a few species, such as Cheilosia variabilis and Xylota segnis, males are so dominant in the dataset that either the females are confined to places where we don't look, or male mate-finding strategies make them very obvious! Alternatively, perhaps the males of some species are considerably shorter-lived than females?

Some of the results are relatively un-surprising, but others are a bit of an eye-opener, especially the relatively low numbers of male Epistrophe grossulariae and Leucozona glaucia! What is going on there? Perhaps it is simply that males require less nectar and pollen and are therefore elsewhere? Is there a temporal difference in the degree to which males and females are visible to the recorder?

The big question, therefore, is where are the missing males/females? And, also, if one gender can be significantly under-represented in the dataset, what does this tell us about the data? I have long suspected that at least a proportion of supposed 'rarities' are not rare but are simply un-recorded because they are doing something that makes them largely invisible to the recorder. There is therefore plenty of scope for those with an inquisitive mind to track down what missing males/females are doing.



Figure 1. The relative abundance of males and females in the 50 most frequently recorded hoverfly species in the 2016 photographic dataset.

Friday, 18 November 2016

Has iRecord raised expectations of what recording schemes can achieve?



A thread on the NFBR Facebook page raised the very important issue of the 'contract' between recording schemes and contributors. If people place data on iRecord then they expect the data to be checked and verified so that it forms part of the national dataset. That is not an unreasonable position to take. Equally, the bodies that funded and continue to support iRecord have a similar 'contract' with recording schemes. They have stumped up the cash to create a system that captures a very wide range of ad-hoc records without the need for recorders to send data to a multitude of recording schemes. Therefore, if the recording schemes want the data they have an obligation to attend to the data and to follow the verification process. That does not mean that all records will be accepted without challenge but the data should be looked at.

Unfortunately, there is a problem! Vast numbers of records are being submitted to iRecord but are not being verified. Why is this? Well, the sad fact is that a large body of recording schemes have yet to sign up to the verification process, and as they delay the job gets bigger and becomes an almost insurmountable problem. I will readily confess to being a serial offender because it can take a long while before I attend to iRecord. This year people had to wait for about 8 months before I got round to the verification process. I simply did not have enough hours in the day to deal with iRecord as well as the wider range of jobs that are associated with running a recording scheme. I expect there were some who felt let down - I apologise for this. I've still got about 1,500 records to deal with but have at least cleared about 4,500 records. Hopefully the job will be done by December.

This brings me to the nub of the problem. Running a recording scheme has changed out of all proportion in the past ten years as a result of the digital revolution. Fast computers and (relatively) cheap digital imaging have allowed a huge number of people to participate in recording where they would not have done so previously. This technological revolution has also raised the expectations of bodies that draw upon biological records: statutory agencies, NGOs and, increasingly, the Academic World. The idea that 'Citizen Scientists' are a vast untapped source of information is now firmly embedded in the psyche of such bodies as well as politicians who see this as a way of generating much-needed data very cheaply.

Looking to the past

When Stuart Ball and I took on the Hoverfly Recording Scheme in 1991 it had been moribund for 4 years. Philip Entwistle retired in 1987 and at the same time gave up running the scheme. We did not start by volunteering: we both worked for the Invertebrate Site Register (NCC) and our boss was Alan Stubbs - Alan approached us to see if we would take the job on. It was a potentially enormous task because the data was substantially card-based and there were insufficient funds to pay for the data to be digitised and checked. It would only happen if there were volunteers who were prepared to put in the time. Stuart concentrated on gathering available machine-readable data, whilst I took on the 2 cubic metres of record cards.

It took 5 years, with me often working 12-hour shifts at weekends to plough through the cards. Fortunately, the work could be done in the winter, so it did not impinge heavily on the summer months when natural historians like to get out. Figure 1 (below) shows how the data has grown. It paints an important picture: the period up until 1997 saw the backlog cleared. Between 1997 and 2005 the scheme was almost moribund - Stuart and I were involved in many other things and took our foot off the gas. In 2005 we decided we had to push a new project (Atlas 2010) in order to revitalise the scheme - that prompted the arrival of a lot of data from preceding years, hence the jump in records entered in 2005 and 2006. The biggest spike was in 2011 when we finally drew in all the data and published the second provisional atlas. From 2012 onwards, the incoming data has seen a new spike, which is the impact of the UK Hoverflies Facebook group. I would expect data entered to climb further in late 2016 and early 2017, as the level of recording in 2016 has been phenomenal.

Figure 1. Data assembly and growth of the HRS dataset since 1991.

The composition of the data has also changed, as Figure 2 below illustrates. Traditional record cards have all but disappeared (we have just one recorder who sends cards, which I then turn into a spreadsheet for Stuart to upload). The shift was towards machine-readable formats that are relatively easy to verify but do take time uploading into the database, especially when there are lots of grid reference errors. Record Cleaner has helped this process considerably. But the change towards photographic records is also apparent. The graph clearly shows how photography has started to change the data management process. It covers the period to 2015, but will look very different for 2016 because I anticipate about 30,000 records from photographs before we absorb iRecord data! So the shift over 25 years has been from time-consuming data entry in the winter to a substantially bigger effort to secure data in the summer. Of course that leaves the winter for providing feedback to recorders, so there is an upside in the time released from winter activities.

Figure 2. Sources of data

Today's paradigm

The sheer volume of data being generated by initiatives to increase public engagement in biological recording is amazing. In the case of the HRS it has resulted in the data flow doubling in the course of 5 years. I suspect the same holds for other schemes and some such as moths will have seen an explosion of data in line with the vast numbers of people who now run a moth trap in their gardens. Has the number of people who are willing to take on the task of verification and recording scheme administration also grown? Well, maybe, but a lag between the increase in activity and the generation of suitably skilled people is inevitable.

The answer is obviously to try to increase the numbers of people taking on the administration of biological records. It sounds simple, but then as the numbers of people involved increase there is a need for co-ordination and development of an administrative structure to make sure that data are maintained to a common (high) standard. If there are doubts about data quality any research outputs that point to problems in environmental policy will be dissected and easily dismissed by those who might be affected by any policy change. So, the bigger recording schemes are now grappling with data management issues. Inevitably, they are having to develop administrative superstructures that take people away from what they signed up to do and into the role of administrators, mentors and computer jockeys.

On the face of it, running a recording scheme sounds great and there will be people who will rise to the challenge, but there will be many others who are highly competent natural historians but who don't want to become administrators. The big question is: are there sufficient people to take on these roles? We must wait and see, but when one hears of County Recorders retiring with no replacement lined up, there is a serious issue. I have heard that the YNU now lacks a complete complement of County Recorders. If that situation obtains in an organisation that has a longstanding reputation of commitment to biological recording, one needs to sit up and listen to the rumblings. Likewise, if large tracts of iRecord are also going unvalidated, or recorders are retiring without a replacement, they too are hinting at a problem.

There are many initiatives to improve the situation, but in my experience if you run a training course or mentoring programme you will probably only secure long-term commitment from about 1 in 10 people who participate. For some, it is part of a broader journey of discovery, others will decide it is not for them; there will also be those who are interested but realise that the requirements of the job are too great for the time they can commit.

Thus, is my glass half full or half empty? I think we must take comfort from the positive elements of the changing paradigm. Far more people are engaged and interested enough to contribute to iRecord. BUT, it will be readily apparent to those more active participants that the role of a recording scheme organiser is potentially very demanding and perhaps something best avoided. We must hope that by broadening the net sufficiently, there will be people who see an opportunity that can be turned into a positive experience. It is down to the schemes to try to secure that engagement.