Archive for March, 2009

Repository Softwares Day

Tuesday, March 24th, 2009

Last Thursday I attended a ‘Repository Softwares Day’, organised by the Repositories Support Project (RSP). Held at the Museum of Science & Industry in Manchester, the event comprised a good mix of presentations and exhibits from key software developers such as EPrints, DSpace and Fedora.

Microsoft were there, talking about their complete cycle of solutions for the scholarly community: from tools to assist academics in researching and writing their papers, through publishing platforms for hosting e-journals, to their open-source repository software.

In terms of the repository end, I was left wondering whether there is room for more software – certainly in the UK, where EPrints and DSpace are very well established. Of more interest, in my opinion, was hearing about their article authoring add-in for Word 2007. Installing this enables the user to create very well structured technical documents (e.g. journal papers) in a way that captures additional metadata and semantic information at the authoring stage. The add-in also makes use of SWORD (Simple Web-service Offering Repository Deposit), meaning an author could potentially deposit their article in whatever repository they choose from within Microsoft Word at the click of a button, assuming the repository is SWORD-compliant. This carries benefits for both the author (through ease of deposit) and Repository Managers/Administrators (possibly more full-text deposits). We will certainly be looking at making ORO SWORD-compliant in the coming months so as to take advantage of these features.
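For the technically curious, a SWORD deposit is essentially an HTTP POST of a packaged file to a repository collection. What follows is only a rough sketch in Python of what such a request might look like, not how the Word add-in actually works. The header names follow my understanding of the SWORD 1.3 profile; the endpoint URL, the file name and the function name are made up for illustration, and the packaging identifier is just an example value. The request is built but never sent.

```python
from urllib.request import Request


def build_sword_deposit(collection_uri, package_bytes, filename, packaging, dry_run=True):
    """Build (but do not send) an HTTP POST that deposits a zipped package.

    Hypothetical helper for illustration only; header names are assumed
    from the SWORD 1.3 profile.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        # Tells the repository how the zip file is structured internally.
        "X-Packaging": packaging,
        # "true" asks the server to check the deposit without storing it.
        "X-No-Op": "true" if dry_run else "false",
    }
    return Request(collection_uri, data=package_bytes, headers=headers, method="POST")


request = build_sword_deposit(
    "https://repository.example.org/sword/collection",  # hypothetical endpoint
    b"placeholder zip bytes",                           # stands in for a real package
    "article.zip",
    "http://purl.org/net/sword-types/METSDSpaceSIP",    # example packaging identifier
)
print(request.get_method(), request.full_url)
```

A real client would also need to authenticate and to read the Atom entry the repository sends back, both of which are left out here.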

Another tool that I came away from the day feeling quite excited about is SNEEP (Social Networking Extensions for EPrints). I’d read bits and pieces about this plugin for EPrints (the software underpinning ORO) before the event, but I was really grateful for the opportunity to see an actual presentation on it. Basically, installing the plugin would give us three new features for ORO: the ability to comment on, bookmark, and tag individual eprints. The various permutations of who can and can’t add or see comments and tags are explained in the SNEEP Wiki pages; again, I expect us to look into the possibility of installing SNEEP for ORO in the coming months.

The final major point of interest from the day for me was hearing and learning more about the various CRIS (Current Research Information System) solutions on offer. I’m going to mention Symplectic here, not because I’m endorsing the product, but simply because I attended their presentation, so it’s the one I feel most informed about currently. I was particularly impressed by Symplectic’s Publications Management System, which automatically gathers publications information from key databases such as Web of Science and ‘asks’ academics by email whether the publications it has found belong to them. If the academic clicks ‘yes’ then the article can automatically pass through to their repository, giving them the option to attach full text beforehand. More needs to be known, but one can see how a system like this could take away a lot of the data entry needed to populate a repository – an element typically cited by academics as the biggest barrier to depositing their work. Importantly, the depositor is still making a conscious decision to put their work in the repository, only now at the click of a button rather than by filling in lots of data fields manually.

All told, this was an extremely informative and thoroughly enjoyable day!

Citation analysis of non-journal material

Monday, March 2nd, 2009

Now that RAE2008 is over, we, like all UK universities, have the REF firmly on our minds. Actually, it’s probably fair to say it was on the minds of many long before now. But it does appear to me to be all of a sudden much higher on the agendas of not only research administrators but our academics as well - and not just those involved heavily in research strategy. Indeed, as we’re talking agendas, I’ve been to quite a few departmental meetings recently where the REF has been the main item for discussion.

At one such meeting last week I was extremely impressed by just how deeply a particular department of ours here at the OU is thinking about bibliometrics. In particular, they have begun to look at their citation counts not only for journal papers but also for non-journal material (edited books and book chapters, for example). This struck me as being very prepared and organised indeed, as up until that point (like most people, I think) I had been taking it for granted that this type of output would not be considered as part of the REF’s bibliometrics exercise. Actually, having come away from that meeting and reflected on things I still believe that to be the case, but it did make me think long and hard about the issue, which was certainly of great value.

At the same time as pondering this topic, and formulating plans to blog about it, I also came across a recently published paper on a proposed new method for bibliometric analysis of books. Although I confess to having read only the abstract, introduction and discussion (a bad habit from my student days!) this paper only served to strengthen my view that we are some way away from a reliable measure for non-journal material. Basically, the so-called ‘libcitation’ approach described in the paper uses at its core a count of the libraries holding a given book. The theory is that in deciding what books to acquire for the audiences they serve, librarians make an informed decision based on the reputation of the authors and the prestige of the publishers, and that this decision can be used as a basis for gauging impact. This seems to me miles away from the relative robustness of citation analysis of journal papers through Thomson’s Web of Science or Elsevier’s Scopus.
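To make the core idea concrete, here is a minimal sketch in Python of a holdings count as I understand it from the parts of the paper I read: for each book, count the distinct libraries that hold it. The library names, book titles and the function name are all invented for illustration, and the paper's actual method may well involve more than this (normalisation by subject, for instance), which I have not attempted.

```python
from collections import Counter

# Invented holdings records, one (library, book) pair per copy held.
holdings = [
    ("Library A", "Book X"),
    ("Library B", "Book X"),
    ("Library C", "Book X"),
    ("Library A", "Book Y"),
    ("Library A", "Book X"),  # a second copy in the same library
]


def libcitation_counts(records):
    """Count the distinct libraries holding each book (hypothetical helper)."""
    unique_pairs = set(records)  # a library counts once per book, however many copies
    return Counter(book for _library, book in unique_pairs)


counts = libcitation_counts(holdings)
print(counts["Book X"], counts["Book Y"])  # 3 1
```

Note that the count rewards breadth of acquisition only: a second copy on the same shelf changes nothing, and neither does how often the book is actually read or cited.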

Non-journal material can of course be analysed through Web of Science and Scopus, but, as the authors of the above paper point out, these databases will only capture citations to books and book chapters by journals – not by other books and book chapters. Academics in the arts and some social sciences might quite rightly argue that this simply isn’t fair – that most of the citations they care about will not be from journals. What’s more, Web of Science and Scopus don’t necessarily have the greatest coverage of journals in book-oriented subjects anyway. Also, I carried out a quick citation analysis comparison on a set of non-journal items in both Web of Science and Scopus and found massive differences. In quite a few cases books and book chapters were not being picked up as having been cited at all by Web of Science’s journals, whereas in Scopus those same items had received tens and sometimes hundreds of citations. Granted, there are some differences between the journal coverage of these two databases, but not enough to explain gaps of that size.
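For anyone wanting to repeat that kind of comparison, a rough sketch in Python of the bookkeeping might look like the following. It assumes you have already exported citation counts for the same items from each database into two simple lookups; the item names, the numbers, the threshold and the function name are all invented for illustration and are not my actual results.

```python
def compare_counts(wos, scopus, min_gap=10):
    """Return (item, wos_count, scopus_count) where the counts differ by at least min_gap."""
    flagged = []
    for item in sorted(set(wos) | set(scopus)):
        wos_count = wos.get(item, 0)        # an item missing from a database counts as 0
        scopus_count = scopus.get(item, 0)
        if abs(wos_count - scopus_count) >= min_gap:
            flagged.append((item, wos_count, scopus_count))
    return flagged


# Invented example data.
wos_counts = {"Chapter 1": 0, "Book 2": 4}
scopus_counts = {"Chapter 1": 35, "Book 2": 6, "Book 3": 120}

discrepancies = compare_counts(wos_counts, scopus_counts)
for item, wos_count, scopus_count in discrepancies:
    print(item, wos_count, scopus_count)
```

Treating a missing item as zero citations is a deliberate simplification here; in practice it is worth checking whether the item is genuinely uncited or simply recorded under a variant title.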

Lastly, there is Google Scholar. Most of you will be aware that when you find an item on Google Scholar it proudly tells you how many times it has been cited. Given the ubiquity of Google it is very tempting to take what it says as gospel. Well, don’t. Most people working in academic publishing or with bibliometrics will tell you the same thing – Google Scholar has potential, but as yet lacks editorial quality control. Although I have recently been left wondering just how much Web of Science and Scopus might be letting through their doors in order to achieve the coverage required to get the REF gig, I have no doubt that the editorial value they add is essential for a reliable platform upon which to base citation analysis.

So, this is why I think the bibliometrics element of the REF will only look at journal output. I’m not saying that future REFs won’t look further – indeed, I’m sure it won’t be too long before we have an all-encompassing Web of Science or Scopus product that pays better attention to non-journal material. But, for the purposes of REF 2012/2013 I just think it would be too ambitious and too full of holes. By all accounts HEFCE are having a hard enough time arriving at how to fairly analyse journal paper citations, let alone anything else.