Diplomacy helps linking

Ed Pentz, executive director of CrossRef

The job of the diplomat is to get people with diverse interests to work together on something that is of common benefit. But these skills aren’t reserved for the likes of the United Nations, as Ed Pentz has discovered. He is executive director of CrossRef, the organisation set up to register persistent citation links between the outputs of all the major STM publishers.

According to Robert Campbell, president of Blackwell Publishing and treasurer of CrossRef, 'CrossRef is quite an achievement because you have a bunch of publishers who are used to competing with each other, sitting around a table co-operating. This only really works if everyone is on board so we had to find someone who could make that happen. Ed quietly put it together, recruited some good staff, and just got on with the job.

'It is interesting to see how he operates because it is very low key. He is cautious about expressing his own view and lets other people express theirs. Nobody on the board thinks they are being managed by Ed because he is a good diplomat. He makes suggestions to us in his quiet way and we go along with it. He is not seen as the CEO, he lets us feel that we own it and we all get along.’

The librarian influence

Pentz was born in Pittsburgh, Pennsylvania, and moved to Philadelphia as a child. His father was a lawyer and his mother was a librarian. His mother’s influence led him to an early interest in books and, although he played a bit of sport, he was a bookish child. He did well enough at school to get a partial scholarship to Princeton, where he studied English. The partial scholarship meant he still had to work so he decided to learn how to type properly, a skill which would serve him well in due course.

Part of his course took him to the University of York, UK, for a year and there he met the woman who would become his wife. He had a strong urge to travel and so, after Princeton, he went to Singapore to teach English. The future Mrs Pentz came out to join him and they spent a long time travelling around South East Asia and Australia before settling in London. There, Pentz used the typing skills that had got him through college to get a series of temporary office jobs. This gave him his first opening in publishing as a temporary PA to the editorial director of WB Saunders.

He was quite taken with the idea of a job in publishing but, with no experience in the field, he found it difficult to get a break. Eventually he found a job with Harcourt, where he ended up being thrown in very much at the deep end. He said: 'I started my job at Harcourt doing exhibits and, by chance, the guy who ran the department left six months before the Frankfurt Book Fair. I was the only person left standing. But it turned out to be a fantastic experience.’

Although he enjoyed the experience, he decided that exhibits was not how he wanted to spend his career. However, it allowed him to be treated as an internal candidate for other jobs in the Harcourt/Academic Press group.

He said: 'What was good about the exhibits position is that I went to the conferences and met many of the authors and editors, so being an internal candidate clearly helped.’

His next step was to become an assistant editor, helping to run six life-science journals for Academic Press (AP). He greatly enjoyed the job and learned how the academic journal production process worked, but he could clearly see that there was not much scope for advancement. He said: 'I didn’t have a scientific background so it was unlikely that I would ever become an editor, even though we did have advisors.’

Experiencing electronic publishing

However, the job was providing him with other experience. The internet boom was starting to gather momentum. Browsers and web pages were springing up everywhere and AP, like all other publishers, was looking at ways of getting involved. Part of Pentz’s role involved putting the contents pages of his titles on electronic bulletin boards. By the standards of the time, this made him an expert in electronic publishing: 'I guess that, in the kingdom of the blind, the one-eyed man is king. Having the journal experience and a tiny bit of information about electronic publishing was a good combination.’

He was then moved to a department with lots of computer techies to start developing ideas. The computer techies knew nothing about academic publishing so Pentz, because he knew at least something about both sides, ended up as the missing link between the old and the new world of publishing. Most of his work was about CD publishing as this was the trend at the time, but there were also experiments into putting journals online and online subscription services.

He was involved in putting the first AP title online, the Journal of Molecular Biology. There were a few ad hoc electronic publishing projects before AP really got behind the concept in 1996 with the IDEAL library system. The internet boom was in full swing and Pentz was offered the chance in 1997 to manage a development unit in the USA. There, amongst other things, he led the redesign of the IDEAL system. 'It was a fantastic time, with lots of experimentation going on,’ he said. 'Once we got the journals online we started looking at reference linking and developing the first bilateral agreements,’ he continued.

The big publishers were looking at bilateral agreements, which would have favoured them at the expense of the smaller publishers. Eventually a consensus (or near-consensus) emerged that it was actually in everyone’s interest to have some fairly standard infrastructure for linking, otherwise the reader experience from the emerging fast-search mechanisms was going to be extremely poor. Readers did not care which publisher hosted a piece of content that was referred to in another. If they were going to benefit from linking, it had to be seamless from their perspective. Also, the 'appropriate copy’ problem had to be addressed to avoid budgets being wasted on content that was already paid for.

Changing locations

One of the problems emerging was that the web locations of journal articles can change as publishers come and go, journals change publisher or publisher platforms evolve. What emerged was the Digital Object Identifier (DOI), which was demonstrated at the Frankfurt Book Fair in 1999. This is a unique reference for any object, whether it is an article in a journal or anything else that publishers might need to link to. The identifier is intended to stay the same, no matter where the object is located.

The next step from this was to collect the actual locations of these articles into a central database that publishers would update whenever something changed. There also needed to be standards for metadata that would feed into the secondary databases to make all the linking services work. This needed to be run by a neutral party so the Publishers International Linking Association was formed in January 2000. Its purpose was to agree the standards and run the service that became known as CrossRef.
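
The idea of a permanent identifier backed by a central, updatable table of locations can be sketched in a few lines of Python. This is only an illustration of the principle described above, not CrossRef's actual system: the class name `DoiRegistry`, its methods, the example DOI and the `.example` URLs are all hypothetical.

```python
class DoiRegistry:
    """Toy in-memory stand-in for a central DOI-to-location database (hypothetical)."""

    def __init__(self):
        self._records = {}

    @staticmethod
    def _key(doi):
        # DOI names are compared case-insensitively, so normalise before lookup.
        return doi.strip().lower()

    def register(self, doi, url, metadata):
        """Publisher deposits an identifier, its current location and metadata."""
        key = self._key(doi)
        if key in self._records:
            raise ValueError(f"DOI already registered: {doi}")
        self._records[key] = {"url": url, "metadata": dict(metadata)}

    def update_location(self, doi, new_url):
        """Publisher reports that the content has moved; the DOI itself is unchanged."""
        self._records[self._key(doi)]["url"] = new_url

    def resolve(self, doi):
        """Return the current location for a DOI."""
        return self._records[self._key(doi)]["url"]


if __name__ == "__main__":
    registry = DoiRegistry()
    registry.register(
        "10.9999/example.2000.001",  # made-up identifier
        "https://old-publisher.example/articles/1",
        {"journal": "Example Journal", "year": 2000},
    )
    # The journal changes platform: only the registry entry is updated.
    registry.update_location(
        "10.9999/example.2000.001", "https://new-platform.example/content/1"
    )
    print(registry.resolve("10.9999/example.2000.001"))
```

A citing article stores only the identifier, so its reference link keeps working after the move; the one record in the registry is the only thing that has to change.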

Pentz was offered the chance to set up CrossRef. 'I was quite happy at AP and was not looking to leave but CrossRef was such a fantastic opportunity in terms of my career. It was like going to business school without having to go to business school,’ he explained. 'It was thought they might want a senior figure from the industry but I guess at the time it was quite risky even though it was backed by all these publishers. I suppose I had a lot to gain by taking the risk.’

A board was formed, with representatives of the major publishers and a few smaller learned society publishers. CrossRef was loaned $1.5 million for the hardware and software development needed, and worked with vendor Atypon to get the service up and running by June 2000.

Within a year CrossRef had 1,100 journals, from 33 publishers, that were linked using the prototype system. Some 10 million DOIs had been issued by January 2004 and 20 million by April 2006. Today about 14,000 journals use its links. DOIs are issued automatically according to the metadata supplied by the publisher, for which the publisher is charged a fee. The income generated from registering DOIs and membership subscriptions has been used to repay the loans. To date, nearly 40 per cent of the loans have been paid off and the target is to have repaid them all within four years. It has had an operating surplus since 2003.

In 2004 CrossRef decided to build closer links with the European publishers. Pentz himself decided that it would be quite a good time to move his own family back to England and so, rather than setting up a European branch, the head office moved to Oxford. He said: 'We wanted to be close to new and existing customers. The European Patent Office uses CrossRef and the World Health Organisation has some content that will be joining us.’

What the future holds

Clearly CrossRef is now a success, but it did not come without a lot of hard work. Pentz said: 'It is a large group of very disparate publishers. We’ve had a great board of directors and many publishers have taken the wider view and that has brought some of the other publishers along. As we become more established, it is important to keep the momentum going. There has been a shift in the publishing industry with things like Google and other services on the web; they are focused entirely on the user. I think publishers have to start thinking that way too and CrossRef is about helping them to do exactly that.’

DOIs are not limited to journal articles. They are also assigned to conference proceedings, standards, and protein descriptions in the Protein Data Bank. Pentz believes that eventually they will be assigned to diagrams within articles, and even datasets and working papers.

In addition, Pentz believes there are plenty of other opportunities for CrossRef to provide other services which all publishers will benefit from. 'Our mission statement doesn’t mention reference linking, it’s about doing things collectively that publishers can’t do individually, to benefit researchers,’ he said. These include being able to associate multiple URLs with one DOI but they also go beyond reference linking to getting involved in guidelines and standards in publishing.

Pentz also sees the potential for multiple versions of documents as a major issue. 'With author self-archiving there could be different versions of an article around, so labelling things properly is really key. We can also expand the metadata to include rights information and pricing,’ he said.

Stamping out copying

Another hot topic for CrossRef is plagiarism. 'The plagiarism systems that exist at the moment are focused on secondary schools. With the changes that are going on in the industry at the moment, the publishers are looking at ways to add value and have authoritative quality content,’ said Pentz. 'Things are much more international these days and there is much more cross-disciplinary work so fields are more segmented. This would be a tool to ensure the quality of the content.’

'We are very keen to get everyone together, to set up a plagiarism detection service,’ agreed Blackwell’s Campbell. 'It can only be done if all the publishers work together but it would really give refereeing a boost. If we have a system to pick up straight copying it would take a lot of the work out of peer reviewing, and more people would be willing to take the job on.’

'Many of us believe that CrossRef is not just about linking; it’s about anything that improves the reader experience,’ summed up Campbell. And with so many ideas and things going on, there looks to be plenty to keep Pentz busy for some time to come.

CURRICULUM VITAE

Education
1985-89 Princeton University, BA English Language and Literature

Employment
1989-90 National University of Singapore, English teacher
1991-92 various temporary positions in London
1992-2000 Harcourt/Academic Press, London. Started as exhibitions officer, then assistant editor, life sciences, then electronic publishing developer, then electronic business development manager in Burlington, MA
2000-present CrossRef, executive director, Lynnfield, MA, then Oxford, UK from 2004

John Murphy