Cut processing times in half with Elsevier
Industry: Scientific and Medical Publishing
How we helped: Technology leadership, Capability Improvement
Successfully productionised a machine learning (ML)
Developed by the Elsevier Data Science team
Embedding DevOps directly into the small multi-skilled teams
In order for developers to fix problems as soon as they appear
Steady stream of optimisations and improvements
Which stabilised overnight batch processing
Elsevier is a Dutch publishing and analytics company specialising in scientific, technical, and medical content, including well-respected journals such as The Lancet and Cell. Elsevier’s products also include digital tools for data management, instruction, and assessment.
Elsevier has an annual revenue of over £2.5 billion and publishes more than 560,000 articles annually in 2,650 journals. Its archives contain over 16 million documents and 42,000 e-books. Total yearly downloads amount to more than 1 billion.
“After receiving positive feedback, machine learning was used to quantify metrics including scope match, trending issues, and author credibility.”
Director, 101 Ways
Due to the sheer volume of submissions, the editorial team found it increasingly difficult to assess every paper. As a result, some were rejected without being reviewed, meaning important papers were being missed.
There were 10 million articles, and the team was working with abstracts of approximately 200 words per article; each processing job would take between six and 12 hours.
Once all the data was collated, it had to be made GDPR compliant. The raw data then had to be split out and entered into a separate database with the personally identifiable information (PII), which required extra engineering.
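The case study does not describe the actual schema or tooling, so the following is only a minimal sketch of the kind of split mentioned above: personal fields are separated from manuscript content and the two halves are linked by a random key. The field names, the PERSONAL_FIELDS set, and the split_record helper are all hypothetical, and separating fields like this is one building block rather than GDPR compliance in itself.

```python
import uuid

# Hypothetical field names: which fields count as personal data is an
# assumption made for illustration, not something stated in the case study.
PERSONAL_FIELDS = frozenset({"author_name", "author_email", "affiliation"})


def split_record(record, personal_fields=PERSONAL_FIELDS):
    """Split one submission record into (personal, content) dictionaries.

    Both dictionaries carry the same random link_id, so the two stores can
    be joined again by an authorised process, while the content store on
    its own holds none of the listed personal fields.
    """
    link_id = uuid.uuid4().hex
    personal = {"link_id": link_id}
    content = {"link_id": link_id}
    for field, value in record.items():
        # Route each field to the store it belongs in.
        target = personal if field in personal_fields else content
        target[field] = value
    return personal, content


if __name__ == "__main__":
    example = {
        "author_name": "A. Researcher",
        "author_email": "a.researcher@example.org",
        "title": "An example manuscript",
        "abstract": "A short abstract of roughly two hundred words...",
    }
    personal_row, content_row = split_record(example)
    print(sorted(personal_row))
    print(sorted(content_row))
```

In a real pipeline each half would be written to its own database with separate access controls; the sketch only shows the routing of fields.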
“The data science was created by a separate team and the engineer had left, leaving a gap in knowledge transference. Elsevier, therefore, needed to reverse-engineer it so that the current team could fully use it effectively,” explained Emma Hopkinson-Spark, Director, 101 Ways.
101 Ways needed to help speed up the processing of manuscript submissions to reduce the workload on editors. It therefore decided to automate as much as possible, so that Elsevier would not have to reject high-quality documents that could otherwise end up being published by competitors.
For the pilot, a target of creating a proof of concept in two sprints was set. 101 Ways decided that an agile methodology should be used: one sprint covered the wireframe and raw data, and the other covered creating accounts and loading current data for the journals.
“After receiving positive feedback from the team, machine learning was then used to quantify metrics including scope match, trending issues, and author credibility,” said Hopkinson-Spark.
To do this, the team took the abstracts of approximately 200 words and applied natural language processing (NLP) metrics to calculate a summary of each paper.
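The case study does not say which NLP metrics were used. As an illustration only, a "scope match" style score could be approximated by comparing the words in an abstract with the words in a journal's scope statement. The sketch below uses a plain bag-of-words cosine similarity, which is a stand-in technique and not the Elsevier model; the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters (0.0 to 1.0)."""
    shared = a.keys() & b.keys()
    dot = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty text has no overlap with anything.
        return 0.0
    return dot / (norm_a * norm_b)


def scope_match(abstract, journal_scope):
    """Score how closely an abstract's vocabulary overlaps a journal scope."""
    return cosine_similarity(
        Counter(tokenize(abstract)), Counter(tokenize(journal_scope))
    )


if __name__ == "__main__":
    abstract = "We study tumour growth and cancer therapy in patients"
    scopes = {
        "oncology": "cancer tumour therapy oncology patients",
        "physics": "quantum particle physics and cosmology",
    }
    for name, scope in scopes.items():
        print(name, round(scope_match(abstract, scope), 3))
```

A production system would more likely weight terms (for example with TF-IDF) or use learned embeddings, since raw word counts let common words such as "and" contribute to the score.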
101 Ways is committed to helping organisations solve problems and build great products.
We bring together decades of experience and high-quality consultants specific to what your organisation needs to deliver better overall outcomes. Find out how we can help you today.