"Concise industry news from the US pharmaceutical industry..."

07 Jul 2010

Standard Issue

NGP speaks with Professor Antony N Davies, Senior Marketing Manager, Waters Informatics, about how the use of standards is a powerful enabling technology for moving information cleanly and in a compliant manner between applications.

NGP. Tony, you have been involved in laboratory automation and standardization of data formats since the late 1980s – what brought you into the field?

I moved to Germany at the end of 1988 just after the first of the JCAMP-DX data standards protocols for Infrared Spectroscopy was published. I was working at the time on a German government research project generating high-quality reference spectroscopic databases. The long-term goal was to produce a resource from which more intelligent analytical spectroscopy software solutions could draw and, although there was great willingness amongst the analytical community to contribute top quality data to the project, we were effectively hamstrung by the problems of getting the data into a useable condition.

We had every floppy disk format of the time sent to us: 3”, 3 1/2”, 5 1/4” and 8”. Tapes, of course, arrived in various formats, and we even received the old PDP11 exchangeable hard disks! When we did find a drive into which the disks fitted, it was quite often the case that the disks had been formatted by one of the other operating systems that were common on laboratory equipment computers in those days, and were effectively unreadable.

Sir Arthur Conan Doyle had his character Sherlock Holmes use the phrase “It is a capital mistake to theorise before one has data.” We used this quote at the time to advertise the project but it came back to haunt us, because we had the data but couldn’t get at it.

Fortunately, ISAS was well equipped with a wide array of hardware and software as well as IT development systems, so we managed to recover most of the data files submitted and were then confronted with the problems of writing software to convert the multitude of file formats into some common format so that we could actually work with the data.

The publication of the JCAMP-DX ASCII standard in 1988, which had been developed and implemented by the user community in close cooperation with all of the major instrument vendors, was a massive relief to us at the time.

NGP. But infrared is just one technique?

Yes, but despite vendor implementations of varying quality, having the data arrive in a common standard format meant enormous savings in time and effort for us. It also showed that it was going to be important to get similar standards agreed with the instrument vendors in the other analytical fields, so we set about doing exactly that. Since the first publication in 1988, seven standards have been published, with two more, for Electron Magnetic Resonance Spectrometry (EMR, EPR, ESR) and for Chromatography/Mass Spectrometry hyphenated methods, in the final stages of acceptance and three more in the pipeline.

The International Union of Pure and Applied Chemistry (IUPAC) took over responsibility for the JCAMP-DX series of protocols from the Joint Committee on Atomic and Molecular Physical Data (JCAMP) in 1995 and, since that date, the standards have all been full IUPAC recommendations.

NGP. I hear a lot of talk about XML and its applications in laboratory informatics. Has the Extensible Markup Language reached the data standards field yet?

The JCAMP-DX standards are restricted to 7-bit ASCII character encoding. This lets you avoid a lot of trouble, but is somewhat outdated for modern IT systems, where one might have a data set generated in India, witnessed in Japan and annotated in Canada. Often the annotation can be the key to the value of the data and when global organizations want the maximum value from their data, other character sets often start to play a major role. This is where XML can help. XML is very good at transporting metadata, or data about data. It is this metadata that often provides the rich context to a data set.
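To make the character-set point concrete, here is a minimal Python sketch. It assumes a made-up multilingual annotation and invented XML element names (`measurement`, `technique`, `annotation`); none of these come from JCAMP-DX, AnIML or any published schema. It only illustrates that such an annotation cannot be written as 7-bit ASCII, while a UTF-8 encoded XML record carries it unchanged.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation mixing German and Japanese text.
annotation = "Probe bei 25 °C gemessen; 測定者: 田中"


def fits_in_7bit_ascii(text):
    """Return True only if every character can be encoded as 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def xml_round_trip(text):
    """Store the text in an illustrative XML record and read it back."""
    # Element and attribute names are assumptions made for this example.
    record = ET.Element("measurement")
    ET.SubElement(record, "technique").text = "infrared"
    ET.SubElement(record, "annotation", {"lang": "mixed"}).text = text
    xml_bytes = ET.tostring(record, encoding="utf-8")
    return ET.fromstring(xml_bytes).findtext("annotation")


ascii_ok = fits_in_7bit_ascii(annotation)
recovered = xml_round_trip(annotation)
print("Fits in 7-bit ASCII:", ascii_ok)
print("Survives XML round trip:", recovered == annotation)
```

The same check applied to a plain English annotation would pass the ASCII test, which is why the restriction goes unnoticed until data starts crossing language boundaries.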

The IUPAC Subcommittee on Electronic Data Standards started work on an XML migration format several years ago, but merged their efforts with ASTM International when they started to work on an XML standard to supersede the outdated netCDF standards in mass spectrometry and chromatography originally generated by the old Analytical Instruments Association.

NGP. Is that the AnIML standard?

Yes, AnIML for ANalytical Information Markup Language. This joint effort is still some way from completion, but it will be a large step forward when published as the team has addressed many of the issues that have arisen over the years as experience has been gained in the use of the netCDF binary and JCAMP-DX ASCII standards.

Interestingly enough, the key to successful data standards isn’t the format or language in which they are written, but rather the agreement by all interested parties on a standard data dictionary: a set of terms whose meaning is agreed and understood by all. This is one reason why such projects fall into the sphere of IUPAC, which for decades has been involved in just such work on a global scale.
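The data dictionary idea can be sketched in a few lines of Python. The terms, units and vendor records below are hypothetical and chosen only for illustration; they are not taken from any IUPAC or ASTM dictionary. The sketch shows why agreement on terms matters more than file syntax: two records in the same format can still disagree on what a field is called.

```python
# Hypothetical shared dictionary: agreed term -> agreed unit.
DATA_DICTIONARY = {
    "wavenumber": "1/cm",
    "transmittance": "dimensionless",
    "retention_time": "min",
}


def undefined_terms(record):
    """Return, in sorted order, the field names not found in the dictionary."""
    return sorted(term for term in record if term not in DATA_DICTIONARY)


# Two invented vendor exports describing the same measurement.
vendor_a = {"wavenumber": 1715.0, "transmittance": 0.42}
vendor_b = {"wave_no": 1715.0, "transmittance": 0.42}

print(undefined_terms(vendor_a))
print(undefined_terms(vendor_b))
```

A receiving application can accept the first record as-is, but has to guess what the second vendor means by `wave_no`, which is exactly the ambiguity an agreed dictionary removes.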

NGP. Why should an informatics solutions provider invest in this area?

That’s simple to answer. In the past, instrumentation and methodology advances have been driven by the need to improve on the analytical figures of merit: accuracy, precision, limit of detection, working range, etc. But this isn’t enough for the modern laboratory. There is enormous pressure on analytical laboratories and we now have to take account of the whole of the laboratory workflow. This starts well before data acquisition and instrument control, with capture of the relevant information about the samples and how they are to be analyzed. Following measurement, there come the data processing steps, including modeling, fitting, optimization and evaluation of uncertainties, before moving on to the decision-making steps and report generation. An informatics solutions provider must recognize these key requirements, particularly in the pharmaceutical arena, where customers are looking for vendors to provide comprehensive solutions rather than just an improved valve or solvent.

The use of standards is a powerful enabling technology for moving information cleanly and in a compliant manner between applications. It must always be remembered that it is still the norm for the applications found in the laboratory to come from a number of suppliers, and the most error-free means of moving data between them is the use of agreed-upon standards.

NGP. And consequently also between discovery, development and manufacturing?

Exactly. If the information generated during the discovery phase is available to the applications used by the development scientists, then, even though they have different terms of reference, this corporate knowledge can be of enormous benefit in speeding their development work and helping them avoid problems that may well have been encountered and successfully dealt with during discovery. To achieve this, applications must be capable of communicating with one another, and this is best achieved by the extensive use of standards.

If an unknown substance starts to contaminate new batches of a long-established drug, the trouble-hunting QC chemists may well be greatly assisted by finding that, under certain conditions, this substance had also appeared ten years previously, when the development chemists had tried to use alternative cheaper solvents and had suffered from the results of water contamination! The trouble-hunters will have received immediate pointers as to where the problem might be located.

NGP. How do you see the future turning out?

Speculation is always a dangerous occupation, but our industry now has not only the tools but also the computing power to make a huge amount of information available to our key staff. This means not only the data and its context: through the widespread deployment of electronic laboratory notebooks, we will also be able, for the first time, to capture the ideas and thought processes that have led to our best discoveries. For years, the academics have speculated about the power of data/knowledge mining but have never had the resources or the sheer data volumes to do much more than speculate. I hope that, now that we have all the pieces available, we will see whether we can achieve the quantum leaps in discovery and development productivity that they have promised. Or at least stop repeating the mistakes of our predecessors!

