Updated: Jul 19, 2022
July 14, 2021|ELN, Science and Technology, Scientific Informatics
AN INDUSTRY PERSPECTIVE
By 20/15 Visioneers, Leaders in Science and Technology
“Our understanding of biology is very limited, let’s not compound it with an inadequately captured data and process environment.” John F. Conway
CONTENTS
Some History
What Makes a Great BioELN?
R&D Organizational Size Does and Does not Matter
Stop the Insanity!
Implementation: A Phased Approach
The Next Generation BioELNs
Conclusion
SOME HISTORY
As we remember it, the first electronic lab notebooks (ELNs) were designed to capture Intellectual Property (IP) and small-molecule chemistry. Biologists were on the team but were also designated to test these new molecules in various assays. They would then enter the data in a bespoke or early decision support system like MDL’s Isis Base, which would marry up the small-molecule data with the new biological assay results and summary data, refreshed daily. Many times, the chemists used an ELN while the biologists were still working on paper or had a rudimentary paper-on-glass ELN. As we have mentioned many times, the market evolved, and several companies produced bio-oriented ELNs. The issue, however, was that these ELNs suffered from a lack of biology-persona coverage, so organizations needed a patchwork of diverse solutions to cover the gamut of workflows.
This led to silos of non-use that ultimately damaged adoption of a foundational requirement: capturing the scientific method electronically, which is the bedrock for FAIR (Findable, Accessible, Interoperable, Reusable) data and processes. (Figure 1.) Taking the time to document work is a critical step for an R&D organization, ensuring that it gets secondary and tertiary use out of all that data and information. The right level of process rigor and metadata contextualization is imperative to produce Model Quality Data, defined as data of sufficient breadth, resolution, and fidelity to confidently drive scientific understanding to higher levels of abstraction, informing burgeoning Artificial Intelligence (AI) and Machine Learning (ML) approaches capable of uncovering breakthrough insights. This is the golden path to more efficient innovation.
Figure 1
WHAT MAKES A GREAT BioELN?
It is now 2021 and several partner vendors have risen to the challenge, delivering platforms that provide higher levels of utility not only to the variety of functional personas on a team (including Molecular Biology, Assay Development, Screening Sciences, In Vitro/In Vivo/In Silico Analysis, Bioanalytical, and Bioprocess Research) but also to the six needed high-level workflows. These are Request, Sample, Test, Experiment, Analyze, and Report. Today’s small-molecule supportive biology and large-molecule discovery and development processes are a hybrid of engineering and experimentation, hence a well-constructed BioELN needs to handle both, along with any relevant chemistry, since the two worlds collide more often than not. Of course, there are organizations running biology-only programs that do not have to deal with the chemistry aspects of a project. The critical point is that the BioELN may not be the repository of all related project content, but it must be a reliable element of truth, necessary and sufficient to reproduce or replicate a previous experiment. In a high-complexity R&D world dealing with multiple data types and enormous data volumes, deficiencies in data and process contextualization engender poor communication, leading to unreliable reproducibility and replication of work. Institutional knowledge and tech transfer suffer, and the deficient IT environment often ends up like the thing you are trying to tackle: a disease.
You may resign yourself to live with it, but it affects your quality of life and can eventually result in total project failure.
A simplified baking analogy helps here (simplified, that is, unless you are making Iranian Barbari bread, www.thefreshloaf.com/node/62118/most-difficult-bread-world): Biopharma biology shares many of the same vagaries. We are sure you have experienced a cook who winged their recipes or who, later in life, was not as rigorous with their preparations. Though sometimes this delivered great results, you never enjoyed a consistent product because the process was not captured properly. This same type of reproducibility and replication failure occurs far too often in industry and academia, and it is increasingly being recognized as a large and costly problem.
A GREAT BioELN NEEDS TO COVER THESE:
Biology personas and high-level workflows
1. Molecular Biology (engineering): tooling comparable to what medicinal and synthetic chemists have had for roughly 20 years (e.g., Reactions, Molecule searching, Stoichiometry), here in the form of Recipes/Techniques, Planning, Plasmid mapping, BLAST searching, alignment tools, and other genetic engineering tools to plan, visualize, and document DNA cloning, PCR, etc.
2. Assay Development: Planning, Interactive Plate/Sample Mapping, Curve fitting (with algorithm configuration), Integration to R, and an adaptive and highly configurable process and data environment
3. Screening Sciences (in vitro/in vivo/in silico): HTS/HCS/DMPK-PKPD/Multi Assay type handling, High-throughput screening, Imaging, High Content Screening, Surface Plasmon Resonance, Animal Studies, DMPK
4. Bioprocess: Regulation, Fermentation, Scaleup, Real-time monitoring, Historian (continuous) data, engineering
5. BioAnalytical: Standards, planning, workflows, analysis, and reporting
Photos Courtesy of Cristof Gaenzler
6. The intuitiveness and usability of the system
7. Low code/No code configuration
8. Integration to foundational tools like entity registration, large molecule editors (e.g., HELM), Visualization (e.g., Spotfire, R, etc.), Inventory
9. Intercalation with Automation
11. Enterprise Search
12. Enterprise Data handling (FAIR)
13. Process capture with detailed contextualization.
14. The tool needs to provide rigor through its framework and capabilities, but not at the expense of the scientist. Good scientists are extremely busy in the lab and need their 8-10 hours a day to get things done.
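To make "process capture with detailed contextualization" a little more concrete, here is a minimal Python sketch of a notebook entry that knows which persona and workflow it belongs to and can report what context is still missing. The six workflows come from this article; everything else (the `NotebookEntry` class, the `REQUIRED_CONTEXT` keys, the example values) is a hypothetical illustration, not a description of any vendor's data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Workflow(Enum):
    """The six high-level workflows named in this article."""
    REQUEST = "Request"
    SAMPLE = "Sample"
    TEST = "Test"
    EXPERIMENT = "Experiment"
    ANALYZE = "Analyze"
    REPORT = "Report"


# Hypothetical minimum context; a real list would come from your own data standards.
REQUIRED_CONTEXT = ("protocol_version", "sample_id", "instrument", "operator")


@dataclass
class NotebookEntry:
    """One captured unit of work, tagged with persona, workflow, and context."""
    title: str
    persona: str
    workflow: Workflow
    context: dict = field(default_factory=dict)

    def missing_context(self) -> list:
        """Return the required context keys that are absent or empty."""
        return [key for key in REQUIRED_CONTEXT if not self.context.get(key)]


entry = NotebookEntry(
    title="IC50 determination, plate 7",
    persona="Assay Development",
    workflow=Workflow.TEST,
    context={"protocol_version": "v3", "sample_id": "S-0042"},
)
print(entry.missing_context())  # ['instrument', 'operator']
```

A check like this can run quietly as the scientist works, supplying rigor through the framework rather than through extra manual steps.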
R&D ORGANIZATIONAL SIZE DOES AND DOES NOT MATTER
We work with a range of clients, from biotech startups to some of the largest R&D organizations in the world, and the fundamental need for an ELN is the same across the board. Many startups externalize major parts of their R&D from the beginning, while many larger and more established R&D organizations only started externalizing over the past 10 years. The tracking of science, materials, work performed (including billing), and very important intellectual property would be next to impossible without an ELN. Often, painful lessons are learned at the completion of a project. How do you ensure that you have received all your materials and IP for work that has been performed outside of a tracking ELN? Obviously, the larger the organization, the more complex the challenges become from a project diversity and involvement perspective. This is where the high-level processes that we discussed before come in, such as Request Management, Sample Management, Test/Experiment Management, Analysis, and Reporting. You want your scientists to work in an integrated environment that delivers a seamless experience, allowing them to do more science and less data and process wrangling.
Dealing with these processes in a biology software world is probably the most underserved topic in Bioinformatics. The consortium which formulated the BioCompute paradigm defined it like this in 2014:
“It was decided that the BioCompute paradigm would be in the form of digital ‘lab notebooks’ which allow for the reproducibility, replication, review, and reuse, of bioinformatics protocols. This was proposed to enable greater continuity within a research group over the course of normal personnel flux while furthering the exchange of ideas between groups.” https://en.wikipedia.org/wiki/Bioinformatics
Not only do we need digital lab notebooks in Biology and in Bioinformatics, but we also need a platform allowing us to bring every aspect of life science research together.
STOP THE INSANITY!
At the risk of offending, we are going to generalize here. There are certainly many individuals and organizations that have done an awesome job bringing biology into the modern age. Unfortunately, too many have not. There has been a big miss when the rigorous mandates of enterprise scientific informatics collide with the looseness of biology data and process capture. Academic vs. Industry thinking is definitely one significant factor that we run into repeatedly. Biological entity registration is a particular experience that comes to mind. Many former biology-oriented clients/customers used to push back believing change was not necessary, but ten years later it is now recognized as a foundational solution in large molecule drug and therapy discovery as well as in assay development and other functional areas.
We will boil it down for you: It takes culture, strategy, and strong leadership to embark on the BioELN journey. When done right the ROI can significantly move the needle on the financial performance of large biotech and biopharmaceutical companies. On the flip side, the cost of R&D inefficiency and the consequences of risk management failures can become existential threats. We liken it to driving your car without a license and insurance. Sure, you can do it, but the consequences can be dire.
As we have stated numerous times, this is a journey, not a closed-end project that gets delivered and is then done. Yes, some BioELN projects embarked on years ago met with mixed success. But these provide important lessons that must be studied. In our experience, the factors that caused negative or undesirable results include, but are not limited to:
1. A solution that was more concerned with IP than functionality and capabilities.
2. Problematic Technology stacks.
3. Paper-on-glass solutions that never got beyond initial implementation.
4. Missing integrations that were absolutely essential.
5. A culture that does not appreciate data as a durable asset.
6. Solutions that lacked enterprise searching and FAIR data compliance.
7. Solution intuitiveness and manual approaches were not adopted.
Learning from this, today’s BioELNs can handle:
1. The full array of team personas
2. Automation
3. Collaboration
4. Biological workflows, and scientific business processes
5. High-performance computing and hyper scalable, cloud-based arrays of microservices
6. More seamless integrations, still an issue with many of the bespoke approaches and tools.
7. The next generation ELNs are handling the data contextualization issues.
IMPLEMENTATION: A PHASED APPROACH
Now that you have started to think about a new BioELN, don’t neglect to review factors that can derail your project. These include:
1. Improper or insufficient attention to change management challenges
2. Missing “Journey” mentality
3. Data standards, ontologies, taxonomies, data dictionaries that are missing at the time of rollout. (This requires serious prework; do not skimp, as it will cost you.)
4. Big bang approach versus paper-on-glass (phase 0), phase 1, phase 2, phase n
5. Process harmonization and optimization
6. Dissension from the strategy. (After all, it is easier for many to seek the path of least resistance, but the outcomes could be dismal for the organization.)
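As an illustration of why data dictionaries need to exist before rollout, the following minimal Python sketch checks a record against a small controlled vocabulary. The dictionary contents, the field names, and the `validate_record` helper are all hypothetical and serve only to show the idea; a real deployment would draw its terms from the organization's agreed standards and ontologies.

```python
# Hypothetical data dictionary: field name -> set of allowed (controlled) terms.
DATA_DICTIONARY = {
    "assay_type": {"HTS", "HCS", "SPR", "DMPK"},
    "species": {"human", "mouse", "rat"},
}


def validate_record(record: dict) -> list:
    """Return human-readable problems; an empty list means the record conforms."""
    problems = []
    for field_name, allowed in DATA_DICTIONARY.items():
        value = record.get(field_name)
        if value is None:
            problems.append(f"{field_name}: missing")
        elif value not in allowed:
            problems.append(f"{field_name}: '{value}' is not a controlled term")
    return problems


# "Mouse" differs from the controlled term "mouse" only by case, yet it is flagged.
print(validate_record({"assay_type": "HTS", "species": "Mouse"}))
```

Without an agreed dictionary at rollout there is nothing to validate against, and variants such as "Mouse" and "mouse" quietly fragment later searches and analyses.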
THE NEXT GENERATION BioELNS
Fortunately, we already authored this article in early 2020, Visioneering the Next Generation ELN, which you can access here: https://www.20visioneers15.com/blog/f/visioneering-the-next-generation-eln
Out-of-the-box capabilities that meet your needs are a great thing. Having a platform approach that reduces the burden of configuration and new science and workflows is also highly desirable.
The Next Generation BioELN is scientifically aware! It’s collaborative. In silico approaches are directly connected with in vitro experiments and in vivo studies. Underlying data structures are life objects in a graph, searchable and ready to be analyzed at any future time by scientists using an intuitive, no-code user interface. Workflows in the lab are documented on the fly and form the basis for a digital twin that uses AI to recognize patterns and build models, not only in single experiments but across entire campaigns. This ultimately leads to the evolution into robotic cloud labs where experiments are performed entirely under software control. (See The Cloud Lab Revolution, which you can access here: https://www.20visioneers15.com/blog/f/the-cloud-lab-revolution)
CONCLUSION
As New Modalities in NME discovery, multi-omics, microbiomes, and precision science become more pervasive in the multitude of R&D verticals, so will the need to have a highly functioning BioELN. The level of detail and complexity that needs to be captured is critical to driving understanding and discovery.
Scientists must fully capture their scientific method, and a willingness and ability to do so should be a prerequisite for employment. On that note, the tools must be intuitive, flexible, and highly configurable. The scientist must be able to leverage the power of the tool and not be forced to conform to it unnaturally, which slows adoption. Nine times out of ten the major challenges will be related to culture and change management problems, not technology problems. If done properly you will also solve numerous existing data and process problems that plague too many organizations.
The BioELN is a foundational need in a data-driven R&D organization. It will ensure your scientists’ scientific method is captured, data and process standards are used, collaboration challenges are met, and reproducibility and replication of experimentation and testing are seamless. Excuses for not adopting one are most likely based on bad or missing data.