Just as data brokers have been incredibly important in tech, I believe we are at the nascent stage of data brokerage in techbio. With increased algorithmic sophistication in biology, it’s important to have sources of data to continue to train and refine models.
Data brokers such as Acxiom, Experian, Epsilon, and Oracle Data Cloud have played an important role in the tech industry since the 1990s. They have allowed for more tailored algorithms and much higher wealth creation than would have been achieved by companies using internal data alone. For biology, the data broker industry is in its infancy. As with tech data brokers, techbio data brokers can use their access to data pools not only to build a successful business but also to enable much better therapeutic outcomes. This is starting to happen at companies such as Watershed.ai, which provides access to publicly available data. Although this is important, there’s much more data that could have utility and ultimately save lives. Data sources that are harder to access and integrate include those from pharma, techbio, biotech, hospitals, and diagnostic companies. Companies that are able to pool datasets from these players and sell them will be able to create tremendous value and launch their own drug discovery initiatives.
Hospitals and diagnostic companies will likely be the players most open to sharing data. It will be absolutely essential that the entries are anonymized and HIPAA compliant. Success will depend on go-to-market (GTM) strategy, so companies here will need to decide which type of data is most valuable, how to pool millions of samples, and how to make the data easily interoperable. Talent constraints will center on finding people who understand database infrastructure, healthcare data and regulation, and data brokerage.
Data Lake - secure, transparent and value-driven database of international medical data donors

Hyper Unison - genomics data for better, faster, cheaper drug discovery

Watershed.ai - bioinformatics workflows with established integration with public databases; not a true data broker

Sano Genetics - connecting patients with clinical trials while offering DNA tests which are shared with the company

Nebula Bio - DNA sequencing and privacy company

Weavechain - HIPAA and GDPR compliant web3 data sharing