Client:
Coin Metrics
Challenge:
Serve an uninterrupted stream of real-time data with low latency and high levels of data quality
Solution:
Coin Metrics integrated with CME Group Data on Google Cloud Platform (GCP)
Overview
Coin Metrics is the leading provider of crypto financial intelligence, offering network data, market data, indexes, and network risk solutions to the most prestigious institutions touching cryptoassets. The company was founded in 2017 as an open-source project to determine the economic significance of public blockchains.
Today, Coin Metrics serves an institutional user base with data that enables multiple use cases, including research, quantitative trading systems, order management systems, display on customer-facing applications, back-office trade settlement and accounting, and settling financial products.
These use cases require Coin Metrics to serve an uninterrupted stream of real-time data with low latency and high levels of data quality.
As institutional interest in the cryptocurrency industry continues to grow, there has been increased demand for data on CME Group’s Bitcoin and Ethereum futures and options instruments. To meet the data requirements of these users, Coin Metrics integrated with CME Group Data on Google Cloud Platform (GCP) to collect market data about CME Group’s cryptocurrency derivative products. Coin Metrics consumes several of the datasets published by CME Group to the GCP Pub/Sub service, including time and sales data and various order book datasets.
Coin Metrics collects data from a variety of sources using several collection methods. Compared with some traditional approaches, integrating with CME Group Data was straightforward, and GCP’s Pub/Sub lets Coin Metrics deliver an uninterrupted stream of low-latency data to end users.
Issues with traditional implementation
Centralized exchanges in the cryptocurrency domain have largely converged on serving real-time streaming data through a WebSocket API. Coin Metrics relies heavily on these APIs to collect multiple data types from exchanges and serve data to our users with low latency. Due to limitations in this protocol and occasional instability in an exchange’s systems during times of market stress, disconnects in the WebSocket connection are common and expected. Any disconnect or restart in any intermediate component between Coin Metrics and the exchange’s server can have an impact on the stability of the connection.
This poses a problem for Coin Metrics because we require the ability to serve an uninterrupted stream of data to our users and store every single observation so that our historical dataset is complete and not missing any observations. To address the limitations associated with relying on WebSocket connections and to ensure high levels of service to our users, Coin Metrics has engineered a market data collection system with high levels of redundancy and resiliency.
In addition to standard reconnect logic in our WebSocket API feed handlers, Coin Metrics collects certain data types from an exchange’s HTTP API and WebSocket API simultaneously as an added redundancy measure. Each server that hosts our market data collection system has local database storage as a fault-tolerance measure in case our primary database fails. To keep our market data collection applications highly available and ensure that no observations are missed, we run multiple instances of each application across geographically separated, vendor-independent data centers. Parallel pipelines of data collection and data storage components run within each data center and are resilient to interruptions in individual components. In addition, our data delivery systems contain intelligent failover logic that always routes requests to healthy components in our infrastructure.
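The redundancy described above implies that the same observation can arrive more than once, for example from both the WebSocket and HTTP collectors. The following is a minimal sketch of how such overlapping feeds might be merged, assuming each trade carries an exchange name and an exchange-assigned trade ID. The `Trade` fields and the `merge_redundant_feeds` function are hypothetical names used for illustration, not Coin Metrics’ actual implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Trade:
    # Hypothetical fields chosen for illustration only.
    exchange: str
    trade_id: str
    price: float
    amount: float


def merge_redundant_feeds(*feeds: Iterable[Trade]) -> Iterator[Trade]:
    """Yield each trade once, even if several collectors captured it."""
    seen = set()  # (exchange, trade_id) keys already emitted
    for feed in feeds:
        for trade in feed:
            key = (trade.exchange, trade.trade_id)
            if key in seen:
                continue  # already captured by another collector
            seen.add(key)
            yield trade


# The WebSocket feed missed trade "2" during a disconnect;
# the HTTP feed fills the gap.
websocket_feed = [Trade("ex", "1", 10.0, 1.0), Trade("ex", "3", 10.2, 0.5)]
http_feed = [
    Trade("ex", "1", 10.0, 1.0),
    Trade("ex", "2", 10.1, 2.0),
    Trade("ex", "3", 10.2, 0.5),
]
merged = list(merge_redundant_feeds(websocket_feed, http_feed))
```

In this sketch the merged result contains all three trades exactly once. A production system would also need to bound the size of the `seen` set and restore time ordering, which are omitted here.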
Coin Metrics strives to collect complete datasets in real-time with no missing observations, and to do this the company has expended considerable engineering effort in building and maintaining this system.
Advantages of CME Group’s cloud solution
CME Group Data on Google Cloud Platform is a scalable solution that delivers reliable real-time and delayed market data using Google’s cloud-native technology. With fast on-boarding and an affordable pay-as-you-go pricing model, CME Group Data on GCP provides a low barrier to entry and quick time to value.
Coin Metrics greatly benefits from the message retention properties of CME Group Data on GCP Pub/Sub. Coin Metrics subscribers consume messages from CME Group Data on GCP topics and send an acknowledgement request when a message is received. Google Pub/Sub retains all messages in CME Group Data on GCP topics for a period of seven days and will attempt to resend unacknowledged messages after a short delay. This functionality simplifies the engineering requirements for Coin Metrics and is compatible with Coin Metrics’ requirement for storing complete datasets in real-time with no missing observations. In contrast, Coin Metrics expended considerable engineering efforts to achieve similar levels of completeness for other data sources.
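The acknowledge-and-redeliver behavior described above can be illustrated with a small simulation. This is a sketch under stated assumptions: `InMemorySubscription` is a toy stand-in written for this example, not the Google Cloud client library, and the message IDs and payloads are invented. The point it shows is that a consumer which acknowledges only after a successful write does not lose a message when processing fails, because the unacknowledged message is delivered again.

```python
from collections import deque


class InMemorySubscription:
    """Toy stand-in for a subscription with at-least-once delivery.

    This is NOT the Google Cloud client library; it only mimics the
    behaviour described in the text: a message stays pending, and is
    delivered again, until the consumer acknowledges it.
    """

    def __init__(self, messages):
        self._pending = deque(messages)  # (message_id, payload) pairs

    def pull(self):
        """Return the next pending message, or None when all are acked."""
        return self._pending[0] if self._pending else None

    def ack(self, message_id):
        """Drop the message once the consumer confirms it was handled."""
        if self._pending and self._pending[0][0] == message_id:
            self._pending.popleft()


def consume(subscription, store, fail_once_on=None):
    """Store every message, acknowledging only after a successful write."""
    failed = set()
    deliveries = 0
    while (message := subscription.pull()) is not None:
        message_id, payload = message
        deliveries += 1
        if message_id == fail_once_on and message_id not in failed:
            failed.add(message_id)  # simulate a crash before the ack
            continue  # no ack was sent, so the message is delivered again
        store[message_id] = payload  # idempotent write keyed by message ID
        subscription.ack(message_id)
    return deliveries


subscription = InMemorySubscription([("m1", "a"), ("m2", "b"), ("m3", "c")])
store = {}
deliveries = consume(subscription, store, fail_once_on="m2")
```

Here three messages result in four deliveries, and the store still ends up complete. Keying the write by message ID keeps a redelivered message from creating a duplicate record.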
CME Group Data on GCP allows customers to subscribe at the channel level or the per-product level. Operating on a cloud-based solution lets CME Group offer data in both SBE and JSON formats while allowing users to seamlessly integrate their data workflows with other GCP technologies, such as BigQuery.
Coin Metrics was able to fully integrate with CME Group’s cloud solution, starting from initial development of our feed handlers to serving data in a production environment in a period of approximately three weeks.
While the level of trading activity in the cryptocurrency industry is still small relative to traditional markets, trading volumes continue to grow exponentially and activity at peak trading times can place Coin Metrics’ systems under high load. To deal with this, Coin Metrics employs various sharding and load balancing techniques, each of which adds to our engineering time when connecting to new data sources. With CME Group Data on GCP Pub/Sub, multiple workers deployed by Coin Metrics can consume messages independently from a single subscription, and load balancing can be achieved by relying on GCP Pub/Sub’s per-message receipt tracking and by simply deploying more identical workers. Here the subscription properties of CME Group Data on GCP Pub/Sub allow for a simple load balancing implementation compared to other data sources.
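The load balancing pattern described above, in which identical workers share one subscription, can be sketched with a thread-safe queue standing in for the subscription. This is an illustrative model only: the `run_workers` function and worker names are hypothetical, and the real service delivers messages at least once rather than exactly once as this toy queue does.

```python
import queue
import threading


def run_workers(messages, worker_count):
    """Process messages with identical workers sharing one queue.

    The shared queue plays the role of a single subscription: in this
    toy model each message is handed to exactly one worker.
    """
    subscription = queue.Queue()
    for message in messages:
        subscription.put(message)

    handled = {}  # message -> name of the worker that processed it
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                message = subscription.get_nowait()
            except queue.Empty:
                return  # nothing left to process
            with lock:
                handled[message] = name
            subscription.task_done()

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled


# Scaling out is a matter of raising worker_count; no sharding logic
# is needed in the workers themselves.
handled = run_workers(range(100), worker_count=4)
```

Every message is processed once regardless of how many workers run, which is the property that lets capacity be added by deploying more identical workers.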
CME Group Data on GCP’s pay-as-you-go pricing model allowed Coin Metrics to experiment with prototype implementations without having to sign a long-term fixed-fee contract. Furthermore, Coin Metrics’ initial implementation only involved CME Group’s cryptocurrency derivatives, a small subset of all the data that CME Group publishes. With this pricing model, Coin Metrics only had to pay for the exact data that it consumed.
Finally, Coin Metrics utilizes all the additional benefits of being tightly integrated with Google Cloud Services, including relying on its high availability guarantees, integrations with other services, and the robust developer ecosystem.
Conclusion
CME Group Data on GCP Pub/Sub allowed Coin Metrics to rapidly implement a solution that fulfilled its requirement of collecting complete datasets with no missing observations. The engineering effort was greatly simplified by relying on GCP Pub/Sub’s subscription properties, and a fair and transparent pricing plan allowed for easy experimentation and a cost-effective implementation.
Coin Metrics serves CME Group’s cryptocurrency data under a harmonized data model that is compatible with other cryptocurrency exchanges, and Coin Metrics’ ability to do this has been critical in allowing institutions to execute on their cryptocurrency product offerings.
Coin Metrics continues to connect to additional data sources to scale its infrastructure and to improve upon its resiliency. As institutional adoption of cryptocurrencies continues, there will be increased need for high quality market data from CME Group’s cryptocurrency derivatives contracts and Coin Metrics plans to continue to collect this information as CME Group’s product offerings grow.
This material is directed only at persons who are: (i) investment professionals (as that term is defined in article 19(5) of the Financial Services and Markets Act 2000 (Financial Promotion) Order 2005 (“FPO”)), (ii) high net worth companies (as that term is defined in article 49 of the FPO) or (iii) any other persons to whom it may lawfully be communicated. Accordingly, persons who (i) do not have professional experience in matters relating to investments or (ii) are not high net worth companies, should not act or rely on this material. The financial instruments and / or services detailed in this material will only be available to high net worth companies or investment professionals (as defined above). If you are not a high net worth company or investment professional (as defined above) you cannot invest directly and are unable to gain access to the relevant financial instruments. CME GROUP DOES NOT REPRESENT THAT ANY MATERIAL OR INFORMATION CONTAINED HEREIN IS APPROPRIATE FOR USE OR PERMITTED IN ANY JURISDICTION OR COUNTRY WHERE SUCH USE OR DISTRIBUTION WOULD BE CONTRARY TO ANY APPLICABLE LAW OR REGULATION.