The Risk Data Open Standard (RDOS) Steering Committee, which guides and oversees the standard, was busy in 2019 shaping the RDOS so that the risk industry can simplify risk management and improve risk data portability. The RDOS has come a long way and is now proudly an open standard.
Much has been published already about what the RDOS is. So, in this post I won’t focus on the mechanics of the RDOS. Instead, I’ll focus on why and how the RDOS is open, and how it enables anyone in the industry to use and contribute to the standard.
In case you are not familiar with this new standard, I will quickly introduce the RDOS. The majority of the post, however, dives into the “open” part of the RDOS and unpacks how we are reusing lessons from existing open source and open standards projects that have successfully created international and industry-wide collaborations.
So, let’s get started.
What is the RDOS (Risk Data Open Standard)?
The RDOS is a data standard designed to accommodate the breadth and complexity of modern risk management and to also increase risk data portability. Risk data acts as the fuel for the risk industry, critical to power all stages of the risk management life cycle (see below). For the RDOS to be regarded as an effective “fuel”, it needs to flow friction-free – the risk contained within it needs to be simple to understand, exchange, share and integrate across and in-between risk-focused organizations. The communication of risk through risk data is what the RDOS is all about.
Risk data isn’t new, and there are existing popular formats out there in the wild. We know the RDOS needs to welcome all the existing data contained in these formats, so the RDOS is backwards compatible with formats such as the EDM (Exposure Data Model) and the RDM (Results Data Model). That means that if you have an EDM or an RDM, you can easily convert it to the RDOS format. As the standard evolves, the RDOS aims to be compatible with EDM alternatives as well.
Why create a new standard if others already exist? Compared to the EDM, RDM, CEDE, OED and other similar standards, the RDOS is superior because:
1. It is complete – and can capture all aspects of exposure through analysis. While current standards are incomplete, the RDOS provides an unambiguous picture of risk, eliminating the need for forensic analysis to understand risk data, freeing up valuable analyst time and reducing misunderstanding or errors.
2. It is flexible – every aspect of the information it captures (exposure, contracts, structure, settings, results) can be extended to accommodate new modeling algorithms, new lines of business or classes of risk, new types of contracts, and virtually any business structure.
3. It is not linked to a specific technology – current formats rely on SQL, which has inherent limitations. The RDOS can be implemented using many data technologies, including object-relational storage, increasing the processing power and the types of data that can be handled in the format.
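To make the third point concrete, here is a minimal sketch of what “not linked to a specific technology” can look like in practice. The `ExposureRecord` class and its four fields are hypothetical and invented purely for illustration; they are not the real RDOS schema. The sketch only shows that one logical record can round-trip through two different storage technologies (a JSON document and a SQL table) without losing information.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class ExposureRecord:
    """Hypothetical, simplified exposure record (NOT the real RDOS schema)."""
    asset_id: str
    country: str
    total_insured_value: float
    currency: str


def to_json(record: ExposureRecord) -> str:
    """Serialize the record as a JSON document."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> ExposureRecord:
    """Rebuild the record from its JSON document."""
    return ExposureRecord(**json.loads(text))


def save_sql(conn: sqlite3.Connection, record: ExposureRecord) -> None:
    """Store the same record in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS exposure ("
        "asset_id TEXT PRIMARY KEY, country TEXT, "
        "total_insured_value REAL, currency TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO exposure VALUES "
        "(:asset_id, :country, :total_insured_value, :currency)",
        asdict(record),
    )


def load_sql(conn: sqlite3.Connection, asset_id: str) -> ExposureRecord:
    """Read the record back from the relational table."""
    row = conn.execute(
        "SELECT asset_id, country, total_insured_value, currency "
        "FROM exposure WHERE asset_id = ?",
        (asset_id,),
    ).fetchone()
    return ExposureRecord(*row)


if __name__ == "__main__":
    record = ExposureRecord("A-001", "US", 2500000.0, "USD")
    conn = sqlite3.connect(":memory:")
    save_sql(conn, record)
    # Both storage technologies return the same logical record.
    print(load_sql(conn, "A-001") == from_json(to_json(record)) == record)
```

The design choice illustrated here is that the record definition lives apart from any one storage engine, so each backend is just an adapter around the same logical model.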
All of these aspects make the standard future-proof and open up new possibilities that no other current format could support. The RDOS is open for anyone to use without fees. It is designed to accommodate risk data exchange between any parties that need it, whether they are internal risk teams or external risk-focused enterprises such as insurers, reinsurers and risk investors such as insurance-linked securities providers.
The RDOS is designed to ensure that risk-focused companies such as insurance participants and the larger financial services industry that transact on risk will minimize waste and maximize opportunity.
There is much more detailed material available on the RDOS and the work of the Steering Committee. You can find it on www.riskdataobject.com or on https://github.com/RMS-open-standards/RDOS
Why is the RDOS “Open”?
The RDOS is open because open source and open standard projects are the best way to unlock new collaboration among industry participants. This “open” approach has been proven to work in many fields, especially when the problem domain is:
- Engineering a standard data exchange (HTML, JSON, XML)
- Developing data processing and analytics software (Hadoop, Spark, Presto)
- Machine Learning (ML) and Artificial Intelligence (AI) algorithms (TensorFlow, scikit-learn, MXNet, PyTorch)
- Operating systems (Linux-flavored OSes, Android)
The projects listed above power much of the new development in data-driven companies.
Risk management is one of the sectors with the greatest need for collaboration, and it has many international industry participants. Standardizing risk data is one of the most important data engineering challenges that remains unsolved. The RDOS aims to address this problem by making the standard available for anyone to use, with a clear license that permits use, protects users and encourages collaboration.
Even though the RDOS is open today, we know this is just the beginning. The RDOS needs to evolve and adapt to the needs of its users. Creating a “living” open standard requires setting up the right incentives to collect feedback and contributions. For that, the RDOS adopts the best practices of the most successful open collaboration projects.
Others at RMS and I have had the pleasure of contributing to, running and growing several open standards and open source projects created by companies like Microsoft, Cloudera, Couchbase, Redis Labs and Databricks over the last decade. Open source has an amazing energizing effect. Collectively, we strongly believe that the “open” approach can bring efficiency to risk-focused industries, where collaboration is critical.
How is the RDOS “Open”?
It is important to note that there are many approaches to “open source” and “open standards”. There are three main bodies in the world that lead the open source field: the CNCF (Cloud Native Computing Foundation), the Linux Foundation and the ASF (Apache Software Foundation). These organizations are responsible for the “open” wave that creates and advances thousands of projects. With numbers this large, many of the projects will fizzle out. However, the ones that succeed have a few things in common.
First, successful open projects are typically created by a commercial entity suffering from a problem common to many. This entity generally continues to maintain the open project. There are hundreds of examples to choose from here, but let’s look at some of the best-known successful open source projects:
- Kubernetes: Kubernetes was created by Google. This technology is the de facto standard for orchestrating containers in cloud deployments. At the beginning of 2020, the Kubernetes project had 2,400 contributors with almost 87,000 changes contributed to the project. Today, the biggest contribution to the Kubernetes project continues to come from Google. All this is very easy to observe on GitHub: just follow the link to https://github.com/kubernetes/kubernetes
- Spark: Spark was initiated at UC Berkeley’s AMPLab, and its creators went on to found Databricks. It is widely used for big data analytics and for building ML and AI models. This is one of the projects I have personally worked on over the past years. Spark has over 1,400 contributors, but over 80 percent of the contributions come from Databricks. You can continue to look at other examples: Hadoop created by Yahoo, Presto created by Facebook, Kafka created by LinkedIn and many others follow the same logic.
Second, these projects pick the right licensing approach, one that allows for a level playing field. The licensing ensures open access and equal rights of use for anyone who wants to use the project or contribute to it. Many of the examples above use the Apache version 2.0 license for this reason. That’s why we have picked the Apache version 2.0 license for the RDOS.
Why the “Apache version 2.0” License for the RDOS?
One of the most important goals for the RDOS was to keep it actively evolving. The good news is that the open source industry has shown it is possible to create a living project: Linux and Hadoop are over a decade old and are still active, with hundreds of contributions a year. The other examples above, like Kubernetes, Spark, Presto, Kafka and many others, have fostered innovation among participants in the last few years using well-tuned and tested open source licenses.
It is true that there are many alternative open licenses to pick from. Apache version 2.0 isn’t the only license available to us. However, Apache version 2.0 has proven over and over again to be robust for shared innovation projects. The Apache version 2.0 license has been around for over 15 years and has allowed long-lived free use and continuous collaboration, with easy-to-understand terms that keep intellectual property “open.” In fact, the examples detailed above all use Apache version 2.0 as their license. This is why we are licensing the RDOS under the Apache version 2.0 license.
You can easily compare open licenses. Wikipedia can help compare options – https://en.m.wikipedia.org/wiki/Comparison_of_free_and_open-source_software_licenses. There is a vast amount of material and opinion on these licenses, and in our view Apache version 2.0 is among the fairest, leveling the playing field for users and contributors alike. It is the fastest-growing license of choice for open source projects around the planet, according to WhiteSource’s measurements. Apache version 2.0 empowers all users and contributors, and Apache-licensed projects today attract worldwide contributions from large countries and regions such as China, India, North America and Europe.
These Apache version 2.0 projects are all free to use, yet they are all very well maintained. The following figure shows Hadoop’s contributions over recent years. It has sustained development every year with thousands of additions and changes. How is this possible? You can point to licensing as one of the most important factors here.
All of these projects also continue to thrive because many vendors, even competing ones, continue to contribute to them. These Apache-licensed projects have successfully attracted contributions from competitors like Microsoft and Google. None of these are small achievements, and the industry has benefited immensely from these activities.
We have also chosen GitHub to host the RDOS. GitHub itself is based on an open source project (Git) and is the most popular destination for open source projects.
I hope all this helps showcase how committed and energized we are about this new standard. If you have any questions, you can always reach out to us through www.riskdataos.org.