Applying the Opensource Ethos to Data

Episode 116

Opensource software is mainstream. But opensource data? Yeah, and it can be found in a datamart near you. The thinking is that if you make your data freely available for all to use, all will improve it. It will be made more consumable, you will get feedback on how to use it and, perhaps counterintuitively, you will learn how to value it. In this episode of the IoT Business Show, I speak with Nicolas Terpolilli about applying the opensource ethos to data within the data exchange.

Nicolas is the Chief Data Officer at OpenDataSoft, where he’s in charge of building a network based on the data shared by his customers. That gives him a great perspective on data governance from both a technical and a business point of view.

If you want to share your data, your valuable data, a natural instinct is to charge for it. After all, it’s the new oil, right? Maybe. Or maybe not. Oil is a commodity, meaning its use is independent of its source. This concept of fungibility only applies to data if, to take the metaphor further, it’s refined. Refining data means standardizing it: dealing with any missing bits, normalizing and scaling it for context and adding metadata. That’s a lot of work, but if you provide it for free, in the right environment, others will contribute to it, in a sense refine it, making it more valuable to you and anyone else who wants to use it. This opensource mentality is alive and well and starting to creep from software and hardware into the data found in certain data exchanges, data marketplaces or, my favorite, datamarts.
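To make "refining" a little more concrete, here is a minimal sketch of the three steps mentioned above: filling in missing bits, scaling, and adding metadata. It is only an illustration and was not discussed in the episode. The function name `refine`, the mean-fill strategy, the min-max scaling and the metadata fields are all assumptions chosen for the example.

```python
from statistics import mean


def refine(readings, unit, source):
    """Hypothetical helper: fill gaps, min-max scale to [0, 1], attach metadata."""
    known = [r for r in readings if r is not None]
    if not known:
        raise ValueError("no usable readings")

    # 1. Deal with missing bits: replace gaps with the mean of the known values.
    fill = mean(known)
    filled = [fill if r is None else r for r in readings]

    # 2. Normalize and scale: map values onto the 0..1 range.
    lo, hi = min(filled), max(filled)
    span = hi - lo
    scaled = [(r - lo) / span if span else 0.0 for r in filled]

    # 3. Add metadata so others know what they are looking at.
    return {
        "metadata": {
            "unit": unit,
            "source": source,
            "missing_filled": len(readings) - len(known),
        },
        "raw": filled,
        "scaled": scaled,
    }


refined = refine([10.0, None, 30.0], unit="celsius", source="sensor-a")
print(refined["scaled"])
```

In a shared setting, each of these choices (how to fill gaps, how to scale, what metadata to record) is exactly the kind of thing other users of the data might question or improve.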

Here’s What We’ll Cover in this Episode

  • How datamart usage differs in the public and private sectors
  • Why bringing different data sources together is not a zero-sum game
  • The different ways to access data in the datamart
  • Why companies post their data for free in the datamart
  • Crowdsourcing data cleanup – a time-consuming part of the ETL process
  • Logical ways to separate free from non-free data in the datamart
  • Comparing different datamart business models

Mentioned in this Episode and Other Useful Links

Support this Podcast

If you have been enjoying this podcast, there are a few ways you can support it:

  1. Share it on social by clicking on the widget on the left or bottom of the page.
  2. Open iTunes and leave a one-click review or write your thoughts.
  3. Consider becoming a Certified IoT Professional by enrolling in the ICIP online training program.


Ways to Subscribe to the IoT Business Show

Like what you hear? Subscribe to get each episode delivered to your device via iTunes, Spotify, Google Play, Stitcher Radio or RSS (non-iTunes feed).

Have an opinion? Join the discussion in our LinkedIn group.

Do you think every company should join an IoT consortium, alliance or group?