In a blogpost from September 2015, IQuantNY makes a compelling argument for company's and government agencies to release their datasets along with their findings. He writes:
- Releasing your data along with your work amplifies your messaging. It allows others to follow up on what you do, and add on to it
- More importantly, reproducibility is an important part of the scientific method. If “data science” is truly a science, it should be meeting basic tenets of the scientific method. Releasing data allows others to verify and follow up on your work.
I agree, and the chart above that tracks Uber and Lyft rides are a better argument than anything. From my experience working for a private company, data is sometimes not released because the findings are obscured for a bias towards whatever the business goals of the quarter are. You can pretty much come up with any finding you want based on your data. It's only when data becomes open sourced that companies who release it are compelled to be scientifically honest.