Thursday, October 14, 2010

Whatever happened to Swivel?

Swivel was a popular data visualization site that recently failed as a business. Some information is starting to trickle out about what happened, through an interview with its founders. Some choice quotes. First, expenditure:
We got a Round A funding of two million dollars. By the time I left, we had spent about three to four million dollars on the idea.
Second, income:
[How many paying users did you have in the end?] It was single digits
Ouch. Lessons learned? I'm cherry-picking my favorite here:
I think what we learned, like Roseman is saying, that the interface is not that important, that there are analysts who are really good at tools like R, SAS, etc. and prefer to continue to work in those tools to do powerful things with datasets.
Please read the article for the full context. I just wanted to highlight this point, which I think may be important for the Data Commons Project. I think it argues for letting people maintain their data with the tools they are expert with, rather than expecting them to use a new service. My personal theory is that distributed revision control systems are the way to go for collaborative data projects. Web-based services could be part of such a distributed ecosystem, but would not be its "hub".

Tuesday, October 12, 2010

Facebook gives users access to their own data

Good news from Facebook: users can now download everything they've ever put on the site. This is an important step towards transparency. The move is being welcomed by the DataPortability project, whose mission is to help people to use and protect the data they create on networked services - although they are careful to note that being able to download one's data is not the same as being able to control it (read Alisa Leonard).

Monday, October 11, 2010

FACT Social Justice Challenge 2010

Voting for the FACT Social Justice Challenge 2010 is now open on NetSquared. The goal of the challenge is to use web or mobile technologies to foster collaboration around social justice issues. The DCP has a proposal in there, which we invite you to vote for:
We propose building a collaborative directory of the rooted economy.

Our goal is to develop a democratically controlled, accurate, comprehensive, publicly searchable and updatable database of cooperative economic initiatives in North America. Rather than building just another website, we're experimenting with doing so in a (de-geekified) version of how programmers develop the open-source Linux code.

A vibrant movement for a cooperative and democratic economy is growing in North America. Offering innovative and effective alternatives to the “business as usual” of the distant, unaccountable corporations that dominate our economic, social and political landscape, cooperative enterprises and the organizations that support them are building an economy that truly works for people and the planet. But these efforts are often fragmented by geography, sector, and even organizational form. To succeed in changing the economic status quo, people need to find each other, help each other, work together, and be counted.

We're working on several fronts to make this happen. We're compiling a list of directories of the rooted economy (see and let us know about more to add, big and small). We are on track to create a Data Commons Cooperative to foster collaboration in the long term. We are developing free and open source software to show directories and make them editable, wiki-style (see, source code at - this directory will be used by Cooperative Maine, the California Center for Cooperative Development, and the Regional Index of Cooperation). In the course of our work, we've found that plenty of small and large groups are willing to share their data, but that right now it costs a lot more (in time) to do so than it ought to. People like to keep their data in spreadsheets or databases on their computers, and many are reluctant to move those online (e.g. to services like Google Docs). This is understandable, since it is most comfortable to edit data this way, and databases often have fields that should not be made public or stored externally. But that means sharing data can be a tedious and messy business of filtering, exporting, emailing, hand-merging, etc. Another sore spot is making updates to information: any update made in one spot filters slowly or not at all to other locations, and takes a good deal of effort to get there. We want to create a free and open source tool to speed up the process of sharing data and keeping it up to date.
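To make the filtering step concrete, here is a minimal sketch in Python of exporting only the shareable columns of a directory before passing it on. It is an illustration only, not part of the software described above: the field names, the `PUBLIC_FIELDS` list, and the `export_public` helper are all hypothetical.

```python
import csv
import io

# Hypothetical column names; a real directory would define its own list
# of fields that are safe to share.
PUBLIC_FIELDS = ["name", "city", "sector", "website"]


def export_public(rows, public_fields=PUBLIC_FIELDS):
    """Return CSV text containing only the fields listed as public."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=public_fields, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Copy only the public fields; private ones never reach the output.
        writer.writerow({field: row.get(field, "") for field in public_fields})
    return out.getvalue()


records = [
    {
        "name": "Example Food Co-op",
        "city": "Portland",
        "sector": "grocery",
        "website": "example.org",
        "contact_phone": "555-0100",  # private: stays on the owner's computer
        "notes": "internal only",     # private: stays on the owner's computer
    },
]

print(export_public(records))
```

The owner keeps editing the full spreadsheet or database as before; only the filtered export is shared.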

As it happens, a lot of the pieces are already in place. Programmers have created awesome tools for collaborating in comfort among themselves, which we believe could be just as useful for streamlined knowledge sharing. What is needed now is a small, committed team who can bring “distributed revision control systems” to our communities in a useful form.

We've already developed a free and open-source program for merging tabular data, ssmerge. There aren't many such programs out there, and we believe ours outperforms them. We are integrating that with the fossil system for managing shared repositories. Fossil is free and open source, by the makers of SQLite, a database used in Firefox, Skype, Apple Macs, many smart-phones, and plenty of websites. Fossil is uniquely lightweight and dependency-free, making it a pleasure to adapt to new uses. There's a first prototype of this work at (with links to documentation and download for Linux, Windows, or Macs). We believe it could be extended to become a great tool for easily sharing parts of spreadsheets and databases, with pain-free two-way flow of updates and additions between collaborators. That would really change the culture, just like similar tools have revolutionized the software world in favor of openness and distributed collaboration. We hope FACT will support us in bringing this change to life. For programmers, distributed revision control systems such as "git" and "bazaar" have proven their feasibility, and can scale from the smallest personal project to the largest (e.g. the Linux kernel). The Data Commons Project has data-sharing partners of all sizes, so we have a strong need for a similarly flexible tool. We suspect others have this need too. Our practice and commitment is to release all our code as free and open source, in ready-to-reuse form. If successful, the tools we build will have no need for central coordination, so there will be nothing to stop our work being used in any country, by any community, without needing to talk to us.
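For readers unfamiliar with the idea, here is a rough Python sketch of what a "two-way flow of updates" involves: a three-way merge that compares two edited copies of a table against their common ancestor, cell by cell. This is not ssmerge's actual algorithm, which is not described here. The function names and the table layout (a dictionary of rows keyed by an ID) are assumptions made for illustration.

```python
def merge_cell(base, ours, theirs):
    """Three-way merge of one value. Returns (value, is_conflict)."""
    if ours == theirs:
        return ours, False
    if ours == base:
        return theirs, False  # only their copy changed
    if theirs == base:
        return ours, False    # only our copy changed
    return ours, True         # both changed differently: keep ours, flag it


def merge_tables(base, ours, theirs):
    """Merge tables shaped as {row_key: {column: value}}.

    Returns (merged, conflicts); conflicts holds (row_key, column) pairs,
    with column set to None when a whole row is in dispute.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o is None or t is None:
            # Row added or deleted on at least one side: treat it as a unit.
            row, conflict = merge_cell(b, o, t)
            if conflict:
                conflicts.append((key, None))
            if row is not None:
                merged[key] = dict(row)
            continue
        b = b or {}
        row = {}
        for col in sorted(set(b) | set(o) | set(t)):
            value, conflict = merge_cell(b.get(col), o.get(col), t.get(col))
            if conflict:
                conflicts.append((key, col))
            if value is not None:
                row[col] = value
        merged[key] = row
    return merged, conflicts


base = {"1": {"name": "Acme Co-op", "city": "Bangor"}}
ours = {"1": {"name": "Acme Co-op", "city": "Portland"}}
theirs = {
    "1": {"name": "Acme Cooperative", "city": "Bangor"},
    "3": {"name": "Gamma Co-op", "city": "Orono"},
}
merged, conflicts = merge_tables(base, ours, theirs)
print(merged, conflicts)
```

In the example, one collaborator corrects a city while another corrects a name and adds a row; the merge keeps all three changes and reports no conflicts. A conflict is only reported when both sides change the same cell to different values.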

Here are some concrete examples of who we are working to benefit. This work will benefit large umbrella organizations such as NCBA (National Cooperative Business Association) and SEN (US Solidarity Economy Network). They will be able to systematically pull data (with filtering) from small, active networks and pool data with other large peer organizations with overlapping fields of interest. Just as importantly, the work will magnify the impact of data collection activities of smaller networks, such as Cooperative Maine, that focus on specific geographies, sectors, and/or organizational forms. The data-sharing tools and repository we build will assist collaboration between those networks and the overlapping umbrella organizations that represent their interests.

In general, through the proposed work and our other efforts, here's the "status quo" we're trying to improve on (quote anonymized on request):

“For [our co-op directory] we used a database that [a large co-op] had used previously and we added minimal additional data to it—I’m not sure what its origins are, however. I found that [a large sectoral umbrella group] was unable to provide a database of their members due to agreements with their members not to share data and [another large sectoral umbrella group] does not share [member organization] data because changes occur so frequently that the database would quickly become outdated. Those two entities hold a good chunk of the database info I had hoped to collect. The smaller cooperative organizations often do not have a well organized database—I got a [small co-op sector] list in a Word doc. So, those are some of the challenges I faced. That said, I think there are many in the cooperative community who would love to see a comprehensive cooperative database come to fruition.” – a co-op data aggregator

We are excited to find ways to systematically "hold a mirror" to bottom-up community driven organizations of all kinds, so that they become visible individually and in aggregate to their peers, to researchers, and to civic institutions.

Want to help? Vote here before October 15th:


CoopMetrics is a nifty service for benchmarking a cooperative against its peers, and finding ways to improve. By pooling and comparing financial data, cooperatives in a particular sector can find out what works and what doesn't, and good ideas spread faster.
  • Details on the process: In summary, a co-op's accounts are mapped onto a standardized chart of accounts, quarterly trial balances are uploaded, and CoopMetrics crunches the data to provide various reports and comparisons. There are videos that give a sense of the steps involved.
  • History: CoopMetrics traces its lineage back to 1996, with the CoCoFiSt ("Common Cooperative Financial Statements") program developed by Walden Swanson and Kate Sumberg.
This is a great example of organizations gaining an advantage by pooling information, and extracting insight that would otherwise be elusive.
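As a rough illustration of the process summarized above, here is a Python sketch that maps one co-op's accounts onto a shared chart, totals a trial balance, and compares a ratio against the peer median. It is not CoopMetrics' actual method: the account names, the `ACCOUNT_MAP`, the labor-to-revenue ratio, and all the figures are made up for the example.

```python
from statistics import median

# Hypothetical mapping from one co-op's own account names to a shared,
# standardized chart of accounts.
ACCOUNT_MAP = {
    "Sales - Grocery": "revenue",
    "Sales - Deli": "revenue",
    "Wages": "labor",
    "Payroll taxes": "labor",
}


def standardize(trial_balance, account_map):
    """Total a trial balance under the standardized account names."""
    totals = {}
    for account, amount in trial_balance.items():
        standard = account_map.get(account)
        if standard is None:
            continue  # unmapped accounts are simply skipped in this sketch
        totals[standard] = totals.get(standard, 0.0) + amount
    return totals


def labor_ratio(totals):
    """Labor cost as a fraction of revenue."""
    return totals["labor"] / totals["revenue"]


def gap_to_peers(own_ratio, peer_ratios):
    """Difference between this co-op's ratio and the peer median."""
    return own_ratio - median(peer_ratios)


# Made-up quarterly trial balance and peer ratios.
quarter = {
    "Sales - Grocery": 80000.0,
    "Sales - Deli": 20000.0,
    "Wages": 22000.0,
    "Payroll taxes": 3000.0,
    "Misc": 500.0,
}
totals = standardize(quarter, ACCOUNT_MAP)
own = labor_ratio(totals)
gap = gap_to_peers(own, [0.20, 0.22, 0.24])
print(totals, own, gap)
```

The point of the shared chart is that ratios become comparable across co-ops even though each one names its accounts differently.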

Hat tip: Jim Johnson

Friday, October 1, 2010

Happy October 1!

October is Co-op Month, with lots of fun Co-opy stuff going on. If all goes on schedule, by this time next year, the Data Commons Cooperative should be emerging from a warm nest of feasibility studies and market analysis to spread its wings and take flight. The cooperative is intended as a way to sustain our work long term, but we'll be excited to work with principled organizations of all kinds.

Happy October!