It is well known that, thanks to digital technology, there is more data representing human behaviour than ever before. Yet the availability of vast amounts of data does not necessarily make it more accessible or comprehensible. For instance, Portland, a firm that tracks technology use, has presented the results of its research in a chart that appears in the image on this post and on this datablog by Simon Rogers of the Guardian. The chart presents data on the volume of Twitter messages sent within African countries. The image may look neat, but I am certain that the chart tells the casual observer virtually nothing, except that it was prepared by a person who does not understand the data that was collected.
To start with, there is a diversity of countries in Africa and so presentation of the absolute numbers is useful. However, there are a number of schoolboy errors that emerge from that presentation. First, placing the data side by side invites comparisons among countries and creates the ranking that the chart's developers display. This ranking is not valid because of the differences in population among these countries. Second, given the failure to account for population differences that are truly vast, the data cannot provide the information that the heading of the chart purports to give. This second error may not be the fault of the data collectors, but I would have expected the Guardian's data editors to spot it. The volume of messages in each country is certainly produced by a different number of people.
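To make the population point concrete, here is a minimal sketch in Python. The two countries and all of their figures are hypothetical, invented purely for illustration and not taken from the Portland chart, and the helper names `tweets_per_thousand` and `rank` are my own. Population is only one possible denominator; the number of active Twitter users in each country would arguably be a better one if it were available.

```python
def tweets_per_thousand(tweets, population):
    """Return the number of tweets per 1,000 residents."""
    return 1000 * tweets / population


def rank(values):
    """Return the country names ordered from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


# Hypothetical figures for illustration only: (tweets, population).
volumes = {
    "Country A": (5_000_000, 50_000_000),
    "Country B": (1_000_000, 2_000_000),
}

# Ranking as the chart does it: by absolute volume.
absolute = {name: tweets for name, (tweets, _) in volumes.items()}

# Ranking after adjusting for population.
per_capita = {
    name: tweets_per_thousand(tweets, population)
    for name, (tweets, population) in volumes.items()
}

print(rank(absolute))    # Country A comes first on raw volume
print(rank(per_capita))  # Country B comes first once population is considered
```

With these made-up numbers, Country A sends five times as many messages as Country B, yet Country B sends five times as many per resident. The ranking flips entirely, which is why a side-by-side display of absolute volumes cannot support a ranking.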