The automatic self-bias detector


Previously I published a website that lets you parse tens of thousands of stories by what was talked about, who was mentioned, and where the events in each story took place (thanks, Fabio, for doing the programming!):

Themes from ALL stories in Kenya & Uganda (through April 2012)

The bigger the bubble, the more common that word is in stories. The higher the bubble, the more diverse the sources of the stories containing that word.
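For readers who want to try this on their own data, here is a minimal sketch of the two numbers behind each bubble. It is not the code behind the site. It assumes each story is a dict with hypothetical "text" and "source" fields, counts a word once per story, and measures diversity simply as the number of distinct sources, which may differ from how the site computes it.

```python
import re
from collections import defaultdict


def bubble_stats(stories):
    """Return {word: (number of stories, number of distinct sources)}.

    Assumption: each story is a dict with "text" and "source" keys.
    """
    story_counts = defaultdict(int)
    sources = defaultdict(set)
    for story in stories:
        # Count each word at most once per story.
        words = set(re.findall(r"[a-z]+", story["text"].lower()))
        for word in words:
            story_counts[word] += 1
            sources[word].add(story["source"])
    return {w: (story_counts[w], len(sources[w])) for w in story_counts}


# Made-up sample data, for illustration only.
sample_stories = [
    {"text": "School fees are too high", "source": "scribe_a"},
    {"text": "Fees keep youth out of school", "source": "scribe_b"},
    {"text": "Youth need work", "source": "scribe_b"},
]
stats = bubble_stats(sample_stories)
```

In this toy sample, "school" and "youth" would get bubbles of the same size, but "school" would sit higher because it comes from two different sources.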

Scrolling through the places, you’ll see that community priorities differ:

HIV/AIDS and WORK are talked about more frequently in Kampala than in Kibera. FOOD, YOUTH, SLUM, and PROBLEM dominate in Kibera compared to Kampala.

Nairobi (11,000 stories) is dominated by the Kibera neighborhood (7,000 stories). Smaller towns still have a lot of data, for example Kakamega and Masaka:

Masaka, Uganda (2,324 stories)

SCHOOL FEES dominate stories from Masaka, and WORK dominates stories from Kakamega.

Interactive visualization

Of course this is an interactive web page that lets you instantly filter by any word and visualize story patterns. Just click the DYNAMIC BUBBLES button on the right and type in your keyword. It will filter out stories that don’t include that word:

Click this image to try the DYNAMIC story bubbles parser:

And that dynamic feature comes in handy when you want to compare two rape prevention programs in Nairobi but never asked any questions about rape or HIV on your survey. After all, in the 21st century, the static way of thinking about the world is archaic, too costly, and too slow to really transform international development. We need instant feedback and dynamic tools. Having to know what data will be important before you start an evaluation just seems silly to me. Half of my successful science lab experiments were based on data that we collected before we knew that it would be essential to improving our understanding of the brain. And this isn’t brain surgery!

The locally funded Mrembo project in Eastleigh (above) differs from the USAID-funded Sita Kimya project (targeting Kibera) in that Sita Kimya stories (from men) lack any connection between rape and HIV. In contrast, Mrembo girls understand that rape is a risk factor for HIV:

Sita Kimya stories from Kibera suggest that the men telling them do not associate rape with HIV/AIDS.
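One rough way to put a number on that kind of comparison is to ask what share of the stories mentioning one word also mention another. The sketch below is only an illustration under stated assumptions: stories are dicts with a hypothetical "text" field, matching is whole-word and case-insensitive, and the function names are made up for this example.

```python
import re


def mentions(text, keyword):
    """True if keyword appears in text as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def cooccurrence_rate(stories, word_a, word_b):
    """Share of stories mentioning word_a that also mention word_b.

    Returns None when no story mentions word_a.
    """
    with_a = [s for s in stories if mentions(s["text"], word_a)]
    if not with_a:
        return None
    with_both = [s for s in with_a if mentions(s["text"], word_b)]
    return len(with_both) / len(with_a)


# Made-up sample data, for illustration only.
project_stories = [
    {"text": "She feared rape could lead to HIV/AIDS"},
    {"text": "The group spoke out against rape"},
    {"text": "We sold grapes at the market"},
]
rate = cooccurrence_rate(project_stories, "rape", "HIV")
```

Running this separately on the stories for each project would give two rates to compare. With only a handful of stories, the rate is a prompt for reading the stories, not a statistic to lean on.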

Automatic self-bias detection

And now for the punch line. While this tool is a convenient way to scan stories from a location or about a particular topic, it also does a great job of revealing which of the named organizations are really known by a diverse community, and which seem to be sampling from a very small, closed set of people. A lack of diversity in how people describe an organization is evidence that the organization has been influencing the story collection, self-biasing the stories to reflect the ORGANIZATION's perspective and not the community's perspective. This is a HUGE problem in all forms of evaluation, including “3rd party evaluations.” After all, how do you think these 3rd party evaluators find their sample of local “beneficiaries” to talk to? The organization provides them with a list.
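A crude stand-in for this check, for anyone without the bubble plots, is the ratio of distinct sources to stories for each organization. This is a sketch under assumptions, not the method the site uses: the "organization" and "source" fields are hypothetical, and the 0.5 cutoff is arbitrary and only there to make the example concrete.

```python
from collections import defaultdict


def source_diversity(stories):
    """Return {organization: distinct sources / number of stories}."""
    counts = defaultdict(int)
    sources = defaultdict(set)
    for story in stories:
        org = story["organization"]
        counts[org] += 1
        sources[org].add(story["source"])
    return {org: len(sources[org]) / counts[org] for org in counts}


def flag_low_diversity(stories, threshold=0.5):
    """Organizations whose ratio falls below the (arbitrary) threshold."""
    ratios = source_diversity(stories)
    return sorted(org for org, ratio in ratios.items() if ratio < threshold)


# Made-up sample data, for illustration only.
org_stories = [
    {"organization": "org_a", "source": "scribe_1"},
    {"organization": "org_a", "source": "scribe_1"},
    {"organization": "org_a", "source": "scribe_1"},
    {"organization": "org_a", "source": "scribe_1"},
    {"organization": "org_b", "source": "scribe_2"},
    {"organization": "org_b", "source": "scribe_3"},
    {"organization": "org_b", "source": "scribe_4"},
]
flagged = flag_low_diversity(org_stories)
```

Here org_a gets flagged because all four of its stories come from one source, while org_b's three stories come from three different sources.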

See for yourself how the pictures of legitimate and isolated organizations differ:

WFP stories are reliable because there is a lot of blue noise at the bottom of the plot.

Even though you only have 11 stories that are attributed to USAID and mention Sita Kimya, the pyramid of topics still remains:

The sketchball inverted bubble pyramids

Some organizations have an inverted bubble pyramid. These are organizations we know had the opportunity to influence the story collecting, because we relied on them to help us find local scribes (who interviewed people in the community):

CFCA Uganda has an alarming lack of diversity in their 61 stories

In contrast, one organization (SWIM) that could not manage to collect any stories NOT about themselves has a little more diversity, but not enough:

The African Medical and Research Foundation (AMREF) has much more diversity:

The logical evolution of this tool would be a site that is aware of your interests (as someone working at a particular organization with a known mission) and parses stories by location, topic, and relevance to your work, then provides you with a custom reference data set to compare yourself to. Since we ask each storyteller to share two stories about two different community efforts, looking at the stories from the same storytellers that are NOT about your organization would give you a useful reference data set. Stay tuned. We’re building it.
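In the meantime, the reference-set idea can be sketched in a few lines. This is only an illustration of the idea, not the tool being built: it assumes each story is a dict with hypothetical "storyteller" and "organization" fields.

```python
def reference_set(stories, my_org):
    """Stories NOT about my_org, told by people who also told one about my_org."""
    # Storytellers who shared at least one story about my_org.
    tellers = {s["storyteller"] for s in stories if s["organization"] == my_org}
    return [
        s for s in stories
        if s["storyteller"] in tellers and s["organization"] != my_org
    ]


# Made-up sample data, for illustration only.
all_stories = [
    {"storyteller": "t1", "organization": "org_a", "text": "story 1"},
    {"storyteller": "t1", "organization": "org_b", "text": "story 2"},
    {"storyteller": "t2", "organization": "org_c", "text": "story 3"},
    {"storyteller": "t2", "organization": "org_b", "text": "story 4"},
]
reference = reference_set(all_stories, "org_a")
```

In this sample only "story 2" lands in the reference set for org_a, because it is the other story told by the one person who also talked about org_a.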

Continued: Feedback through storytelling to gain insight and write stronger grant proposals

