What makes a data scientist? Creativity, uniqueness!

I’m closing out the application process for our Data Scientist in Training program, an opportunity for junior techies to explore real-world problems at Keystone. Here’s a brief overview:


Want to learn about data manipulation, Python, R, APIs, and other tech secrets as part of a mentorship-style internship?

 

As a data scientist in training, you’ll learn about launching a technology platform that serves nonprofits. You’ll support users who are building, collecting, analyzing, and collaborating on FeedbackCommons.org. You’ll curate case studies and work with a network of high-impact organizations. This is an unpaid mentorship position, but it offers a valuable opportunity to develop new tech skills that can get you a high-paying job later.

 

To apply, tell us what you hope to learn. What is your idea of a cool job? Instead of boring stuff, tell us something unusual you’ve done that you’re proud of.

Essays

The response was huge! Here are some of the more interesting things applicants did or shared. This is the stuff of creativity and excitement from structured problem solvers:

separator

I really want to be part of your team to learn how to launch a technology platform that will have a big impact on society in the future. … I’m proud of knowing what I want out of life. In Aug 2011, I left my comfort zone, my hometown of Beijing, and started a new life in the U.S. on my own two feet. I learned to drive, find lodging within a budget, manage diet and workouts, become financially independent and self-employed, and balance my life and study. I am also proud of getting over my fear of public speaking: I presented my CSR proposal to a large audience and received praise and applause.

separator

During an 8-month volunteering stint in Ghana back in 2008, I went for weeks without reading world news due to the closure of the only internet cafe in town. When I returned just 3 years later to research access to reproductive health education, the clinic in which I worked was in the midst of transitioning from paper to electronic medical records, and most middle class youth were engrossed with Facebook on their brick phones. A large portion of my final thesis was ultimately dedicated to examining the ways in which Ghanaian youth interfaced with social media to explore their growing interests in western-style dating.

separator

I am a 16-year-old junior at Brookwood High. I would love to become a data scientist in training for Keystone Accountability. I hope to learn not only how to manipulate sets of data but also how to analyze them, how software connects with databases, and how everything fits together. I have spent a lot of time recently trying my hand at algorithmic programming and have in the past gone to multiple science and programming competitions. I even got 2nd place at the Lockheed Martin CodeQuest in the advanced division. One of my projects that I quite enjoyed was one that I did for school on encryption (https://github.com/DrAlias/). I did this in Java and had multiple pieces of software that would encrypt and decrypt pieces of text. This was very enjoyable and really got me thinking about computer science as a field I would like to explore further, and I think this would be the perfect opportunity to expand my horizons.

separator

I worked as a Mathematical Statistician for the US Census Bureau for the last six years. I would like to develop the computational skills necessary to solve both “big” and “little” data problems. I have relevant experience with Python, R, Java… I am very proud I was the first African-American woman to obtain the doctoral degree in Applied Mathematics at … University. It was very important for me to complete the degree as my mother got to see the end result before she passed away three months later. I also knew what completion would mean for others who came after me. I was determined that the future be filled with others who could fathom making this choice too.

separator

I work full-time at the National Science Foundation which has given me a behind-the-scenes look at how important information communication is. For example, we’re currently updating our IT systems to adhere to the Digital Accountability and Transparency Act which will grant public access to raw data currently only available in slices and chunks after it’s been aggregated by NSF.

Open information to empower communities, and create accountability feedback loops was the first thing I saw on the Keystone website that resonated with me. I noticed the posting is a few months old, but when I saw the ACAPS Sierra Leone lessons learned report I knew I had to reach out. The report is one of the only aid organization assessments of the epidemic that showed a good understanding of the situation on the ground in SL, and that’s extremely important to me.

separator

I want to learn how to analyze and manipulate data, while being able to interpret and explain the outcomes and results to a lay audience. My idea of a cool job is a job that can help you understand and communicate with people from different sectors.

separator

Unlike most people, I am in love with numbers. I have come to learn there is a new cool job in this world called being a “data scientist”. I was amazed by how many incredible insights can be drawn from data. One of the insights I remember is that although people always complain about the overpopulation of Chinese metropolises like Beijing and Shanghai, research shows that the populations of these cities are smaller than expected based on Zipf’s law.

Data are the rich ores of our future society. We can acquire so much loot if we keep mining them.

I worked in a Reasoning Lab, where we built a deontic logic system for the robot to perform moral reasoning. Given premises and an expected conclusion, the reasoner tells you whether the conclusion follows from the premises, and what path leads from the premises to the conclusion.

separator

Working with data is how I entertain myself (admittedly, when I should probably be studying). Over the last couple years I worked with Department of Education data, and interrogated my college’s staffing levels and finances. Curiosity about my college’s opaque finances has led me to pore over many IRS filings, financial statements, municipal bond releases, county property records, and the like. My next goal for the report is to redo it with Common Data Set enrollment data (it’s better suited to the question I’m asking) using R.

I see the importance of non-profits to our society, and understand that the key to success for a not-for-profit organization is rigorous and effective assessment.

separator

For me data science and analytics has been a way to realize this interest in a practical way and create a career out of my curiosity. I interned with the NYC public advocate’s office doing outreach and have been involved with a local group involved in gender equality. Over the summer I did a short data science bootcamp that taught me the basics of data analysis in Python. I was able to complete some short projects with a mentor and found that experience to be very helpful. In my current courses I am working in Python, R, and SQL more in depth.

separator

My introduction to the world of computers came at a late stage in my life. My first job after high school was in the field of welding, undertaken for the necessity of developing a practical and marketable trade. The truth is, I had never even used a computer up until my second job as an auditor for a local beverage distribution company. Growing up as the second to last of eleven children from a poor family in a developing country, this circumstance was far from uncommon.

In fact, I knew only a handful of people who had any kind of computer knowledge and even fewer who owned an actual computer system. My initial computer initiation, albeit a bit late, only served to whet my appetite for Information Technology. I was captivated – continually interested in how it all worked. I’ve grown to really appreciate the intricacies of it all – how it fits together and how data is vital to free enterprise’s very survival. Even before I had an official job in the IT field, I found myself always working to make information more easily accessible and usable.

One of my first data projects was undertaken during my time at the Parliament of Guyana. As senior Human Resources Officer, I couldn’t help but notice how antiquated and essentially pointless our data systems were. The offices were equipped with computers, yet all filing was done manually. Records were kept in old dusty cabinets where, if mold and moisture didn’t get to them first, the wood-ants certainly did. My solution to this was to advocate for the systematic digitizing of all personnel records. This took considerable time and effort, and the usefulness wasn’t seen at the time by my immediate superiors, but they humored me. However, by the end of the exercise, all HR records were digitized and catalogued. Naming conventions were set up and retrieval of personnel records became a breeze. My next major project was to migrate the leave data to a database, as I had taught myself Microsoft Access and had proceeded to set up a small database which kept track of all staff leave. Once all the historic data was copied from the ledgers in which it was once housed into the Access database, everyone got on board! I proceeded to train the department on how to use Microsoft Access in order to input and retrieve data. The next phase of this plan was to expand my database to include all HR records, including the historical ones which had been digitized.

My first run at this was to import the documents to the database – a first run that failed miserably as the database grew in size until it was finally corrupted. I was not in the habit of taking back-ups, a mistake I suffered dearly for. Over the course of my stay at the parliament though, I was able to refine the database by setting up filing structures and creating links to the individual files, instead of adding them to Access. This experience served as an important stepping stone to my next job.

I’m currently a DB Admin for a mining company. Data is typically unusable when it arrives at my door – certainly no bow-ties there. I review, validate, and import to make the data usable for the geology department.

Chances are I am not the most qualified applicant, but if you take a chance on me, I can say with certainty that it will be a decision that you won’t regret. What you’ll be choosing is a sincere and hardworking individual with passion and drive. Someone who is willing and eager to learn. One who has worked hard all of his life to achieve the things he has and who believes that he is just getting started.

separator

In my career, I would like to combine statistics and philanthropy. I want to analyze the data to identify the most vulnerable population, and ensure that they are being efficiently funded in order to make their lives easier. The work of Keystone Accountability has made strides in closing the gap between organizations and their constituents. I believe that transparency is key to increasing customer satisfaction, which should be a priority for all organizations.

separator

I am a double major in Computer Science and Art. I want to incorporate art in the representation of large amounts of data. I find it fascinating the amount of work that it takes to analyze large amounts of data and shape them in such a way that can be consumed by the masses through visually pleasing and interpretative designs.

separator

Last thoughts

In addition to the intellectual curiosity that all these candidates express, I think a really good data scientist is comfortable doing these things:

  1. Iterative development – Pull a git repo, start a fresh branch, commit and test changes on your local copy of the server (in VirtualBox / Vagrant), then merge into master and push to the production server.
  2. Write good documentation and automate it with Sphinx / readthedocs.org, or use IPython notebooks alongside your scikit-learn code.
  3. Be Agile: Estimate the time it takes to complete your work tasks, log it, and maintain a 90% accuracy rate on sprints.
  4. Build unit tests for your code and live by the framework constraints that you impose in these tests.
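
To make items 3 and 4 a little more concrete, here is a minimal Python sketch, not a prescribed method. The list above doesn't say how the 90% figure is measured, so the definition used here (one minus the relative error of the estimate, floored at zero) is only an assumption, and the names `estimate_accuracy` and `sprint_accuracy` are hypothetical. The unit tests pin down the behavior the code is then expected to live by.

```python
import unittest


def estimate_accuracy(estimated_hours, actual_hours):
    """Return a score in [0, 1]; 1.0 means the estimate matched exactly.

    Assumed definition (not from the post):
    1 - |estimated - actual| / estimated, floored at 0.
    """
    if estimated_hours <= 0:
        raise ValueError("estimated_hours must be positive")
    error = abs(estimated_hours - actual_hours) / estimated_hours
    return max(0.0, 1.0 - error)


def sprint_accuracy(tasks):
    """Average the accuracy over a sprint's (estimated, actual) hour pairs."""
    if not tasks:
        raise ValueError("a sprint needs at least one task")
    return sum(estimate_accuracy(e, a) for e, a in tasks) / len(tasks)


class EstimateAccuracyTest(unittest.TestCase):
    def test_exact_estimate_scores_one(self):
        self.assertEqual(estimate_accuracy(4, 4), 1.0)

    def test_overrun_lowers_score(self):
        # Estimated 10 hours, took 11: 10% relative error.
        self.assertAlmostEqual(estimate_accuracy(10, 11), 0.9)

    def test_large_overrun_floors_at_zero(self):
        self.assertEqual(estimate_accuracy(2, 10), 0.0)

    def test_non_positive_estimate_is_rejected(self):
        with self.assertRaises(ValueError):
            estimate_accuracy(0, 3)

    def test_sprint_average(self):
        # Task scores are 1.0 and 0.8, so the sprint averages 0.9.
        self.assertAlmostEqual(sprint_accuracy([(4, 4), (10, 12)]), 0.9)
```

Saved as, say, `test_estimates.py` (a hypothetical filename), the tests run with `python -m unittest test_estimates`. Writing the tests first and then keeping them green is one way to "live by the constraints" you set for yourself.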

 
