HRDAG is hiring – technical lead
If this could be you, let us know. Also, please feel free to pass on this link to great people.
Job Title. Technical lead with a hacker’s heart
Location. A cool office in SOMA, San Francisco. You need to be on-site with us.
What we do. The Human Rights Data Analysis Group (HRDAG) develops statistical techniques to measure human rights atrocities. Our data analysis helps bring dictators to justice around the world. Over more than 20 years, our small team has developed technology and statistical techniques that take disjoint, incomplete, and inaccurate information from conflict zones and turn it into reliable, actionable analysis used by organizations such as the UN’s Office of the High Commissioner for Human Rights and by international and domestic war crimes tribunals. Members of our technical team have testified in domestic and international war crimes trials, built tech and conducted analysis for nine truth commissions, and supported four UN human rights missions. We currently provide ongoing support to human rights projects in Syria, Colombia, DR Congo, Guatemala, Serbia, and other places.
We’re demographers, mathematical statisticians, biostatisticians, political scientists, survey statisticians, and computer scientists. We’re extremely mission-focused: this work is who we are. We need more tech depth: is it you?
Check out what the press have said about us, including stories on BoingBoing.net, NPR, The Atlantic, Wired, Plaza Pública, The New York Times, and others.
What you will do. Over the last several years we’ve built an end-to-end data processing pipeline that can ingest dirty datasets and output refined statistical analysis. There are a lot of moving parts in between: data normalization / sanitization / transformation; survey estimation; cloud-based record linkage and statistical estimation; geospatial analysis; and general linear model fitting. In addition to the data provided by our partners in the field, we’ve written a series of custom web scrapers to pull data that gets posted on various websites, and we’ve built a data de-duplication framework for record linkage (aka database deduplication, entity resolution) written in Python and Java. The de-duplication process is used for, among other things, identifying people’s names written in different languages across numerous datasets.
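To give a flavor of the record linkage problem described above, here is a minimal sketch in Python. It is not HRDAG’s framework: the function names, the sample records, and the 0.85 threshold are all hypothetical, and it uses only the standard library. It normalizes names (accents, case, spacing) and then flags pairs of records whose names look alike.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def normalize_name(name: str) -> str:
    """Strip accents, fold case, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.casefold().split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def candidate_pairs(records, threshold=0.85):
    """Return (id_a, id_b, score) for record pairs whose names look alike.

    Compares every pair, so it only suits small inputs; the threshold
    is an illustrative assumption, not a recommended value.
    """
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        score = similarity(name_a, name_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 3)))
    return pairs


# Hypothetical sample records: (record id, name as written in the source)
records = [
    (1, "José  Hernández"),
    (2, "Jose Hernandez"),
    (3, "María López"),
    (4, "Maria Lopes"),
    (5, "Ana Pérez"),
]

if __name__ == "__main__":
    for id_a, id_b, score in candidate_pairs(records):
        print(id_a, id_b, score)
```

On this sample, records 1 and 2 and records 3 and 4 are flagged as likely duplicates. Comparing every pair does not scale to large datasets, and names written in different scripts would need transliteration before a comparison like this means anything.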
Some examples of what we’ll ask you to do.
— Process, clean, and transform data, including data standardization code, mostly in Python and R.
— Under supervision of statisticians, write and test statistical analysis in R, including survey estimation, geospatial analysis, and general linear model fitting.
— Maintain and develop our data deduplication (i.e., record linkage/entity resolution) framework written in Python and Java.
— Write and run web crawlers and scrapers for data collection, mostly in Python.
— Perform data archeology, recovering data from ancient files in odd formats, in a variety of human and computer languages.
— Maintain and develop our team’s automated data processing and analysis infrastructure: POSIX-environment command line tools built on bash, make, Python, Java, and R.
— Maintain and develop our team’s internal groupware websites, including a MoinMoin wiki and several small custom sites built on Python/Django.
You’ll also teach good programming practice and provide general programming and tech support (stuff well beyond basic IT) to the rest of the team, who are experts in statistics and social science with varying degrees of techiness. Debugging ssh configs, explaining closures, testing new encryption software, or writing new editor macros in Python for SublimeText or in elisp for Emacs: yes. Helping someone with MSWord: no way.
And we’ll call on you to write technical descriptions of HRDAG methods and projects at varying levels of detail for academic publications, blog posts, human rights reports, white papers, grant applications, grant reports, and internal documentation. You’ll automate the generation of publications from data using LaTeX and Sweave/knitr, and give technical and non-technical presentations of HRDAG ideas, projects, and findings at conferences.
What we’re looking for.
— A personal commitment to the Universal Declaration of Human Rights.
— Bachelor’s or Master’s degree in computer science, electrical engineering, or closely related field with a programming focus.
— Some relevant experience: this could be two years of related experience post-Bachelor’s; it could be a lot of patches submitted to an open source project; it could be something else. Tell us why you’re ready.
— Proven ability to write solid, readable, maintainable code, with unit and acceptance tests, that can handle large, distributed compute jobs.
— Skill areas: qualitative text mining and analysis, distributed and parallel algorithms, machine learning, computer security and crypto, data visualization, web programming, and statistics.
— Demonstrated strong programming skill in Python and R, and comfort in Unix.
— Experience with, or ability to quickly get up to speed with: bash scripting, make, BibTeX, Sweave/knitr, Weka, svn (we can explain why we don’t use git), Django, and XML processing.
— Experience with Unicode, unusual character encodings, and handling non-Latin character sets.
— Willingness to travel, sometimes for up to 4 weeks, to places well off the beaten path.
— If hired, ability to provide proof of eligibility to work in the United States.
— Also helpful: Interest in and comfort with languages other than English, especially Spanish, French, Russian, or Arabic. Let us know how you can communicate.
How to apply. Send a cover letter, a CV/résumé, a link to your blog, and a description of, or link to, the code for a piece of software you’re proud of to firstname.lastname@example.org. Please use open file formats (PDF is fine). The cover letter is in some ways the most important part! Please explain what interests you about the Human Rights Data Analysis Group. We are especially interested in your comments on our work. Tell us how you would strengthen our team.
[Ed.: This position has been filled.]