Charles Basenga Kiyanda

I’m now an engineering professor

Well, an assistant one at least. I’m already writing grant proposals, so I guess that makes me a real prof nonetheless.


ICDERS 2015 poster

I had a poster at ICDERS 2015 this year on version 0 of an online database for experimental HE data. I did put the address of my personal website on that poster, so you might have ended up here from there.


I couldn’t attend, but my colleague was nice enough to put the poster up and probably discussed it a bit with some people. The whole point of the poster was to present the first pass at a database that is meant to be, in time, explorative and participative. By that, I mean two things: people should use the interface as an exploration tool and not merely to download data, and researchers should be able to share, submit, download and use data in a (relatively) simple format that facilitates things. The tone of the poster ended up being a little light-hearted, but the content was very real. The main criticism I’ve heard of the participative approach is that some sort of central command is needed to ensure the quality of the available data. I disagree, and I gave a few examples. One of them is probably not very well known, yet is very close to what this project aims to be: OpenStreetMap.

OpenStreetMap (OSM) is a participative and collective database that aims at mapping everything that is permanent in the world. There is no enforced tagging scheme, so you can end up with competing nomenclature in the database. Everything is decided by the community and nothing is actually technically enforceable. The file format is akin to XML and hence is completely open. Nonetheless, the project is, I would say, unquestionably successful. Large organizations use OpenStreetMap as the backbone of their mapping needs. Some examples of that:

These examples, I hope, will convince you that the participative model can lead, and has led, to high-quality, reliable products in the recent past. This database project’s aim is to use a similar model in our field. Hopefully, this will be the start of a fruitful discussion.

Again, if you want to try the database, go here.

NB: It’s been pointed out to me that there was an overlay mistake in the poster and some words were cut off slightly. Annoyingly, everything was fine in the original PPT file and it seems PowerPoint screwed something up when the file got converted to a TIFF for printing. This is highly annoying, as I would kind of expect (and indeed used to trust) PowerPoint to get “overlay of objects on a page” right, since that’s kind of a core function of the software. Another reason to move towards open source and open data: auditability. The original poster is available here (ICDERS 2015 poster on HE database) if you’d like to see it.

How surprising is it actually that nurses were infected with Ebola in Texas?

I’ve been using Twitter more and more (mostly because I’m involved in a local Montreal OpenStreetMap group and I’m one of the two people curating its Twitter account) and recently got involved in a back and forth with @2closetocall about the following tweet:

@2closetocall: Ebola, this disease that is supposedly so hard to transmit… And yet we now have two nurses infected by one patient. Are the proba. right?

The exchange then went:

@cbkiyanda: @2closetocall yes, probs are right. The 3 Dallas cases are people who handled/medically cared for infected. Not random people on the street.

@2closetocall: @cbkiyanda but ehy also (at leats in theory) took way more protections than a random person

@cbkiyanda: @2closetocall which is why science is preferred (e.g. ) over asking questions on twitter based on 2 data points.

@2closetocall: @cbkiyanda and ironically, by asking on Twitter somebody provides me with that…

@2closetocall: @cbkiyanda also, these rates still show that two nurses being infected while taking crazy precautions was unlikely it seems

@cbkiyanda: @2closetocall: need contact w/ vomit, blood, feces.  RT @NateSilver538: No, you didn’t catch Ebola on the subway.

@cbkiyanda: @2closetocall : hence, while they take more precautions, med. staff are also much more likely to come in contact with blood, vomit, feces.

@2closetocall: @cbkiyanda I know all this obviously. I still think you’re missing my point. But thanks anyway

@cbkiyanda: @2closetocall I feel you’re missing mine. Lower likelihood of infection given precaution, comes with increase in likelihood due to task.

I’m not sure I’m so enamored with twitter as a discussion platform, so I figured I’d go back over what I view as the underlying problem in his argument in more than 140 characters.

On grading essays (and procrastination)

If you don’t read the Tomorrow’s Professor mailing list yet, I encourage you to do so. Post 1360 looks at grading (essays specifically, but also just grading in general). The posting ends with sound advice that can probably be applied very generally:

1. Work in moderation, a little bit each day, rather than procrastinating and bingeing.
2. Remain fresh and alert by taking breaks when needed.
3. Practise going a bit faster while maintaining quality.
4. Aim to do what’s good enough, not at perfection.
5. Redesign the task to make it more interesting.

A vs. An

If you’re a non-native English speaker, you might have struggled with when to use a vs. an. A previous adviser of mine, after years of corrections, finally managed to make me internalize the basic rule:

If the following word starts with a vowel, use an. If the following word starts with a consonant, use a.

That’s not the actual rule though. The actual rule is:

If the following word starts with a vowel sound, use an. If the following word starts with a consonant sound, use a.

You’re confused? Good. Thank Purdue for the easily accessible answer.

My guess is you had to scroll right

An article in La Presse (in French) reports on a study during which researchers asked participants to place Ukraine on a map. If you don’t read French, it’s OK, just look at this picture.

The original picture looks like this. My first thought, when looking at this image, is “What’s so bizarre in Eurasia that one just stops clicking right of India?” If you don’t see it, look at this updated version with a line, below.

Let’s do a little detective work. The original image is 1540 pixels wide by 1025 pixels high, an aspect ratio of roughly 1.5. That corresponds to 4.5:3, 13.5:9 or 15:10, none of which sounds familiar. My red line is about 1120 pixels from the left edge, and 1120:1025 is really close to 1:1 (it’s 1.09:1). My guess is the map was shown within a square box, centered around Greenwich for longitude. There may have been 95 pixels cut from the left side (basically Alaska).
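If you want to check the arithmetic yourself, here is a minimal sketch in shell. The helper name ratio is made up for this example; the pixel counts are the ones quoted above.

```sh
#!/bin/sh
# ratio WIDTH HEIGHT: print WIDTH/HEIGHT rounded to two decimals.
# (Hypothetical helper, only here to check the numbers in the text.)
ratio() {
    awk -v w="$1" -v h="$2" 'BEGIN { printf "%.2f\n", w / h }'
}

ratio 1540 1025        # whole image: prints 1.50
ratio 1120 1025        # left edge to the red line: prints 1.09
echo $((1120 - 1025))  # pixels missing for a square box: prints 95
```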

The take-home message is that such an error (if I’m right) would have been easily fixable. (I’m also trying to figure out which data was used to make the outline of the continents, but I can’t tell. Both Google and OpenStreetMap show the Great Lakes at very low zoom settings.)

Use version control

I’m serious, do. (So, in other news, I’m back after a long hiatus, maybe I’ll explain one day.)

I kept track of my Ph.D. thesis using subversion as I wrote it. I’m sure my blood pressure went down a few notches because of that. I highly recommend everyone do the same. I recently participated in the transition of a project from subversion to git. I struggled at the start, but eventually wrapped my head around maybe 60% of what git does. A colleague found a great visualization of git processes which I pass along here. It’s not a tutorial, of which there are plenty. Here’s one, for example. Now, if you try to learn git and find yourself a little baffled, I recommend you play with that D3-powered site.

Kludging your way to nice figures with ubuntu+kile+inkscape+cairo+eps+psfrag

Several inkscape+latex+psfrag users have been complaining for quite a while now about the changes in cairo. I periodically forget how to kludge my way through while I wait for the situation to stabilize, and I have to search the tubes for a couple of hours before I remember all the details, so here it is for all to enjoy (and for me to find much more quickly the next time). I use Linux (Ubuntu), kile, tetex, inkscape and psfrag. If you’re on Windows, well, I probably can’t help.

The problem: The later versions of inkscape use cairo to generate eps files. The new version of cairo doesn’t have explicit strings for the words in the eps files. Instead of having the string “(hello)Tj” somewhere in your eps file, you have the string “<0102030304>Tj”. Psfrag looks for hello somewhere, can’t find it and hence is utterly confused. As a result, so are you. (Un)fortunately, inkscape is a really good FOSS program and I’m unwilling to change my ways at this point.

The solution: Here’s my workflow.

  1. Make a file in inkscape as usual. Use short labels (which will later be replaced by psfrag) if you want to make it quicker later on. I just go through the alphabet in sequence.
  2. Replace all the <##>Tj strings with (CC)Tj. Make sure to change the numbers to the right letters. (If you write the letters in order in inkscape, then 01 corresponds to a, 02 corresponds to b, etc.) To be a bit quicker, I make a script, which does it for me once I’ve figured it out. This way, I can modify the figure later on and not go completely mad. A sample script is shown at the bottom for 2 labels, a and b.
  3. Run the script.
  4. produce dvi using kile (I use kile, you might not)
  5. produce ps from dvi using kile (or whatever it is you use)
  6. Run ps2pdf on the resulting ps file, e.g. ps2pdf text.ps text.pdf (because using kile and dvi2pdf(?), the pdf doesn’t have the psfrag substitutions in)

As an extra (which I always forget also): to get text on multiple lines using psfrag, use a shortstack: “\psfrag{label}{\shortstack[l]{some more\\text}}”.

Not simple and not pretty. There are probably more succinct and elegant scripts for this. You might be able to get the pdf generation to work from within kile. Personally, I’ve given up and just keep a terminal window open for that operation.

Sample script with 2 labels, a and b:

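#!/bin/sh
# Keep a backup (for your sanity the first time you run the script)
cp fig.eps fig.eps.backup
# Replace the cairo glyph strings with the labels psfrag looks for
sed -e 's/<01>/(a)/' -e 's/<02>/(b)/' fig.eps > temp.eps
# Overwrite the original figure with the relabeled one
mv temp.eps fig.eps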
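
For figures with more than a couple of labels, the same substitution can be generated in a loop instead of typed by hand. What follows is only a sketch under a few assumptions: the function name relabel_eps is made up, every label is a single glyph written in order in inkscape (so 01 is the first label, 02 the second, and so on), and the codes are two-digit lowercase hex. That last point is a guess for anything past 09, so check your own eps file. Labels containing / or & would need escaping.

```sh
#!/bin/sh
# relabel_eps IN OUT LABEL...
# Hypothetical helper: rewrites <01>Tj as (LABEL1)Tj, <02>Tj as (LABEL2)Tj, ...
# Assumes one glyph per label and labels created in order in inkscape.
relabel_eps() {
    in=$1
    out=$2
    shift 2
    script=""
    i=1
    for label in "$@"; do
        # Two-digit hex code for the i-th label (lowercase is an assumption)
        code=$(printf '%02x' "$i")
        # One sed command per label, separated by newlines
        script="${script}s/<${code}>Tj/(${label})Tj/g
"
        i=$((i + 1))
    done
    sed "$script" "$in" > "$out"
}
```

You would call it as relabel_eps fig.eps temp.eps a b c and then move temp.eps over fig.eps once you are happy with the result.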

How to improve, as a foreign speaker, your written English.

I really, really, REALLY hope I got the commas right in the title. I’m a native French speaker and struggle to no end when writing documents in English. I’ve been looking for ways to improve my written English. Writing more often on this blog is, in part, an effort to achieve this goal.

My wife recently started a Master’s degree and I’ve discovered that correcting the texts of another non-native speaker helps. Consider it a game of chess against a well-matched opponent. Despite my limited abilities, I’m amazed I can actually make her texts better.

I’ve found some resources most useful in correcting (others and myself alike). Regarding commas, the best explanation I’ve seen so far online is here. The Oatmeal also has an informative and funny poster about semicolons.

social software and science

I haven’t been through this entire post on the subject of science and social software, but from a glance at the first few lines, it seems interesting.