Charles Basenga Kiyanda

Science and copyright

I swear, in my next post, I’ll give all the details I think I can give (that is, all the details which are set in my head) about my “great” idea for the future of scientific communication. In the meantime, I just want to have a little chat about science and copyright.

Part of the reason I’m going through this thought process and devising this master system (I’d use the word framework, but for personal reasons, I hate the word framework) is that I believe there’s something wrong with how scientists communicate. I’m not saying that scientists are doing everything wrong when it comes to communication. In fact, I think many things scientists do are right! Let me outline this in 3 points:

  1. The fundamental spirit of how scientists approach communication is perfect. The openness of scientific communication should never change. All the scientists I know not only recognize this principle but probably worship it. I don’t think I’m teaching anyone anything here.
  2. The current use of copyright is dangerous. Basically, we’re going about our business, applying number 1 just above to our results, but giving our copyright away to scientific journals, some of which are looking to use it, as the PRISM initiative shows us.
  3. The tools and methods we use to disseminate our scientific results and ideas are outdated and inefficient. This is where my framework idea comes in. I’m really not looking to revolutionize scientific communication (although I’ll claim it at some point so I make noise and people notice what I’m doing), but rather to evolve it to the next step. The underlying principle (number 1 above) will remain. Only the packaging and the speed at which it travels will change.

Let me discuss points 1 and 2 a little here today. More explicitly, let me depart from the usual and try to take an idea from science and apply it to another field.

In number 1 above, I praise the principle of openness of scientific communication. What I’m saying is that the founding principle we use to do science is that anyone can take what I’ve done, scrutinize it, critique it, replicate it (in fact, we’re happy when someone redoes something we’ve done and confirms we’ve done it properly), expand on it, etc. All that scientists ask is that when you re-use what someone’s done, you mention who’s done it first, done it wrong, done it partially. Expressed in this way, it sounds a lot like the Creative Commons attribution license. What’s a Creative Commons license?

Simple. Creative Commons is a non-profit organization whose sole purpose is to create licenses which can be used by content creators. These licenses come in different forms: a machine-readable form, a lawyer-ese form and a human-speak (for laymen) form. As a content creator, you select which rights you WANT to give to people over your work and you attach the appropriate license to your work. This indicates to potential (re-)users of your work “These are the things you can do with this copyrighted work.”

The attribution license says that you can re-use, use, distribute, mashup (the new buzzword on the internet), remash, paint over, trace over, deform and enjoy the current work. All you have to do is attribute the source properly. This is exactly what we do in science.

One point of clarification though. Legally, copyright extends over EXPRESSIVE WORKS and not over IDEAS AND FACTS. So technically, someone can take your paper, extract the actual data points and re-use that information. Information is not copyrightable. So people don’t even need to attribute the source. In this case, the “science code” goes beyond copyright. When re-using INFORMATION laid out by a previous scientist, we still attribute the source.

On the other hand, there are behaviours scientists exhibit which don’t respect copyright. For example, we often re-use images from other scientists and only attribute the source. In my master’s thesis, there are images from the PhD thesis of a student at Caltech. I attributed the source, of course. I’m sure I can pull loads of master’s and PhD theses in my field with copies of images from older papers, and probably from many recent papers too. In the case where one pulls an image DIRECTLY from a copyrighted work and only attributes it, one is technically breaking copyright law. Now, I doubt Caltech is going to sue me for using that image (or the copyright holder for that matter, whom I know well enough by now). In the end, using that image and simply attributing the source FEELS right. This is a case where the Creative Commons attribution license would work well: it would have enabled me to use the image without worrying about the copyright detail. I didn’t care at the time, but technically I should have. Were I to be sued, I would argue fair use (as a critique of the work). Still, I’m really not likely to be sued by Caltech for this.

This is where my point 2 comes in. We’re using our traditional distribution methods (archival journals and the like), knowing full well that we’re signing our copyright away in most cases, yet without caring about the consequences, basically because there have been no real consequences so far. Once in a while, there’s the odd over-zealous journal staffer who sees infringement and sends a letter to the infringer without clearing it with a boss first. Yet I’m not aware of anyone getting really screwed over this. I’m willing to bet most (all?) scientists are more worried about being discredited because of accusations of plagiarism than about lawsuits over copyright infringement.

So in light of this unwritten policy in the scientific domain (we could call it the no-sue policy), why should we care? While the danger isn’t imminent, groups like PRISM are worrisome. They show that someone, somewhere cares. That person hasn’t used their right yet, but they know it exists. I don’t like the status quo, especially when I’m not in a position of power. My solution to this is to say that while we devise a system to revamp our distribution methods, let’s take the opportunity to cut out the middleman and use licensing schemes which solve this issue. More on this next time.

For now, as promised, let me draw a parallel between what happens in the scientific world and what happens in the music world.

As I outlined above, the way science is done (on a global scale, not on a laboratory scale) is that we constantly re-use, re-hash, re-mash other scientists’ work. Joe A, in Italy in 1971, publishes a paper which is really interesting. He shows that a species of frog in Madagascar has the ability to spontaneously change sex. (I’m making this up as I go along now, in case you hadn’t noticed.) Fast-forward to 2002, when Jane B, in South Africa, shows that all frogs have the genetic makeup to do this; it simply happens that only a fraction of those species have the need to use spontaneous sex change. (Again, it’s completely out of thin air, just an example. I’m pulling this out of the script of a bad science-fiction/cataclysm movie right now.)

In the end, Jane B simply writes in her article that “Joe A has previously shown that…” and maybe pastes in a graph from the original 1971 article, showing the main conclusions. Re-use of previous science to make new science.

What about music? Listen to this clip about the song “Amen Brother”.

From the sampling of a few seconds of a rather unknown 1969 song, a whole genre of music was born: a spoonful of copyright infringement over a bed of creative re-use. In the most basic sense, the birth of hip-hop was the same as the evolution of science.

Does the comparison really apply? I would say that it does. Joe A might publish an article with a picture and say, “I have no idea what this means.” Jane B then comes along, realizes what the picture is, scans it from the paper, applies some image transformations to get some features out and says, “Ah ha! It is now clear that the large feature in the image from the article of Joe A is …” Remember: FACTS are not copyrightable. EXPRESSIONS OF FACTS are copyrightable. Images are expressions of facts. Scientific images may be argued either way. They’re data, really; we (scientists) wouldn’t differentiate them from raw fact, at least conceptually. But they’re still images. In the modern scientific community, we wouldn’t think twice about this. We might phone up Joe A and say, “Look at this! I’ve figured it out!” Joe A will be happy. And so we’ll publish our new result, with attribution, feeling quite content. Unfortunately, Joe A signed his copyright away to the International Journal of Frogs back in 1971. Not too unfortunate, because we don’t really enforce it. In comes PRISM… and I get a little worried. Not that much, but a little. And so I ask, given that we have the technology to solve this problem easily and make our system more efficient, why not?

Now, to give you something to chew on, think about what would happen to science if somehow we weren’t allowed to re-use other scientists’ stuff. Think about what would happen if we started suing each other every time re-use occurred.

Now, having thought about that for a minute, think about what would have happened to music if every DJ in the 80’s had been sued out of existence?

Consider the approach the mainstream expressive industry is taking to the application of copyright law: the “copyright should last forever and nobody should be allowed to use even a single second of my work without paying me” camp. Forward the clip above to the 14:00 mark and listen to what the author considers to be the significance of learning the origin of the Amen break. When samplers came to light, the industry in place let it slide. They may not have regarded the technology and its uses as viable in any way. Now, copying and sampling possibilities are at everyone’s fingertips. It doesn’t seem they’ll make that “mistake” twice.

Now, what if all musicians agreed to say that “You may reuse portions of my work for the purpose of sampling and creating new expressive works.” How would hip-hop and electronica evolve?

[Update 1: I edited this post to add some links to the PRISM webpage and the Creative Commons website. –CBK]
[Update 2: Fixed link to the YouTube video. It seems WordPress’ visual editor doesn’t quite like the embed tag. –CBK]

3 comments to Science and copyright

  • Good post Charles, very well thought out and articulate. I had never heard of PRISM before since I’m not in the same universe as you, being in social sciences, but I paid a visit to their website and their objectives regarding copyright appear worrisome. The scientific community really ought to familiarize itself with actual copyright laws and their implications. At the same time, it should keep a close eye on the emerging trend of private interest groups which are willing to seriously hamper the sharing of knowledge, crucial to the improvement of the well-being of humanity, for purely pecuniary motives.

    I agree to a certain extent that intellectual property must be protected so as to allow an economic motive to generate investment and research, but not at the price of “imprisoning knowledge”. The Creative Commons license is a good place to start, and innovative initiatives like the Public Library of Science are beginning to bear fruit. There is hope 🙂

  • BTW, I’d like to hear your thoughts on PLOS. What do you think of it?

  • Thanks for the comment. I’d like to clarify though, I do think that scientists are aware of copyright laws. We hear about this often enough to know at least the basics of the system. I get the feeling we’re more in a state of willful blindness than blissful ignorance.

    The PLOS is a great initiative. As I said above, the basic idea of how we WANT to communicate science is fine. How we IMPLEMENT it today is something else. I’m not all that familiar with PLOS, but my understanding is that it’s open (free) access and the rights are kept with the publishing scientist(s). PLOS does have limitations.

    1- It charges the authors for publishing their article. At the core, I have nothing against it. Someone has to pay for it. From the looks of their webpage, it seems quite a hefty fee. They do have fee waivers if you don’t have funds, so this is probably one of the reasons why the fee is so high: the people who get waivers have to be offset by the paying users. A journal editor (a good type, I quite enjoyed discussing with him when I had access to him) once argued to me that there were only 2 business models available. “Either the person who reads pays or the person who publishes pays. In the end, someone has to pay the fee.” (I’m paraphrasing, this was over a year ago, I don’t remember the exact words.) I don’t agree with that statement. First, we should make the system almost completely an online one. This brings the publishing costs to a bare minimum. (There are still costs associated with moving bits of information instead of stacks of glossy paper, but they’re much lower. Plus it’s more ecological, presumably.) The only piece of information I would distribute in paper form, and only if it is requested, is a digest of the new articles on the site or a summary of the ones that may interest you. I’d much rather send that information via e-mail though. Second, I’d offset the remaining costs with advertising. Yes, I said it, advertising. People don’t like advertising, yet we put up with it even as scientists in large-distribution scientific publications. Think “Physics Today”, etc. I believe that advertising done well can be a benefit for the user. Think “Amazon”. Seriously. Anyone who’s looked at more than 5 products (not even bought them) on Amazon knows exactly what I’m talking about. You can’t lie and claim that the “Here’s a product you might like” section isn’t always spot on. Unobtrusive, well-targeted, thoughtful and visually appealing advertising can be an added value for the user and a revenue stream for the maintainer of the system.
    Yes, I said it, I’m sorry. Advertising-supported, (not necessarily) peer-reviewed, scientific publication. Hate me if you will, I think it has potential.

    2- PLOS is limited to biology/genetics/medicine. I guess I should call it Health Sciences. This isn’t really a bad thing. It’s a business decision. At the beginning, Facebook was only meant as a platform for university students. Maybe PLOS will expand, I don’t know. I’d like it if it did, though.

    3- (And probably the only real criticism:) PLOS is limited to conventional expressions of scientific results. By this I mean that what you get from PLOS is a collection of bar graphs, texts and fixed graphics. To be fair, not only is this the state of just about every single scientific publication today, but PLOS is also showing signs of improvement. There’s this new thing, called “SciVee”. I’m not a huge fan. Basically the idea is that once your paper is published, you can go and make a video explaining your paper. Then, you attach the video to the paper on the site. It’s an improvement. It’s a step. Not a leap, not a bound and not a revolution. It’s a step in the right direction. Ultimately, I think the format of scientific expressions has to explode. The only reason why we’ve used bar graphs, text and static images is because, until now, that’s the only thing we knew how to publish. Those were the limitations of the medium. Well, not anymore. By making the papers directly available online, there’s nothing stopping you from attaching executable code to your paper (hopefully, nobody puts viruses there). Maybe some people would want to attach source code. I can’t count the number of instances where I read a paper titled “A new algorithm for…” with no reference to a webpage with an actual nuts-and-bolts implementation of that algorithm in a language. An algorithm in mathematical form is necessary; an implementation in a known computer language is useful. Why not attach movies? High-resolution computer simulations today can generate great movies and it’s sometimes actually informative to watch them. There are some conferences offering that option now, but it’s not all of them and it’s still mostly in its infancy. But why stop there? How about having plots that move? I’ve often wanted to plot something and wished I could have a moving plot where the line I’m showing would move as a parameter is changed. Unfortunately, we can’t publish that today.
    I submitted an abstract recently where I had to hammer at the text for 2 days to make it fit in the 4-page limit (ah yes, don’t get me started about page limits) when I could have cut out an entire page if only I’d had an animated figure.

    In essence, that’s my take on PLOS. I’m not saying PLOS is bad. PLOS is here to stay and I’m all for any organization that offers free access to science with the copyright squarely in the hands of the publishing scientist(s). Everything else is an argument about how to implement the perfect system.
