From 1cc371ceb379c85f25d54db13f2e3128ac4b136d Mon Sep 17 00:00:00 2001
From: Alexis Metaireau
Date: Tue, 16 Aug 2011 21:32:13 +0200
Subject: [PATCH] playing around with rdf and dbpedia

---
 dev/pelican-status-update.rst   |  1 +
 python/languages-influences.rst | 52 +++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)
 create mode 100644 python/languages-influences.rst

diff --git a/dev/pelican-status-update.rst b/dev/pelican-status-update.rst
index 3b8ca27..f4c3dbb 100644
--- a/dev/pelican-status-update.rst
+++ b/dev/pelican-status-update.rst
@@ -3,6 +3,7 @@ Pelican, 9 months later
 
 :tags: pelican, python, open source, nice story
 :date: 25/07/2011
+:description: or why I like open source so much
 
 Back in October, I released `pelican `_,
 a little piece of code I wrote to power this weblog. I had simple needs: I wanted
diff --git a/python/languages-influences.rst b/python/languages-influences.rst
new file mode 100644
index 0000000..9e8862d
--- /dev/null
+++ b/python/languages-influences.rst
@@ -0,0 +1,52 @@
+Using DBpedia to get language influences
+#########################################
+
+:date: 2011/08/16
+:tags: dbpedia, sparql, python
+
+While browsing Python's Wikipedia page, I found information about the languages
+influenced by Python, and the languages that influenced Python itself.
+
+Well, it's kind of interesting to know which languages influenced others, and
+it could be even more interesting to have an overview of the connections between
+them, keeping Python as the main focus.
+
+This information is available on the Wikipedia page, but not in a really
+exploitable format. Fortunately, it is also provided in the
+infobox present on the majority of Wikipedia pages. And… guess what?
+There is a project whose goal is to scrape and index all this information in
+a more queryable way, using semantic web technologies.
+
+Well, you may have guessed it: the project in question is DBpedia, and it exposes
+information in the form of RDF triples, which are much easier to work with
+than plain HTML.
+
+For instance, let's take the page about Python:
+http://dbpedia.org/page/Python_%28programming_language%29
+
+The interesting properties here are "Influenced" and "InfluencedBy", which
+allow us to get a list of languages. Unfortunately, they are not really using
+all the power of the Semantic Web here, and the list is actually a string with
+comma-separated values in it.
+
+Anyway, we can use a simple rule: all Wikipedia pages of programming languages
+are either named after the language itself, or suffixed with
+"(programming language)", which is the case for Python.
+
+So I've built `a tiny script to extract the information from DBpedia `_ and transform it into a shiny graph using graphviz.
+
+After a nice::
+
+    $ python get_influences.py python dot | dot -Tpng > influences.png
+
+The result is the following graph (`see it directly here
+`_)
+
+.. image:: http://files.lolnet.org/alexis/influences.png
+    :width: 800px
+
+Interestingly enough, it's stated that Java was an influence on Python (!!)
+
+You can find the script `on my GitHub account
+`_. Feel free to adapt it for
+whatever you want if you feel hackish.
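
Note on the approach described in the post: the three steps it mentions (splitting the comma-separated "Influenced" / "InfluencedBy" strings, applying the page-naming rule, and emitting a graph for graphviz) could be sketched roughly as below. This is not the ``get_influences.py`` script from the post. The function names ``split_influences``, ``candidate_resources`` and ``to_dot`` are hypothetical, the sample values are made up, and no request to DBpedia is performed.

```python
def split_influences(raw):
    """Split a comma-separated property value into clean language names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def candidate_resources(language):
    """Return the two resource names allowed by the naming rule:
    the bare language name, or the name suffixed with
    "(programming language)" (spaces become underscores)."""
    base = language.strip().replace(" ", "_")
    return [base, base + "_(programming_language)"]


def to_dot(language, influenced_by, influenced):
    """Build a DOT digraph whose edges point from the influencing
    language to the influenced one."""
    lines = ["digraph influences {"]
    for source in influenced_by:
        lines.append('    "%s" -> "%s";' % (source, language))
    for target in influenced:
        lines.append('    "%s" -> "%s";' % (language, target))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample values for illustration, not real DBpedia output.
    influenced_by = split_influences("ABC, Modula-3 , Haskell")
    influenced = split_influences("Ruby,Groovy")
    print(to_dot("Python", influenced_by, influenced))
```

The printed text is plain DOT, so it can be piped to ``dot -Tpng`` in the same way as the command shown in the post.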