mirror of
https://github.com/almet/notmyidea.git
synced 2025-04-28 19:42:37 +02:00
playing around with rdf and dbpedia
parent
8f7a9cd2a5
commit
1cc371ceb3
2 changed files with 53 additions and 0 deletions
@@ -3,6 +3,7 @@ Pelican, 9 months later

:tags: pelican, python, open source, nice story
:date: 25/07/2011
:description: or why I like opensource so much

Back in October, I released `pelican <http://docs.notmyidea.org/alexis/pelican>`_,
a little piece of code I wrote to power this weblog. I had simple needs: I wanted
52
python/languages-influences.rst
Normal file
@@ -0,0 +1,52 @@
Using DBpedia to get language influences
#########################################

:date: 2011/08/16
:tags: dbpedia, sparql, python

While browsing Python's Wikipedia page, I found information about the languages
influenced by Python, and the languages that influenced Python itself.

It is kind of interesting to know which languages influenced others, but
it could be even more interesting to have an overview of the connections between
them, keeping Python as the main focus.

This information is available on the Wikipedia page, but not in a really
exploitable format. Fortunately, it is also provided in the
infobox present on the majority of Wikipedia pages. And… guess what?
There is a project whose goal is to scrape and index all this information in
a more queryable way, using Semantic Web technologies.

Well, you may have guessed it: the project in question is DBpedia, and it exposes
information in the form of RDF triples, which are way easier to work with
than plain HTML.
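To make the idea of a triple concrete, here is a minimal sketch in plain Python. It assumes nothing about how DBpedia stores its data: a triple is simply modelled as a (subject, predicate, object) tuple, and the sample values and the ``objects`` helper are illustrative placeholders of mine, not data or code taken from DBpedia.

```python
# Each triple is (subject, predicate, object). The values below are
# illustrative placeholders, not data copied from DBpedia.
triples = [
    ("Python", "influencedBy", "ABC"),
    ("Python", "influencedBy", "C"),
    ("Ruby", "influencedBy", "Python"),
]


def objects(triples, subject, predicate):
    """Return every object attached to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(triples, "Python", "influencedBy"))
```

Because every fact has the same three-part shape, answering a question boils down to filtering on the parts you know, which is much less painful than digging through HTML markup.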
|
||||||
|
|
||||||
|
For instance, let's take the page about Python:
http://dbpedia.org/page/Python_%28programming_language%29

The interesting properties here are "Influenced" and "InfluencedBy", which
allow us to get a list of languages. Unfortunately, they are not really using
all the power of the Semantic Web here, and the list is actually a string with
comma-separated values in it.
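Here is a rough sketch of how one of these properties could be requested and its value split into names. It is not the code of the script mentioned in this post. The endpoint URL, the ``http://dbpedia.org/property/`` namespace, the lowercase property name ``influencedBy`` and both helper functions are assumptions made for the example.

```python
from urllib.parse import urlencode

# Assumptions: the endpoint and the property namespace below are the ones
# DBpedia is commonly queried with; the original script may differ.
ENDPOINT = "http://dbpedia.org/sparql"
PROPERTY_NS = "http://dbpedia.org/property/"


def build_query_url(resource, prop):
    """Build a GET URL asking the endpoint for one property of one resource."""
    query = "SELECT ?value WHERE { <%s> <%s%s> ?value }" % (
        resource, PROPERTY_NS, prop)
    return ENDPOINT + "?" + urlencode(
        {"query": query, "format": "application/json"})


def split_languages(value):
    """Turn a comma-separated string into a clean list of names."""
    return [name.strip() for name in value.split(",") if name.strip()]


print(build_query_url(
    "http://dbpedia.org/resource/Python_(programming_language)",
    "influencedBy"))
print(split_languages("ABC, ALGOL 68, C , Haskell"))
```

The sketch only builds the request URL, so it runs without network access. Fetching it (with ``urllib.request.urlopen``, for instance) and decoding the JSON answer is left out on purpose.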
|
||||||
|
|
||||||
|
Anyway, we can use a simple rule: all Wikipedia pages of programming languages
are either named after the language itself, or suffixed with
"(programming language)", which is the case for Python.
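This naming rule can be sketched as a small helper that returns the two page addresses worth trying for a given language name. The ``candidate_pages`` function and the constants are hypothetical names chosen for the example; only the base URL and the suffix come from the page shown above.

```python
from urllib.parse import quote

BASE = "http://dbpedia.org/page/"
SUFFIX = " (programming language)"


def candidate_pages(language):
    """Return the plain and the suffixed page URL for a language name."""
    title = language.strip()
    # Wikipedia titles start with a capital letter.
    title = title[:1].upper() + title[1:]
    names = [title, title + SUFFIX]
    # Spaces become underscores, parentheses get percent-encoded.
    return [BASE + quote(name.replace(" ", "_")) for name in names]


for url in candidate_pages("python"):
    print(url)
```

For ``python``, the second candidate is exactly the address given earlier for the Python page. A caller would try the candidates in turn and keep the first one that actually describes a programming language.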
|
||||||
|
|
||||||
|
So I've built `a tiny script to extract the information from dbpedia <https://github.com/ametaireau/experiments/blob/master/influences/get_influences.py>`_ and transform it into a shiny graph using Graphviz.

After a nice::

    $ python get_influences.py python dot | dot -Tpng > influences.png

The result is the following graph (`see it directly here
<http://files.lolnet.org/alexis/influences.png>`_):

.. image:: http://files.lolnet.org/alexis/influences.png
    :width: 800px

Interestingly enough, Java is listed as an influence on Python (!!)
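As for the graph itself, the ``dot`` command only needs a plain-text description of the edges. The following is a minimal sketch of how such a description could be generated, assuming the influences have already been collected in a dictionary. The ``to_dot`` function and the sample data are illustrative and are not taken from the script.

```python
def to_dot(influences):
    """Render a mapping {language: [languages it influenced]} as dot text."""
    lines = ["digraph influences {"]
    for source in sorted(influences):
        for target in sorted(influences[source]):
            # Quote the names so that spaces or symbols stay valid in dot.
            lines.append('    "%s" -> "%s";' % (source, target))
    lines.append("}")
    return "\n".join(lines)


# Sample data for illustration only.
print(to_dot({"ABC": ["Python"], "Python": ["Ruby", "Boo"]}))
```

Piping this output into ``dot -Tpng``, as in the command above, is enough to get an image out of it.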
|
||||||
|
|
||||||
|
You can find the script `on my GitHub account
<https://github.com/ametaireau/experiments>`_. Feel free to adapt it for
whatever you want if you feel like hacking.