mirror of
https://github.com/almet/notmyidea.git
synced 2025-04-28 11:32:39 +02:00
105 lines
No EOL
5.5 KiB
HTML
<!DOCTYPE html>
<html lang="fr">
<head>
<title>
Using dbpedia to get languages influences - Alexis Métaireau </title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet"
href="https://blog.notmyidea.org/theme/css/main.css?v2"
type="text/css" />
<link href="https://blog.notmyidea.org/feeds/all.atom.xml"
type="application/atom+xml"
rel="alternate"
title="Alexis Métaireau ATOM Feed" />
</head>
<body>
<div id="content">
<section id="links">
<ul>
<li>
<a class="main" href="/">Alexis Métaireau</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/journal/index.html">Journal</a>
</li>
<li>
<a class="selected"
href="https://blog.notmyidea.org/code/">Code, etc.</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/weeknotes/">Weekly notes</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/lectures/">Readings</a>
</li>
<li>
<a class=""
href="https://blog.notmyidea.org/projets.html">Projects</a>
</li>
</ul>
</section>
<header>
<h1 class="post-title">Using dbpedia to get languages influences</h1>
<time datetime="2011-08-16T00:00:00+02:00">16 August 2011</time>
</header>
<article>
<p>While browsing Python’s Wikipedia page, I found information about
the languages influenced by Python, and the languages that influenced
Python itself.</p>
<p>It is interesting to know which languages influenced others, and it
could be even more interesting to have an overview of the connections
between them, keeping Python as the main focus.</p>
<p>This information is available on the Wikipedia page, but not in a really
exploitable format. Fortunately, it is provided in the infobox present on
the majority of Wikipedia pages. And… guess what? There is a project whose
goal is to scrape and index all this information in a more queryable way,
using semantic web technologies.</p>
<p>Well, you may have guessed it, the project in question is DBpedia, and
it exposes information in the form of <span class="caps">RDF</span> triples, which are much easier
to work with than plain <span class="caps">HTML</span>.</p>
<p>For instance, let’s take the page about Python:
<a href="http://dbpedia.org/page/Python_%28programming_language%29">http://dbpedia.org/page/Python_%28programming_language%29</a></p>
<p>The interesting properties here are “Influenced” and “InfluencedBy”,
which allow us to get a list of languages. Unfortunately, they are not
really using all the power of the Semantic Web here, and the list is
actually a string with comma-separated values in it.</p>
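<p>To illustrate what that implies, here is a minimal sketch (not the code
from my script): assuming the property value arrives as one plain string,
it has to be split by hand. The sample value and the
<code>split_influences</code> helper are made up for the example.</p>
<div class="highlight"><pre><span></span><code>def split_influences(raw):
    """Split a comma-separated property value into clean language names."""
    names = [part.strip() for part in raw.split(",")]
    # Drop empty entries left by trailing or doubled commas.
    return [name for name in names if name]


# Hypothetical value, shaped like the comma-separated string described above.
sample = "ABC, C, Haskell, Lisp, Perl,"
print(split_influences(sample))
</code></pre></div>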
<p>Anyway, we can use a simple rule: all Wikipedia pages about programming
languages are either named after the language itself, or
suffixed with “(programming language)”, which is the case for Python.</p>
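<p>That rule could be sketched roughly as follows. This is an illustration
under assumptions, not an excerpt from my script: the
<code>candidate_pages</code> helper is hypothetical, and the URL prefix is
simply copied from the DBpedia link quoted earlier.</p>
<div class="highlight"><pre><span></span><code>from urllib.parse import quote

# Assumed URL prefix, taken from the shape of the DBpedia link quoted earlier.
DBPEDIA_PAGE = "http://dbpedia.org/page/"


def candidate_pages(language):
    """Return the two page URLs the naming rule allows for a language."""
    base = language.strip().replace(" ", "_")
    suffixed = base + "_(programming_language)"
    # quote() percent-encodes the parentheses, as in the link quoted earlier.
    return [DBPEDIA_PAGE + quote(name) for name in (base, suffixed)]


for url in candidate_pages("Python"):
    print(url)
</code></pre></div>
<p>A caller would then try each candidate in turn and keep the first one
that actually describes a programming language.</p>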
<p>So I’ve built <a href="https://github.com/ametaireau/experiments/blob/master/influences/get_influences.py">a tiny script to extract the information from
DBpedia</a>
and transform it into a shiny graph using Graphviz.</p>
<p>After running:</p>
<div class="highlight"><pre><span></span><code>$<span class="w"> </span>python<span class="w"> </span>get_influences.py<span class="w"> </span>python<span class="w"> </span>dot<span class="w"> </span><span class="p">|</span><span class="w"> </span>dot<span class="w"> </span>-Tpng<span class="w"> </span>><span class="w"> </span>influences.png
</code></pre></div>
<p>The result is the following graph (<a href="http://files.lolnet.org/alexis/influences.png">see it directly
here</a>):</p>
<p><img alt="Graph of the influences between programming languages." src="http://files.lolnet.org/alexis/influences.png"></p>
<p>While reading this diagram, keep in mind that it is a) not listing all
the languages and b) keeping a Python perspective.</p>
<p>This means that you can trust the diagram when following the arrows from
Python to something and from something to Python. It does not try to
show the relations between all the languages at once, in order to keep
things readable.</p>
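<p>One way to picture this, as a sketch rather than a copy of my script:
assuming the influences have already been collected as (influencer,
influenced) pairs, keep only the pairs that touch Python and print them in
Graphviz’s DOT syntax. Both helper names and the sample pairs are made up
for the illustration.</p>
<div class="highlight"><pre><span></span><code>def python_perspective(edges, focus="Python"):
    """Keep only the (influencer, influenced) pairs that touch the focus."""
    return [(src, dst) for src, dst in edges if focus in (src, dst)]


def to_dot(edges):
    """Render the pairs as a Graphviz digraph in DOT syntax."""
    lines = ["digraph influences {"]
    for src, dst in edges:
        lines.append('    "{}" -> "{}";'.format(src, dst))
    lines.append("}")
    return "\n".join(lines)


# Made-up sample data, only here to exercise the two helpers.
edges = [("ABC", "Python"), ("Python", "Ruby"), ("Lisp", "Ruby")]
print(to_dot(python_perspective(edges)))
</code></pre></div>
<p>The printed text can be piped to <code>dot -Tpng</code>; the pair that
does not involve Python is dropped before rendering.</p>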
<p>It would certainly be possible to show all the connections between all
languages (and the resulting script would be simpler), but the
resulting graph would probably be much less readable.</p>
<p>You can find the script <a href="https://github.com/ametaireau/experiments">on my GitHub
account</a>. Feel free to adapt
it for whatever you want if you feel like hacking.</p>
</article>
<footer>
<a id="feed" href="/feeds/all.atom.xml">
<img alt="RSS Logo" src="/theme/rss.svg" />
</a>
</footer>
</div>
</body>
</html>