<!DOCTYPE html>
<html lang="en">
<head>
<title>Alexis' log - python</title>
<meta charset="utf-8" />
<link rel="stylesheet" href=".././theme/css/main.css" type="text/css" />
<link href=".././feeds/all.atom.xml" type="application/atom+xml" rel="alternate" title="Alexis' log ATOM Feed" />

<!--[if IE]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]-->

<!--[if lte IE 7]>
<link rel="stylesheet" type="text/css" media="all" href=".././css/ie.css"/>
<script src=".././js/IE8.js" type="text/javascript"></script><![endif]-->

<!--[if lt IE 7]>
<link rel="stylesheet" type="text/css" media="all" href=".././css/ie6.css"/><![endif]-->

</head>

<body id="index" class="home">

<a href="http://github.com/ametaireau/">
<img style="position: absolute; top: 0; right: 0; border: 0;" src="http://s3.amazonaws.com/github/ribbons/forkme_right_red_aa0000.png" alt="Fork me on GitHub" />
</a>

<header id="banner" class="body">
<h1><a href="../.">Alexis' log </a></h1>
<nav><ul>
<li><a href=".././pages/projects.html">projects</a></li>
<li ><a href=".././category/asso.html">asso</a></li>
<li ><a href=".././category/dev.html">dev</a></li>
<li ><a href=".././category/python.html">python</a></li>
<li ><a href=".././category/system.html">system</a></li>
<li ><a href=".././category/thoughts.html">thoughts</a></li>
</ul></nav>
</header><!-- /#banner -->

<aside id="featured" class="body">
<article>
<h1 class="entry-title"><a href=".././using-dbpedia-to-get-languages-influences.html">Using dbpedia to get languages influences</a></h1>
<footer class="post-info">
<abbr class="published" title="2011-08-16T00:00:00">
Tue 16 August 2011
</abbr>

<address class="vcard author">
By <a class="url fn" href=".././author/Alexis Métaireau.html">Alexis Métaireau</a>
</address>

<p>In <a href=".././category/python.html">python</a>. </p>
<p>tags: <a href=".././tag/dbpedia.html">dbpedia</a><a href=".././tag/sparql.html">sparql</a><a href=".././tag/python.html">python</a></p>

</footer><!-- /.post-info --><p>While browsing Python's Wikipedia page, I found information about the languages
influenced by Python, and about the languages that influenced Python itself.</p>
<p>It is interesting to know which languages influenced which others, and
it would be even more interesting to have an overview of the connections between
them, keeping Python as the main focus.</p>
<p>This information is available on the Wikipedia page, but not in a format that is
easy to exploit. Fortunately, it also appears in the
information box present on the majority of Wikipedia pages. And… guess what?
There is a project whose goal is to scrape and index all this information in
a more queryable way, using Semantic Web technologies.</p>
<p>As you may have guessed, the project in question is DBpedia. It exposes
information in the form of RDF triples, which are much easier to work with
than plain HTML.</p>
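<p>To give an idea of why triples are convenient, here is a minimal Python sketch. It does not come from DBpedia or from any RDF library: the tuples, the predicate names and the <code>objects</code> helper are all assumptions made up for the illustration.</p>

```python
# Hypothetical triples, written as (subject, predicate, object) tuples.
# Real RDF triples use full URIs; short names keep the example readable.
triples = [
    ("Python", "influencedBy", "ABC"),
    ("Python", "influencedBy", "Haskell"),
    ("Python", "influenced", "Ruby"),
]


def objects(triples, subject, predicate):
    """Return every object attached to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(triples, "Python", "influencedBy"))
```

<p>Answering "what influenced Python?" becomes a simple filter on the predicate, with no HTML parsing involved.</p>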
<p>For instance, let's take the page about Python:
<a class="reference external" href="http://dbpedia.org/page/Python_%28programming_language%29">http://dbpedia.org/page/Python_%28programming_language%29</a></p>
<p>The interesting properties here are "Influenced" and "InfluencedBy", which
allow us to get a list of languages. Unfortunately, they do not really use
all the power of the Semantic Web here: each list is actually a string of
comma-separated values.</p>
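<p>Turning such a string back into a list is a one-liner in Python. This is only a sketch: the sample value below is hypothetical (shaped like the string described above), and <code>split_languages</code> is a name chosen for the example, not a function of the script.</p>

```python
def split_languages(raw):
    """Split a comma-separated property value into clean language names."""
    # Strip surrounding whitespace and skip empty items (e.g. a trailing comma).
    return [name.strip() for name in raw.split(",") if name.strip()]


# Hypothetical value, not copied from DBpedia.
influenced_by = "ABC, C, Haskell, Lisp, Modula-3, Perl, "
print(split_languages(influenced_by))
```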
<p>Anyway, we can use a simple rule: the Wikipedia page of a programming language
is either named after the language itself, or suffixed with
"(programming language)", which is the case for Python.</p>
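<p>The naming rule can be sketched as a function that returns both candidate URLs for a language. This is an illustration, not the code of the script: the <code>candidate_urls</code> name, the order in which the candidates are tried and the capitalisation of the first letter are assumptions.</p>

```python
from urllib.parse import quote

DBPEDIA_PAGE = "http://dbpedia.org/page/"


def candidate_urls(language):
    """Return the two page URLs a language may live under, per the naming rule."""
    base = language.strip().replace(" ", "_")
    # Assumption: page titles start with an uppercase letter.
    base = base[:1].upper() + base[1:]
    names = [base + "_(programming_language)", base]
    # quote() percent-encodes the parentheses, as in the URL of the Python page.
    return [DBPEDIA_PAGE + quote(name) for name in names]


for url in candidate_urls("python"):
    print(url)
```

<p>A caller would then try each URL in turn and keep the first one that actually describes a programming language.</p>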
<p>So I've built <a class="reference external" href="https://github.com/ametaireau/experiments/blob/master/influences/get_influences.py">a tiny script to extract the information from DBpedia</a> and transform it into a shiny graph using Graphviz.</p>
<p>Run it like this:</p>
<pre class="literal-block">
$ python get_influences.py python dot | dot -Tpng > influences.png
</pre>
<p>The result is the following graph (<a class="reference external" href="http://files.lolnet.org/alexis/influences.png">see it directly here</a>):</p>
<img alt="http://files.lolnet.org/alexis/influences.png" src="http://files.lolnet.org/alexis/influences.png" style="width: 800px;" />
<p>While reading this diagram, keep in mind that it a) does not list all
languages and b) keeps a Python perspective.</p>
<p>This means that you can trust the diagram when following an arrow from Python to
another language, or from another language to Python. To keep things readable, it does not try to show
the relations between all the languages at the same time.</p>
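<p>For the curious, here is a minimal sketch of how such a Python-centred graph could be written in Graphviz's DOT format. It is not the actual script: <code>to_dot</code>, its parameters and the sample names are assumptions, and arrows are drawn from the influencing language to the influenced one.</p>

```python
def to_dot(center, influenced_by, influenced):
    """Build a DOT digraph that only contains edges touching `center`."""
    lines = ["digraph influences {"]
    # Languages that influenced the central one point towards it.
    for name in influenced_by:
        lines.append('    "%s" -> "%s";' % (name, center))
    # The central language points towards the ones it influenced.
    for name in influenced:
        lines.append('    "%s" -> "%s";' % (center, name))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data.
print(to_dot("Python", ["ABC", "Haskell"], ["Ruby"]))
```

<p>The printed text can be piped into <code>dot -Tpng</code>, just like the output of the command above.</p>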
<p>It would certainly be possible to show all the connections between all
languages (and the resulting script would be simpler), but the resulting
graph would probably be far less readable.</p>
<p>You can find the script <a class="reference external" href="https://github.com/ametaireau/experiments">on my github account</a>. Feel free to adapt it to
whatever you want if you feel like hacking.</p>
<p>There are <a href=".././using-dbpedia-to-get-languages-influences.html#disqus_thread">comments</a>.</p>
</article>

</aside><!-- /#featured -->

<section id="content" class="body">
<h1>Other articles</h1>
<hr />
<ol id="posts-list" class="hfeed">

<li><article class="hentry">
<header>
<h1><a href=".././pelican-9-months-later.html" rel="bookmark" title="Permalink to Pelican, 9 months later">Pelican, 9 months later</a></h1>
</header>

<div class="entry-content">
<footer class="post-info">
<abbr class="published" title="2011-07-25T00:00:00">
Mon 25 July 2011
</abbr>

<address class="vcard author">
By <a class="url fn" href=".././author/Alexis Métaireau.html">Alexis Métaireau</a>
</address>

<p>In <a href=".././category/dev.html">dev</a>. </p>
<p>tags: <a href=".././tag/pelican.html">pelican</a><a href=".././tag/python.html">python</a><a href=".././tag/open source.html">open source</a><a href=".././tag/nice story.html">nice story</a></p>

</footer><!-- /.post-info -->
<p>Back in October, I released <a class="reference external" href="http://docs.notmyidea.org/alexis/pelican">pelican</a>,
a little piece of code I wrote to power this weblog. I had simple needs: I wanted
to be able to use my text editor of choice (vim), a vcs (mercurial) and
restructured text. I started to write a really simple blog engine
in ...</p>
<a class="readmore" href=".././pelican-9-months-later.html">read more</a>
<p>There are <a href=".././pelican-9-months-later.html#disqus_thread">comments</a>.</p>
</div><!-- /.entry-content -->
</article></li>

<li><article class="hentry">
<header>
<h1><a href=".././using-jpype-to-bridge-python-and-java.html" rel="bookmark" title="Permalink to Using JPype to bridge python and Java">Using JPype to bridge python and Java</a></h1>
</header>

<div class="entry-content">
<footer class="post-info">
<abbr class="published" title="2011-06-11T00:00:00">
Sat 11 June 2011
</abbr>

<address class="vcard author">
By <a class="url fn" href=".././author/Alexis Métaireau.html">Alexis Métaireau</a>
</address>

<p>In <a href=".././category/dev.html">dev</a>. </p>
<p>tags: <a href=".././tag/python.html">python</a><a href=".././tag/java.html">java</a></p>

</footer><!-- /.post-info -->
<p>Java provides some interesting libraries that have no exact equivalent in
python. In my case, the awesome boilerpipe library allows me to remove
uninteresting parts of HTML pages, like menus, footers and other "boilerplate"
contents.</p>
<p>Boilerpipe is written in Java. Two solutions then: using java from python or
reimplement boilerpipe ...</p>
<a class="readmore" href=".././using-jpype-to-bridge-python-and-java.html">read more</a>
<p>There are <a href=".././using-jpype-to-bridge-python-and-java.html#disqus_thread">comments</a>.</p>
</div><!-- /.entry-content -->
</article></li>

</ol><!-- /#posts-list -->
</section><!-- /#content -->

<section id="extras" class="body">

<div class="blogroll">
<h2>blogroll</h2>
<ul>
<li><a href="http://biologeek.org">Biologeek</a></li>
<li><a href="http://filyb.info/">Filyb</a></li>
<li><a href="http://www.libert-fr.com">Libert-fr</a></li>
<li><a href="http://prendreuncafe.com/blog/">N1k0</a></li>
<li><a href="http://ziade.org/blog">Tarek Ziadé</a></li>
<li><a href="http://zubin71.wordpress.com/">Zubin Mithra</a></li>
</ul>
</div><!-- /.blogroll -->

<div class="social">
<h2>social</h2>
<ul>
<li><a href=".././feeds/all.atom.xml" rel="alternate">atom feed</a></li>
<li><a href="http://twitter.com/ametaireau">twitter</a></li>
<li><a href="http://lastfm.com/user/akounet">lastfm</a></li>
<li><a href="http://github.com/ametaireau">github</a></li>
</ul>
</div><!-- /.social -->

</section><!-- /#extras -->

<footer id="contentinfo" class="body">
<address id="about" class="vcard body">
Proudly powered by <a href="http://alexis.notmyidea.org/pelican/">pelican</a>, which takes great advantage of <a href="http://python.org">python</a>.
</address><!-- /#about -->

<p>The theme is by <a href="http://coding.smashingmagazine.com/2009/08/04/designing-a-html-5-layout-from-scratch/">Smashing Magazine</a>, thanks!</p>
</footer><!-- /#contentinfo -->

<script type="text/javascript">
var disqus_shortname = 'blog-notmyidea';
(function () {
var s = document.createElement('script'); s.async = true;
s.type = 'text/javascript';
s.src = 'http://' + disqus_shortname + '.disqus.com/count.js';
(document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
}());
</script>

</body>
</html>