<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Alexis' log</title><link href="http://blog.notmyidea.org" rel="alternate"></link><link href="http://blog.notmyidea.org/feeds/dev.atom.xml" rel="self"></link><id>http://blog.notmyidea.org</id><updated>2011-07-25T00:00:00+02:00</updated><entry><title>Pelican, 9 months later</title><link href="http://blog.notmyidea.org/pelican-9-months-later.html" rel="alternate"></link><updated>2011-07-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-07-25:/pelican-9-months-later.html/</id><summary type="html">&lt;p&gt;Back in October, I released &lt;a class="reference external" href="http://docs.notmyidea.org/alexis/pelican"&gt;pelican&lt;/a&gt;,
a little piece of code I wrote to power this weblog. I had simple needs: I wanted
to be able to use my text editor of choice (vim), a vcs (mercurial) and
restructured text. I started to write a really simple blog engine
in something like a hundred python lines and released it on github.&lt;/p&gt;
&lt;p&gt;And people started contributing. I wasn't at all expecting to see people
interested in such a little piece of code, but it turned out that they were.
I refactored the code twice to let it evolve a bit more and eventually,
in 9 months, got 49 forks, 139 issues and 73 pull requests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Which is clearly awesome.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I pulled features such as translations, tag
clouds, integration with different services such as twitter or piwik, import
from dotclear and rss, fixed
a number of mistakes and improved the codebase a lot. This was proof that
there are a bunch of people who are willing to make better software just for
the sake of fun.&lt;/p&gt;
&lt;p&gt;Thank you, guys, you're why I like open source so much.&lt;/p&gt;
</summary><category term="pelican"></category><category term="python"></category><category term="open source"></category><category term="nice story"></category></entry><entry><title>Using JPype to bridge python and Java</title><link href="http://blog.notmyidea.org/using-jpype-to-bridge-python-and-java.html" rel="alternate"></link><updated>2011-06-11T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-06-11:/using-jpype-to-bridge-python-and-java.html/</id><summary type="html">&lt;p&gt;Java provides some interesting libraries that have no exact equivalent in
python. In my case, the awesome boilerpipe library allows me to remove
uninteresting parts of HTML pages, like menus, footers and other &amp;quot;boilerplate&amp;quot;
contents.&lt;/p&gt;
&lt;p&gt;Boilerpipe is written in Java. Two solutions then: use java from python or
reimplement boilerpipe in python. I will let you guess which one I chose, meh.&lt;/p&gt;
&lt;p&gt;JPype lets you bridge python projects with java libraries. It takes a different
approach from Jython: rather than reimplementing python in Java, both
languages interface at the VM level. This means you need to start a VM
from your python script, but it does the job and stays fully compatible with
CPython and its C extensions.&lt;/p&gt;
&lt;div class="section" id="first-steps-with-jpype"&gt;
&lt;h2&gt;First steps with JPype&lt;/h2&gt;
&lt;p&gt;Once JPype is installed (you'll have to hack some files a bit to integrate it
seamlessly with your system) you can access java classes by doing something
like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jpype&lt;/span&gt;
&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startJVM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getDefaultJVMPath&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="c"&gt;# you can then access the basic java functions&lt;/span&gt;
&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lang&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;hello world&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# and you have to shut down the VM at the end&lt;/span&gt;
&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shutdownJVM&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Okay, now we have a hello world, but what we want is somewhat more complex.
We want to interact with java classes, so we will have to load them.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="interfacing-with-boilerpipe"&gt;
&lt;h2&gt;Interfacing with Boilerpipe&lt;/h2&gt;
&lt;p&gt;To install boilerpipe, you just have to run an ant script:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
$ cd boilerpipe
$ ant
&lt;/pre&gt;
&lt;p&gt;Here is a simple example of how to use boilerpipe in Java, from their sources:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="n"&gt;de&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;l3s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;boilerpipe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;demo&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.URL&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;de.l3s.boilerpipe.extractors.ArticleExtractor&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Oneliner&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="n"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;http://notmyidea.org&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ArticleExtractor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;INSTANCE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getText&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To run it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;javac -cp dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar src/demo/de/l3s/boilerpipe/demo/Oneliner.java
&lt;span class="nv"&gt;$ &lt;/span&gt;java -cp src/demo:dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar de.l3s.boilerpipe.demo.Oneliner
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Yes, this is kind of ugly, sorry for your eyes.
Let's try something similar, but from python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;jpype&lt;/span&gt;

&lt;span class="c"&gt;# start the JVM with the right classpath&lt;/span&gt;
&lt;span class="n"&gt;classpath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;startJVM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getDefaultJVMPath&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;-Djava.class.path=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;classpath&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# get the Java classes we want to use&lt;/span&gt;
&lt;span class="n"&gt;DefaultExtractor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JPackage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;de&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;l3s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;boilerpipe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extractors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DefaultExtractor&lt;/span&gt;

&lt;span class="c"&gt;# call them !&lt;/span&gt;
&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;DefaultExtractor&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INSTANCE&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;jpype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;http://blog.notmyidea.org&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And you get what you want.&lt;/p&gt;
&lt;p&gt;I must say I didn't think it would work so easily. This will allow me to
extract text content from URLs and remove the &lt;em&gt;boilerplate&lt;/em&gt; text easily
for infuse (my master's thesis project), without having to write java code, nice!&lt;/p&gt;
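&lt;p&gt;The python snippet above hardcodes a Unix-style classpath and never shuts the JVM down. Here is a minimal sketch of how the same calls could be wrapped, assuming Python 3 and the same jar locations as above; the helper names build_classpath_option and extract_text are made up for this illustration and are not part of JPype.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import os


def build_classpath_option(jars):
    """Return the -Djava.class.path option for the given jar paths."""
    # os.pathsep is ":" on Unix and ";" on Windows, which is what the JVM expects.
    return "-Djava.class.path=%s" % os.pathsep.join(jars)


def extract_text(url, jars):
    """Start a JVM, run boilerpipe on the URL, and always shut the JVM down."""
    import jpype  # imported lazily so the helper above works without JPype

    jpype.startJVM(jpype.getDefaultJVMPath(), build_classpath_option(jars))
    try:
        extractors = jpype.JPackage("de").l3s.boilerpipe.extractors
        java_url = jpype.JPackage("java").net.URL(url)
        return str(extractors.DefaultExtractor.INSTANCE.getText(java_url))
    finally:
        jpype.shutdownJVM()


# Same jars as in the example above (paths relative to the boilerpipe checkout).
JARS = [
    "dist/boilerpipe-1.1-dev.jar",
    "lib/nekohtml-1.9.13.jar",
    "lib/xerces-2.9.1.jar",
]
print(build_classpath_option(JARS))
```
&lt;/pre&gt;
&lt;p&gt;Calling extract_text("http://blog.notmyidea.org", JARS) would then do the same job as the script above. As far as I know a JVM cannot be restarted once it has been shut down, so a real program would rather start it once and keep it around than call this helper repeatedly.&lt;/p&gt;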
&lt;/div&gt;
</summary><category term="python"></category><category term="java"></category></entry><entry><title>A helping hand for my dissertation!</title><link href="http://blog.notmyidea.org/un-coup-de-main-pour-mon-memoire.html" rel="alternate"></link><updated>2011-05-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-05-25:/un-coup-de-main-pour-mon-memoire.html/</id><summary type="html">&lt;p&gt;That's it, the end is near. THE END. The end of my studies, and the beginning of everything else.
In the meantime I'm working on my final-year dissertation and I could use a little
help.&lt;/p&gt;
&lt;p&gt;My dissertation is about recommender systems. For those who know
last.fm, I'm doing something similar but for websites: based on
what you visit every day and how you visit it (at what
times, from which location, etc.) I want to suggest links
that may interest you, relying on the opinions of people
whose profiles are similar to yours.&lt;/p&gt;
&lt;p&gt;The project is far from finished, but the first step is to collect
browsing data, ideally a lot of browsing data. So if
you can give me a hand I will be eternally
grateful (for those pretending not to understand, read &amp;quot;drinks
are on me&amp;quot;).&lt;/p&gt;
&lt;p&gt;I have created a small website (in English) that sums up the concept and lets
you sign up and download a firefox plugin that will send me
information about the sites you visit (if you usually use
chrome, you could consider switching to firefox4 for the next two
months to help me out). The plugin can be disabled
with a single click if you want to keep your private life private ;-)&lt;/p&gt;
&lt;p&gt;The site is over here: &lt;a class="reference external" href="http://infuse.notmyidea.org"&gt;http://infuse.notmyidea.org&lt;/a&gt;. Once you have downloaded the plugin
and created your account, enter your credentials in the
plugin, and that's all!&lt;/p&gt;
&lt;p&gt;Thanks in advance for your generosity! I will probably collect data over the next 2
months and then analyse it properly.&lt;/p&gt;
&lt;p&gt;Thank you for your help!&lt;/p&gt;
</summary></entry><entry><title>Analyse users' browsing context to build up a web recommender</title><link href="http://blog.notmyidea.org/analyse-users-browsing-context-to-build-up-a-web-recommender.html" rel="alternate"></link><updated>2011-04-01T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-04-01:/analyse-users-browsing-context-to-build-up-a-web-recommender.html/</id><summary type="html">&lt;p&gt;No, this is not an April Fools' joke ;)&lt;/p&gt;
&lt;p&gt;Wow, it's been a long time. My year in Oxford is going really well. I realized
a few days ago that the end of the year is approaching really quickly.
Exams are coming in a month or so and then I'll be working full time on my dissertation topic.&lt;/p&gt;
&lt;p&gt;When I learned we'd have about 6 months to work on something, I first thought
about doing something packaging-related, but finally decided to start something
new. After all, this is a good time to learn.&lt;/p&gt;
&lt;p&gt;For a long time, I've been impressed by the &lt;a class="reference external" href="http://last.fm"&gt;last.fm&lt;/a&gt;
recommender system. They have been &lt;em&gt;scrobbling&lt;/em&gt; the music I listen to for something
like 5 years now and the recommendations they make are really nice and
accurate (I discovered &lt;strong&gt;a lot&lt;/strong&gt; of great artists listening to the
&amp;quot;neighbour radio&amp;quot;). By the way, &lt;a class="reference external" href="http://lastfm.com/user/akounet/"&gt;here is&lt;/a&gt;
my lastfm account.&lt;/p&gt;
&lt;p&gt;So I decided to work on recommender systems, to better understand what they are
about.&lt;/p&gt;
&lt;p&gt;Recommender systems are usually used to increase the sales of products
(like Amazon.com does) which is not really what I'm looking for (those who
know me a bit know I'm kind of sick of all this consumerism going on).&lt;/p&gt;
&lt;p&gt;Actually, the simplest thing I thought of was the web: I browse it almost
every day and new content appears each time. I've stopped following &lt;a class="reference external" href="https://bitbucket.org/bruno/aspirator/"&gt;my feed
reader&lt;/a&gt; because of the
information overload, and drastically reduced the number of people I follow &lt;a class="reference external" href="http://twitter.com/ametaireau/"&gt;on
twitter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Too much information kills the information.&lt;/p&gt;
&lt;p&gt;You have probably guessed what my dissertation topic will be: a recommender system for
the web. Well, such recommender systems already exist, so I will try to add contextual
information to them: you're probably not interested in the same topics at different
times of the day, or depending on the computer you're using. We can also
probably make good use of the way you browse to create groups within the content
you're browsing (or even use the great firefox4 tab group feature).&lt;/p&gt;
&lt;p&gt;There are also serious concerns to address about users' privacy.&lt;/p&gt;
&lt;p&gt;Here is my proposal (copy/pasted from the one I had to write for my master's)&lt;/p&gt;
&lt;div class="section" id="introduction-and-rationale"&gt;
&lt;h2&gt;Introduction and rationale&lt;/h2&gt;
&lt;p&gt;Nowadays, people surf the web more and more often. New web pages are created
each day, so the amount of information to retrieve grows as time
passes. These users use the web in different contexts, from finding cooking
recipes to technical articles.&lt;/p&gt;
&lt;p&gt;A lot of people share the same interest in various topics, and the quantity of
information is such that it's really hard to triage it efficiently without
spending hours doing it. Firstly because of the huge quantity of information,
but also because the triage is relative to each person. However, this
triage can be facilitated by fetching the browsing information of
individuals and putting it in perspective.&lt;/p&gt;
&lt;p&gt;Machine learning is a branch of Artificial Intelligence (AI) which deals with how
a program can learn from data. Recommendation systems are a particular
application area of machine learning which is able to recommend things (links
in our case) to the users, given a particular database containing the previous
choices users have made.&lt;/p&gt;
&lt;p&gt;This browsing information is currently available in browsers. Even if it is not
in a very usable format, it is possible to transform it into something useful.
This information gold mine is just waiting to be used. However, it is not as simple as
it may seem at first: it is important to take care of the context
the user is in while browsing links. For instance, it's more likely that during
the day, a computer scientist will browse computing-related links, and that during
the evening, he will browse cooking recipes or something else.&lt;/p&gt;
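&lt;p&gt;To make the day/evening example concrete, here is a small sketch of how visits could be tagged with a coarse context label before any recommendation happens. The hour boundaries, the label names and the example URLs are arbitrary assumptions chosen for this illustration; the proposal does not define them.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
from datetime import datetime


def context_of(visit_time):
    """Map a visit timestamp to a coarse, made-up context label."""
    # The hour boundaries are illustrative assumptions, not values from the proposal.
    if visit_time.hour in range(9, 18):
        return "day"
    if visit_time.hour in range(18, 23):
        return "evening"
    return "night"


def group_by_context(visits):
    """Group (url, timestamp) pairs into a dict keyed by context label."""
    groups = {}
    for url, visit_time in visits:
        groups.setdefault(context_of(visit_time), []).append(url)
    return groups


# Hypothetical browsing history: one work-related visit, one evening visit.
visits = [
    ("http://docs.python.org/", datetime(2011, 4, 1, 10, 30)),
    ("http://example.org/recipes/", datetime(2011, 4, 1, 20, 15)),
]
print(group_by_context(visits))
```
&lt;/pre&gt;
&lt;p&gt;A recommender could then be trained or queried per label, so that links visited during the day do not drown out the ones visited in the evening.&lt;/p&gt;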
&lt;p&gt;Page contents are also interesting to analyse, because that's what people
browse and what actually contains the most interesting part of the information.
The raw data extracted from the browsing can then be translated into
something more useful (namely tags, type of resource, visit frequency,
navigation context etc.)&lt;/p&gt;
&lt;p&gt;The goal of this dissertation is to create a recommender system for web links,
including this context information.&lt;/p&gt;
&lt;p&gt;At the end of the dissertation, different pieces of software will be provided,
from raw data collection from the browser to a recommendation system.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="background-review"&gt;
&lt;h2&gt;Background Review&lt;/h2&gt;
&lt;p&gt;This dissertation is mainly about data extraction, analysis and recommendation
systems. Two different research areas can be isolated: data preprocessing and
information filtering.&lt;/p&gt;
&lt;p&gt;The first step in order to make recommendations is to gather some data. The
more data we have available, the better it is (T. Segaran, 2007). This data can
be retrieved in various ways; one of them is to get it directly from users'
browsers.&lt;/p&gt;
&lt;div class="section" id="data-preparation-and-extraction"&gt;
&lt;h3&gt;Data preparation and extraction&lt;/h3&gt;
&lt;p&gt;The data gathered from browsers is basically URLs and additional information
about the context of the navigation. There is clearly a need to extract more
information about the meaning of the data the user is browsing, starting with the
content of the web pages.&lt;/p&gt;
&lt;p&gt;Because the information provided on the current Web is not meant to be read by
machines (T. Berners-Lee, 2001), tools are needed to extract meaning from
web pages. The information needs to be preprocessed before being stored in a
machine-readable format, so that recommendations can be made (Choochart et al., 2004).&lt;/p&gt;
&lt;p&gt;Data preparation is composed of two steps: cleaning and structuring
(Castellano et al., 2007). Raw data can contain a lot of unneeded text
(such as menus, headers etc.) and needs to be cleaned before being stored.
Multiple techniques can be used here; they belong to boilerplate removal and
full text extraction (Kohlschütter et al., 2010).&lt;/p&gt;
&lt;p&gt;Then comes structuring the information: category and type of content (news, blog, wiki)
can be extracted from raw data. This kind of information is not clearly defined
by HTML pages, so tools are needed to recognise it.&lt;/p&gt;
&lt;p&gt;Some context-related information can also be inferred from each resource. It can go
from the visit frequency to the navigation group the user was in while
browsing. It is also possible to determine if the user &amp;quot;liked&amp;quot; a resource, and
determine a mark for it, which can be used by the information filtering step
later on (T. Segaran, 2007).&lt;/p&gt;
&lt;p&gt;At this stage, structuring the data is required. Storing this kind of
information in an RDBMS can be a bit tedious and requires complex queries to get
the data back in a usable format. Graph databases can play a major role in
simplifying information storage and querying.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="information-filtering"&gt;
&lt;h3&gt;Information filtering&lt;/h3&gt;
&lt;p&gt;To filter the information, three techniques can be used (Balabanović et
al., 1997):&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The content-based approach states that if a user has liked something in the
past, he is more likely to like similar things in the future. So it's about
establishing a profile for the user and comparing new items against it.&lt;/li&gt;
&lt;li&gt;The collaborative approach will rather recommend items that other similar users
have liked. This approach considers only the relationship between users, and
not the profile of the user we are making recommendations to.&lt;/li&gt;
&lt;li&gt;The hybrid approach, which appeared more recently, combines both of the previous
approaches, giving recommendations when items score high against the user's
profile, or if a similar user already liked them.&lt;/li&gt;
&lt;/ul&gt;
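&lt;p&gt;The three approaches listed above can be sketched in a few lines of Python. This is only a toy illustration, assuming Python 3, with items and profiles described by sets of tags and Jaccard similarity as the measure; the function names, the scoring rules and the sample data are my own assumptions, not something taken from the cited paper.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
def jaccard(a, b):
    """Similarity between two sets: size of the overlap over size of the union."""
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


def content_score(profile_tags, item_tags):
    """Content-based: compare the item against the user's own profile."""
    return jaccard(profile_tags, item_tags)


def collaborative_score(user, item, likes):
    """Collaborative: similarity-weighted share of other users who liked the item."""
    total = weighted = 0.0
    for other, liked in likes.items():
        if other == user:
            continue
        similarity = jaccard(likes[user], liked)
        total += similarity
        if item in liked:
            weighted += similarity
    return weighted / total if total else 0.0


def hybrid_score(user, item, profile_tags, item_tags, likes):
    """Hybrid: score high if either of the two approaches scores high."""
    return max(content_score(profile_tags, item_tags),
               collaborative_score(user, item, likes))


# Hypothetical data: which links (a, b, c, d) each user liked.
likes = {
    "alice": {"a", "b"},
    "bob": {"a", "b", "c"},
    "carol": {"d"},
}
print(collaborative_score("alice", "c", likes))
print(hybrid_score("alice", "c", {"python"}, {"cooking"}, likes))
```
&lt;/pre&gt;
&lt;p&gt;Here link c has nothing in common with the tags in alice's profile, so the content-based score is zero, but bob has tastes similar to hers and liked it, so the collaborative score (and therefore the hybrid one) is high.&lt;/p&gt;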
&lt;p&gt;Grouping is also something to consider at this stage (G. Myatt, 2007).
Because we are dealing with huge amounts of data, it can be useful to detect groups
of data that fit together. Data clustering is able to find such groups (T.
Segaran, 2007).&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Balabanović, M., &amp;amp; Shoham, Y. (1997). Fab: content-based, collaborative
recommendation. Communications of the ACM, 40(3), 66–72. ACM.
Retrieved March 1, 2011, from &lt;a class="reference external" href="http://portal.acm.org/citation.cfm?id=245108.245124"&gt;http://portal.acm.org/citation.cfm?id=245108.245124&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Berners-Lee, T., Hendler, J., &amp;amp; Lassila, O. (2001).
The semantic web. Scientific American, 284(5), 34–43.
Retrieved November 21, 2010, from &lt;a class="reference external" href="http://www.citeulike.org/group/222/article/1176986"&gt;http://www.citeulike.org/group/222/article/1176986&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Castellano, G., Fanelli, A., &amp;amp; Torsello, M. (2007).
LODAP: a LOg DAta Preprocessor for mining Web browsing patterns. Proceedings of the 6th Conference on 6th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases-Volume 6 (p. 12–17). World Scientific and Engineering Academy and Society (WSEAS). Retrieved March 8, 2011, from &lt;a class="reference external" href="http://portal.acm.org/citation.cfm?id=1348485.1348488"&gt;http://portal.acm.org/citation.cfm?id=1348485.1348488&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Kohlschütter, C., Fankhauser, P., &amp;amp; Nejdl, W. (2010). Boilerplate detection using shallow text features. Proceedings of the third ACM international conference on Web search and data mining (p. 441–450). ACM. Retrieved March 8, 2011, from &lt;a class="reference external" href="http://portal.acm.org/citation.cfm?id=1718542"&gt;http://portal.acm.org/citation.cfm?id=1718542&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Myatt, G. J. (2007). Making Sense of Data: A Practical Guide to Exploratory
Data Analysis and Data Mining.&lt;/li&gt;
&lt;li&gt;Segaran, T. (2007). Programming Collective Intelligence.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="privacy"&gt;
&lt;h2&gt;Privacy&lt;/h2&gt;
&lt;p&gt;The first thing that comes to people's minds when it comes to processing their
browsing data is privacy. People don't want to be stalked. That's perfectly
right, and I don't either.&lt;/p&gt;
&lt;p&gt;But such a system doesn't have to deal with people's identities. It's completely
possible to process completely anonymous data, and that's probably what I'm
gonna do.&lt;/p&gt;
&lt;p&gt;By the way, if you have interesting thoughts about that, or if you know of
projects that seem related, fire away in the comments!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="what-s-the-plan"&gt;
&lt;h2&gt;What's the plan ?&lt;/h2&gt;
&lt;p&gt;There are a lot of different things to explore, especially because I'm
a complete novice in that field.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;I want to develop a firefox plugin to extract the browsing information
(still, I need to know exactly which kind of information to retrieve). The
idea is to provide some &lt;em&gt;raw&lt;/em&gt; browsing data, and then to transform it and to
store it in the best possible way.&lt;/li&gt;
&lt;li&gt;Analyse how to store the information in a graph database. What can be the
different methods to store this data and to visualize the relationship
between different pieces of data? How can I define the different contexts,
and add that information to the db?&lt;/li&gt;
&lt;li&gt;Process the data using well known recommendation algorithms. Compare the
results and critically assess their value.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There is plenty of stuff I want to try during this experimentation:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;I want to try using Gephi to visualize the connections between the links
and the contexts&lt;/li&gt;
&lt;li&gt;Try using graph databases such as Neo4j&lt;/li&gt;
&lt;li&gt;Having a deeper look at tools such as scikit.learn (a machine learning
toolkit in python)&lt;/li&gt;
&lt;li&gt;Analyse web pages in order to categorize them, processing their
contents as well to do some keyword-based classification.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Lots of work on its way, yay!&lt;/p&gt;
&lt;/div&gt;
</summary><category term="recommendations"></category><category term="browsers"></category><category term="users"></category></entry><entry><title>Wrap up of the distutils2 paris' sprint</title><link href="http://blog.notmyidea.org/wrap-up-of-the-distutils2-paris-sprint.html" rel="alternate"></link><updated>2011-02-08T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-02-08:/wrap-up-of-the-distutils2-paris-sprint.html/</id><summary type="html">&lt;p&gt;Finally, thanks to a bunch of people that helped me to pay my train and bus
tickets, I've made it to paris for the distutils2 sprint.&lt;/p&gt;
&lt;p&gt;A bit more than 10 people came to the sprint, and it was
very productive. Here's a taste of what we've been working on:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;the &lt;cite&gt;datafiles&lt;/cite&gt;, a way to specify and to handle the installation of files which
are not python-related (pictures, manpages and so on).&lt;/li&gt;
&lt;li&gt;&lt;cite&gt;mkgcfg&lt;/cite&gt;, a tool to help you to create a setup.cfg in minutes (and with funny
examples)&lt;/li&gt;
&lt;li&gt;converters from setup.py scripts. We do now have a piece of code which
reads your current &lt;cite&gt;setup.py&lt;/cite&gt; file and fills in some fields in the &lt;cite&gt;setup.cfg&lt;/cite&gt;
for you.&lt;/li&gt;
&lt;li&gt;a compatibility layer for distutils1, so it can read the &lt;cite&gt;setup.cfg&lt;/cite&gt; you will
write for distutils2 :-)&lt;/li&gt;
&lt;li&gt;the uninstaller, so it's now possible to uninstall what has been installed
by distutils2 (see PEP 376)&lt;/li&gt;
&lt;li&gt;the installer, and the setuptools compatibility layer, which will allow you
to rely on setuptools-based distributions (and there are plenty of them!)&lt;/li&gt;
&lt;li&gt;The compilers, so they are more flexible than they were. Since that's an
obscure part of the code for distutils2 committers (it comes directly from the
distutils1 ages), having some guys who understood the issues here was
a must.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Some people have also tried to port their packaging from distutils1 to
distutils2. They have spotted a number of bugs and made some improvements
to the code, to make it more friendly to use.&lt;/p&gt;
&lt;p&gt;I'm really pleased to see how newcomers went through the code, and started
hacking so fast. I must say it wasn't the case when we started to work on
distutils1 so that's a very good point: people now can hack the code quicker
than they could before.&lt;/p&gt;
&lt;p&gt;Some of the features here are not &lt;em&gt;completely&lt;/em&gt; finished yet, but are in the
pipeline, and will be ready for a release (hopefully) at the end of the week.&lt;/p&gt;
&lt;p&gt;Big thanks to logilab for hosting (and sponsoring my train ticket) and
providing us food, and to bearstech for providing some money for breakfast and
bears^Wbeers.&lt;/p&gt;
&lt;p&gt;Again, a big thanks to all the people who gave me money to pay for the transport,
I really wasn't expecting such a thing to happen :-)&lt;/p&gt;
</summary></entry><entry><title>PyPI on CouchDB</title><link href="http://blog.notmyidea.org/pypi-on-couchdb.html" rel="alternate"></link><updated>2011-01-20T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-01-20:/pypi-on-couchdb.html/</id><summary type="html">&lt;p&gt;For now, there are two ways to retrieve data from PyPI (the Python Package
Index). You can rely either on XML/RPC or on the &amp;quot;simple&amp;quot; API. The simple
API is not as simple to use as the name suggests, and has several
drawbacks.&lt;/p&gt;
&lt;p&gt;Basically, if you want to use information coming from the simple API, you will
have to parse web pages manually, to extract information using some black
voodoo magic. Sadly, magic has a price, and it's sometimes impossible to get
exactly the information you want from this index. That's the technique
currently being used by distutils2, setuptools and pip.&lt;/p&gt;
&lt;p&gt;On the other side, while XML/RPC is working fine, it requires extra work
from the python servers each time you request something, which can lead to
some outages from time to time. Also, it's important to point out that, even if
PyPI has a mirroring infrastructure, it's only for the so-called &lt;em&gt;simple&lt;/em&gt; API,
and not for the XML/RPC.&lt;/p&gt;
&lt;div class="section" id="couchdb"&gt;
&lt;h2&gt;CouchDB&lt;/h2&gt;
&lt;p&gt;Here comes CouchDB. CouchDB is a document-oriented database that
knows how to speak REST and JSON. It's easy to use, and provides
a replication mechanism out of the box.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="so-what"&gt;
&lt;h2&gt;So, what ?&lt;/h2&gt;
&lt;p&gt;Hmm, I'm sure you got it. I've written a piece of software to copy information from
PyPI into a CouchDB instance. You can then replicate the whole PyPI index with only
one HTTP request on the CouchDB server. You can also access the information
from the index directly using a REST API that speaks JSON. Handy.&lt;/p&gt;
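&lt;p&gt;As a rough sketch of that single replication request (this is not part of pypioncouch itself): CouchDB exposes a &lt;cite&gt;_replicate&lt;/cite&gt; endpoint that takes a JSON body naming a source and a target. The source is the main node URL given later in this post; the local server URL, the target database name and the function names are assumptions made for illustration, and the target database is assumed to exist already.&lt;/p&gt;

```python
import json
import urllib.request

# Main node URL taken from this post; the local server URL is a placeholder.
SOURCE_DB = "http://couchdb.notmyidea.org/pypi/"
LOCAL_SERVER = "http://localhost:5984"


def build_replication_request(server, source, target):
    """Build (without sending) the POST request asking `server` to
    replicate the `source` database into the `target` database."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def replicate(server=LOCAL_SERVER, source=SOURCE_DB, target="pypi"):
    """Send the replication request and return CouchDB's JSON answer."""
    request = build_replication_request(server, source, target)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

&lt;p&gt;Calling &lt;cite&gt;replicate()&lt;/cite&gt; would then send the request; building it separately makes it easy to inspect before anything goes over the network.&lt;/p&gt;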
&lt;p&gt;So PyPIonCouch uses the PyPI XML/RPC API to get data from PyPI, and
generates records in the CouchDB instance.&lt;/p&gt;
&lt;p&gt;The final goal is to stop relying on this &amp;quot;simple&amp;quot; API, and to rely on a REST
interface instead. I have set up a CouchDB instance on my server, which is
available at &lt;a class="reference external" href="http://couchdb.notmyidea.org/_utils/database.html?pypi"&gt;http://couchdb.notmyidea.org/_utils/database.html?pypi&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There is not a lot to
see there for now, but I did the first import from PyPI yesterday and all
went fine: it's possible to access the metadata of all PyPI projects via a REST
interface. The next step is to write a client for this REST interface in
distutils2.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="example"&gt;
&lt;h2&gt;Example&lt;/h2&gt;
&lt;p&gt;For now, you can use pypioncouch via the command line, or via the python API.&lt;/p&gt;
&lt;div class="section" id="using-the-command-line"&gt;
&lt;h3&gt;Using the command line&lt;/h3&gt;
&lt;p&gt;You can do something like this for a full import. This &lt;strong&gt;will&lt;/strong&gt; take a long time,
because it fetches all the projects on PyPI and imports their metadata:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
$ pypioncouch --fullimport http://your.couchdb.instance/
&lt;/pre&gt;
&lt;p&gt;If you already have the data on your couchdb instance, you can just update it
with the latest information from PyPI. &lt;strong&gt;However, I recommend simply replicating
the main node, hosted at http://couchdb.notmyidea.org/pypi/&lt;/strong&gt;, to avoid
duplicating nodes:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
$ pypioncouch --update http://your.couchdb.instance/
&lt;/pre&gt;
&lt;p&gt;The main node is updated once a day for now; I'll see whether that's
enough, and adjust over time.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="using-the-python-api"&gt;
&lt;h3&gt;Using the python API&lt;/h3&gt;
&lt;p&gt;You can also use the python API to interact with pypioncouch:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
&amp;gt;&amp;gt;&amp;gt; from pypioncouch import XmlRpcImporter, import_all, update
&amp;gt;&amp;gt;&amp;gt; import_all()
&amp;gt;&amp;gt;&amp;gt; update()
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="what-s-next"&gt;
&lt;h2&gt;What's next ?&lt;/h2&gt;
&lt;p&gt;I want to make a couchapp, in order to navigate PyPI easily. Here are some of
the features I want to propose:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;List all the available projects&lt;/li&gt;
&lt;li&gt;List all the projects, filtered by specifiers&lt;/li&gt;
&lt;li&gt;List all the projects by author/maintainer&lt;/li&gt;
&lt;li&gt;List all the projects by keywords&lt;/li&gt;
&lt;li&gt;Page for each project.&lt;/li&gt;
&lt;li&gt;Provide a PyPI &amp;quot;Simple&amp;quot; API equivalent: even if I want to replace it, I do
think it will be really easy to set up mirrors that way, with the out-of-the-box
couchdb replication&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I also still need to polish the import mechanism, so I can directly store in
couchdb:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The OPML files for each project&lt;/li&gt;
&lt;li&gt;The upload_time in a couchdb-friendly format (list of ints)&lt;/li&gt;
&lt;li&gt;The tags as lists (currently it's only a string separated by spaces)&lt;/li&gt;
&lt;/ul&gt;
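&lt;p&gt;To make the last two items concrete, here is a minimal sketch of what those conversions could look like. It is not the pypioncouch code: the function names and the &lt;cite&gt;upload_time&lt;/cite&gt; and &lt;cite&gt;keywords&lt;/cite&gt; key names are assumptions made for illustration.&lt;/p&gt;

```python
from datetime import datetime


def upload_time_as_list(upload_time):
    """Turn a datetime into [year, month, day, hour, minute, second],
    a list of ints that sorts naturally when used as a CouchDB view key."""
    return [upload_time.year, upload_time.month, upload_time.day,
            upload_time.hour, upload_time.minute, upload_time.second]


def keywords_as_list(keywords):
    """Split a space-separated keywords string into a list of tags."""
    return (keywords or "").split()


def couch_friendly(release):
    """Return a copy of a release metadata dict with both fields converted.
    The `upload_time` and `keywords` key names are assumptions."""
    doc = dict(release)
    if isinstance(doc.get("upload_time"), datetime):
        doc["upload_time"] = upload_time_as_list(doc["upload_time"])
    doc["keywords"] = keywords_as_list(doc.get("keywords"))
    return doc
```

&lt;p&gt;Storing the date as a list of ints makes it possible to query ranges of dates in a view, and storing the tags as a list lets a view emit one row per tag.&lt;/p&gt;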
&lt;p&gt;The work I've done so far is available on
&lt;a class="reference external" href="https://bitbucket.org/ametaireau/pypioncouch/"&gt;https://bitbucket.org/ametaireau/pypioncouch/&lt;/a&gt;. Keep in mind that it's still
a work in progress, and everything can break at any time. However, any feedback
will be appreciated!&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>Help me to go to the distutils2 paris' sprint</title><link href="http://blog.notmyidea.org/help-me-to-go-to-the-distutils2-paris-sprint.html" rel="alternate"></link><updated>2011-01-15T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-01-15:/help-me-to-go-to-the-distutils2-paris-sprint.html/</id><summary type="html">&lt;p&gt;&lt;strong&gt;Edit: Thanks to Logilab and some amazing people, I can make it to Paris for the
sprint. Many thanks to them for the support!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There will be a distutils2 sprint from the 27th to the 30th of January, thanks
to Logilab, which will host the event.&lt;/p&gt;
&lt;p&gt;You can find more information about the sprint on the wiki page of the event
(&lt;a class="reference external" href="http://wiki.python.org/moin/Distutils/SprintParis"&gt;http://wiki.python.org/moin/Distutils/SprintParis&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;I really want to go there, but I'm unfortunately stuck in the UK for money reasons.
The cheapest return trip I've found is about £80, which I can't afford.
Following some advice on #distutils, I've set up a ChipIn account for that, so
if some people want to help me get there, they can give me some
money that way.&lt;/p&gt;
&lt;p&gt;I'll probably work on the installer (to support old distutils and
setuptools distributions) and on the uninstaller (depending on the first
task). If I can't make it to Paris, I'll hang around on IRC to give some help
when needed.&lt;/p&gt;
&lt;p&gt;If you want to contribute some money to help me go there, feel free to use this
chipin page: &lt;a class="reference external" href="http://ametaireau.chipin.com/distutils2-sprint-in-paris"&gt;http://ametaireau.chipin.com/distutils2-sprint-in-paris&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Thanks for your support!&lt;/p&gt;
</summary></entry><entry><title>How to reboot your bebox using the CLI</title><link href="http://blog.notmyidea.org/how-to-reboot-your-bebox-using-the-cli.html" rel="alternate"></link><updated>2010-10-21T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-21:/how-to-reboot-your-bebox-using-the-cli.html/</id><summary type="html">&lt;p&gt;I have an internet connection which, for some obscure reason, tends to be very
slow from time to time. After rebooting the box (yes, that's a drastic solution),
everything seems to work fine again.&lt;/p&gt;
&lt;div class="section" id="edit-using-grep"&gt;
&lt;h2&gt;EDIT : Using grep&lt;/h2&gt;
&lt;p&gt;After a bit of reflection, this is also really easy to do directly with the
command line tools curl, grep and tail (but much harder to read).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;curl -X POST -u joel:joel http://bebox.config/cgi/b/info/restart/&lt;span class="se"&gt;\?&lt;/span&gt;be&lt;span class="se"&gt;\=&lt;/span&gt;0&lt;span class="se"&gt;\&amp;amp;&lt;/span&gt;l0&lt;span class="se"&gt;\=&lt;/span&gt;1&lt;span class="se"&gt;\&amp;amp;&lt;/span&gt;l1&lt;span class="se"&gt;\=&lt;/span&gt;0&lt;span class="se"&gt;\&amp;amp;&lt;/span&gt;tid&lt;span class="se"&gt;\=&lt;/span&gt;RESTART -d &lt;span class="s2"&gt;&amp;quot;0=17&amp;amp;2=`curl -u joel:joel http://bebox.config/cgi/b/info/restart/\?be\=0\&amp;amp;l0\=1\&amp;amp;l1\=0\&amp;amp;tid\=RESTART | grep -o &amp;quot;&lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;2&amp;#39;&lt;/span&gt; &lt;span class="nv"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;0-9&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="se"&gt;\+&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; | grep -o &amp;quot;&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;0-9&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="se"&gt;\+&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot; | tail -n 1`&amp;amp;1&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="the-python-version"&gt;
&lt;h2&gt;The Python version&lt;/h2&gt;
&lt;p&gt;Well, that's not the optimal solution, that's a bit &amp;quot;gruik&amp;quot;, but it works.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib2&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urlparse&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;

&lt;span class="n"&gt;REBOOT_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;/b/info/restart/?be=0&amp;amp;l0=1&amp;amp;l1=0&amp;amp;tid=RESTART&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;BOX_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;http://bebox.config/cgi&amp;#39;&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;open_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;passman&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPPasswordMgrWithDefaultRealm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;passman&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_password&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;authhandler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPBasicAuthHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;passman&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;opener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;build_opener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;authhandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;install_opener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opener&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reboot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;name\=&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;2&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39; value=&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;([0-9]+)&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;0=17&amp;amp;2=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;1&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__file__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;quot;&amp;quot;Reboot your bebox !&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;user&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;username&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;password&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;password&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;boxurl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;boxurl&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOX_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;Base box url.  Default is &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;BOX_URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urlparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urljoin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;boxurl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;REBOOT_URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;reboot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
</summary></entry><entry><title>Dynamically change your gnome desktop wallpaper</title><link href="http://blog.notmyidea.org/dynamically-change-your-gnome-desktop-wallpaper.html" rel="alternate"></link><updated>2010-10-11T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-11:/dynamically-change-your-gnome-desktop-wallpaper.html/</id><summary type="html">&lt;p&gt;In gnome, you can use an XML file to have a dynamic wallpaper.
It's not that easy, though: you can't just say &amp;quot;use the pictures in this
folder&amp;quot;.&lt;/p&gt;
&lt;p&gt;You can have a look at the git repository if you want: &lt;a class="reference external" href="http://github.com/ametaireau/gnome-background-generator"&gt;http://github.com/ametaireau/gnome-background-generator&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Some time ago, I made a little python script to ease that, and you can now
use it too. It's named &amp;quot;gnome-background-generator&amp;quot;, and you can install it via
pip, for instance.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ pip install gnome-background-generator
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then you just have to use it this way:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ gnome-background-generator -p ~/Images/walls -s
/home/alexis/Images/walls/dynamic-wallpaper.xml generated
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here is an extract of the &lt;cite&gt;--help&lt;/cite&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ gnome-background-generator --help
usage: gnome-background-generator [-h] [-p PATH] [-o OUTPUT]
                                  [-t TRANSITION_TIME] [-d DISPLAY_TIME] [-s]
                                  [-b]

A simple command line tool to generate an XML file to use for gnome
wallpapers, to have dynamic walls

optional arguments:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to look for the pictures. If no output is
                        specified, will be used too for outputing the dynamic-
                        wallpaper.xml file. Default value is the current
                        directory (.)
  -o OUTPUT, --output OUTPUT
                        Output filename. If no filename is specified, a
                        dynamic-wallpaper.xml file will be generated in the
                        path containing the pictures. You can also use &amp;quot;-&amp;quot; to
                        display the xml in the stdout.
  -t TRANSITION_TIME, --transition-time TRANSITION_TIME
                        Time (in seconds) transitions must last (default value
                        is 2 seconds)
  -d DISPLAY_TIME, --display-time DISPLAY_TIME
                        Time (in seconds) a picture must be displayed. Default
                        value is 900 (15mn)
  -s, --set-background  &amp;#39;&amp;#39;&amp;#39;try to set the background using gnome-appearance-
                        properties
  -b, --debug
&lt;/pre&gt;&lt;/div&gt;
</summary></entry><entry><title>Pelican, a simple static blog generator in python</title><link href="http://blog.notmyidea.org/pelican-a-simple-static-blog-generator-in-python.html" rel="alternate"></link><updated>2010-10-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-06:/pelican-a-simple-static-blog-generator-in-python.html/</id><summary type="html">&lt;p&gt;Lately, I've written a little python application to fit my blogging needs.
I'm an occasional blogger and a vim lover, and I like restructured text and DVCSes, so
I've made a little tool that makes good use of all that.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://alexis.notmyidea.org/pelican/"&gt;Pelican&lt;/a&gt; (for calepin) is just a simple tool to generate your blog as static
files, letting you using your editor of choice (vim!). It's easy to extend,
and has a template support (via jinja2).&lt;/p&gt;
&lt;p&gt;I've made it to fit &lt;em&gt;my&lt;/em&gt; needs. I hope it will fit yours, but maybe it wont, and
it have not be designed to feet everyone's needs.&lt;/p&gt;
&lt;p&gt;Need an example ? You're looking at it ! This weblog is using pelican to be
generated, also for the atom feeds.&lt;/p&gt;
&lt;p&gt;I've released it under AGPL, since I want all the modifications to be profitable
to all the users.&lt;/p&gt;
&lt;p&gt;You can find a mercurial repository to fork at &lt;a class="reference external" href="http://hg.lolnet.org/pelican/"&gt;http://hg.lolnet.org/pelican/&lt;/a&gt;,
feel free to hack it !&lt;/p&gt;
&lt;p&gt;If you just want to get started, use your installer of choice (pip, easy_install, …)
And then have a look to the help (&lt;cite&gt;pelican --help&lt;/cite&gt;)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pip install pelican
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="section" id="usage"&gt;
&lt;h2&gt;Usage&lt;/h2&gt;
&lt;p&gt;Here's a sample usage of pelican&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pelican .
writing /home/alexis/projets/notmyidea.org/output/index.html
writing /home/alexis/projets/notmyidea.org/output/tags.html
writing /home/alexis/projets/notmyidea.org/output/categories.html
writing /home/alexis/projets/notmyidea.org/output/archives.html
writing /home/alexis/projets/notmyidea.org/output/category/python.html
writing
/home/alexis/projets/notmyidea.org/output/pelican-a-simple-static-blog-generator-in-python.html
Done !
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can also use the &lt;cite&gt;--help&lt;/cite&gt; option on the command line to get more
information&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$pelican&lt;/span&gt; --help
usage: pelican &lt;span class="o"&gt;[&lt;/span&gt;-h&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-t TEMPLATES&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-o OUTPUT&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-m MARKUP&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-s SETTINGS&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-b&lt;span class="o"&gt;]&lt;/span&gt;
               path

A tool to generate a static blog, with restructured text input files.

positional arguments:
  path                  Path where to find the content files &lt;span class="o"&gt;(&lt;/span&gt;default is
                        &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;.

optional arguments:
  -h, --help            show this &lt;span class="nb"&gt;help &lt;/span&gt;message and &lt;span class="nb"&gt;exit&lt;/span&gt;
  -t TEMPLATES, --templates-path TEMPLATES
                        Path where to find the templates. If not specified,
                        will uses the ones included with pelican.
  -o OUTPUT, --output OUTPUT
                        Where to output the generated files. If not specified,
                        a directory will be created, named &lt;span class="s2"&gt;&amp;quot;output&amp;quot;&lt;/span&gt; in the
                        current path.
  -m MARKUP, --markup MARKUP
                        the markup language to use. Currently only
                        ReSTreucturedtext is available.
  -s SETTINGS, --settings SETTINGS
                        the settings of the application. Default to None.
  -b, --debug
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Enjoy :)&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>An amazing summer of code working on distutils2</title><link href="http://blog.notmyidea.org/an-amazing-summer-of-code-working-on-distutils2.html" rel="alternate"></link><updated>2010-08-16T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-08-16:/an-amazing-summer-of-code-working-on-distutils2.html/</id><summary type="html">&lt;p&gt;The &lt;a class="reference external" href="http://code.google.com/soc/"&gt;Google Summer of Code&lt;/a&gt; I've
spent working on &lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt;
is over. It was a really amazing experience, for many reasons.&lt;/p&gt;
&lt;p&gt;First of all, we had a very good team, with 5 students working
on distutils2: &lt;a class="reference external" href="http://zubin71.wordpress.com"&gt;Zubin&lt;/a&gt;,
&lt;a class="reference external" href="http://wokslog.wordpress.com/"&gt;Éric&lt;/a&gt;,
&lt;a class="reference external" href="http://gsoc.djolonga.com/"&gt;Josip&lt;/a&gt;,
&lt;a class="reference external" href="http://konryd.blogspot.com/"&gt;Konrad&lt;/a&gt; and me. In addition,
&lt;a class="reference external" href="http://mouadino.blogspot.com/"&gt;Mouad&lt;/a&gt; have worked on the PyPI
testing infrastructure. You could find what each person have done
on
&lt;a class="reference external" href="http://bitbucket.org/tarek/distutils2/wiki/GSoC_2010_teams"&gt;the wiki page of distutils2&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We were in contact with each other really often, helping one another when
possible (in #distutils), and were continuously aware of the state
of each participant's work. This, in my opinion, kept us
in good shape.&lt;/p&gt;
&lt;p&gt;Then, I've learned a lot. Python packaging was completely new to me
when the GSoC started, and I was pretty unfamiliar with
python good practices too, as I only started learning
python in late 2009.&lt;/p&gt;
&lt;p&gt;I've recently looked at some python code I wrote just three months
ago, and I was amazed at how many improvements I could think of making
to it. I guess this is a good indicator of the path I've traveled
since I wrote it.&lt;/p&gt;
&lt;p&gt;This summer was awesome because I learned about python good
practices, gained some solid
&lt;a class="reference external" href="http://mercurial.selenic.com/"&gt;mercurial&lt;/a&gt; knowledge, and
saw a little of how the python community works.&lt;/p&gt;
&lt;p&gt;Then, I would like to say a big thanks to all the mentors who
hung around when needed, on IRC or via mail, and especially my
mentor for this summer, &lt;a class="reference external" href="http://tarek.ziade.org"&gt;Tarek Ziadé&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thanks a lot for your motivation, your leadership and your
cheerfulness, even with a newborn and a new job!&lt;/p&gt;
&lt;div class="section" id="why"&gt;
&lt;h2&gt;Why ?&lt;/h2&gt;
&lt;p&gt;I wanted to work on python packaging because, as time passed, we
ended up with a rather complex set of tools in this field. Everyone wanted
to add features to distutils, but not in a standard way.&lt;/p&gt;
&lt;p&gt;Now, we have PEPs that describe formats we agreed on (see PEP
345), and we wanted a tool on which users can base their
code: that's &lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="my-job"&gt;
&lt;h2&gt;My job&lt;/h2&gt;
&lt;p&gt;I had to provide a simple way to crawl the PyPI indexes,
and write some installation / uninstallation scripts.&lt;/p&gt;
&lt;p&gt;All the work done is available in
&lt;a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/"&gt;my bitbucket repository&lt;/a&gt;.&lt;/p&gt;
&lt;div class="section" id="crawling-the-pypi-indexes"&gt;
&lt;h3&gt;Crawling the PyPI indexes&lt;/h3&gt;
&lt;p&gt;There are two ways of requesting information from the indexes:
using the &amp;quot;simple&amp;quot; index, which is a kind of REST index, and using
XML-RPC.&lt;/p&gt;
&lt;p&gt;I've done both implementations, and a high level API to query
the two of them. Basically, this supports the mirroring infrastructure
defined in PEP 381. So far, the work I've done is going to be used in
pip (they've basically copy/pasted the code, but this will change as
soon as we get something completely stable for distutils2), and
that's good news, as it was the main reason why I did
it.&lt;/p&gt;
&lt;p&gt;I've tried to provide a unified API for the clients, to make it easy
to switch from one implementation to another. I'm already thinking of
adding other crawlers to this, and it was made to be
extensible.&lt;/p&gt;
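&lt;p&gt;To illustrate the idea of a unified client API, here is a minimal sketch. It is not the distutils2 code: every class and function name below is hypothetical, and the two clients are in-memory stand-ins that never touch the network.&lt;/p&gt;

```python
from abc import ABC, abstractmethod


class IndexClient(ABC):
    """Hypothetical common interface shared by every index client."""

    @abstractmethod
    def get_releases(self, project):
        """Return the list of version strings known for `project`."""


class SimpleIndexClient(IndexClient):
    """Stand-in for a client that scrapes the "simple" index pages."""

    def __init__(self, pages):
        self.pages = pages  # maps a project name to its scraped versions

    def get_releases(self, project):
        return list(self.pages.get(project, []))


class XmlRpcIndexClient(IndexClient):
    """Stand-in for a client that calls the XML-RPC API."""

    def __init__(self, call):
        self.call = call  # callable taking (method_name, project)

    def get_releases(self, project):
        return list(self.call("package_releases", project))


def has_release(client, project, version):
    """Caller code depends only on IndexClient, not on the transport."""
    return version in client.get_releases(project)
```

&lt;p&gt;With this shape, adding another crawler only means writing one more subclass; code such as &lt;cite&gt;has_release&lt;/cite&gt; does not change.&lt;/p&gt;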
&lt;p&gt;If you want more information about the crawlers/PyPI
clients, please refer to the distutils2 documentation, especially
&lt;a class="reference external" href="http://distutils2.notmyidea.org/library/distutils2.index.html"&gt;the pages about indexes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can find the changes I made about this in the
&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt; source code.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="installation-uninstallation-scripts"&gt;
&lt;h3&gt;Installation / Uninstallation scripts&lt;/h3&gt;
&lt;p&gt;The next step was to think about an installation script, and an
uninstaller. I haven't done the uninstaller part yet; it's a small
part, as it basically removes some files from the system, so
I'll probably do it in the near future.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt; provides a way to
install distributions, and to handle dependencies between releases.
For now, this only supports the latest version of the
METADATA (1.2, see PEP 345), but I'm working on a
compatibility layer for the old metadata, and for the information
provided via PIP requires.txt, for instance.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="extra-work"&gt;
&lt;h3&gt;Extra work&lt;/h3&gt;
&lt;p&gt;Also, I've done some extra work. This includes:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;working on PEP 345, and taking part in some discussions about it
(about the names of some fields).&lt;/li&gt;
&lt;li&gt;writing a PyPI server mock, useful for tests. You can find more
information about it in the
&lt;a class="reference external" href="http://distutils.notmyidea.org"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="futures-plans"&gt;
&lt;h2&gt;Future plans&lt;/h2&gt;
&lt;p&gt;As I said, I've enjoyed working on distutils2, and the people I've
met here are really pleasant to work with. So I &lt;em&gt;want&lt;/em&gt; to continue
contributing to python, and especially to python packaging, because
there are still a lot of things to do in this area to get
something really usable.&lt;/p&gt;
&lt;p&gt;I'm not fully satisfied with the work I've done, so I'll probably
tweak it a bit: the installer part is not yet completely finished,
and I want to add support for a real
&lt;a class="reference external" href="http://en.wikipedia.org/wiki/Representational_State_Transfer"&gt;REST&lt;/a&gt;
index in the future.&lt;/p&gt;
&lt;p&gt;We'll talk again of this in the next months, probably, but we
definitely need a real
&lt;a class="reference external" href="http://en.wikipedia.org/wiki/Representational_State_Transfer"&gt;REST&lt;/a&gt;
API for &lt;a class="reference external" href="http://pypi.python.org"&gt;PyPI&lt;/a&gt;, as the &amp;quot;simple&amp;quot; index
&lt;em&gt;is&lt;/em&gt; an ugly hack, in my opinion. I'll work on a serious
proposition about this, maybe involving
&lt;a class="reference external" href="http://couchdb.org"&gt;CouchDB&lt;/a&gt;, as it seems to be a good option
for what we want here.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="issues"&gt;
&lt;h2&gt;Issues&lt;/h2&gt;
&lt;p&gt;I've encountered some issues during this summer. The main one is
that it's hard to work remotely, especially when working in the same
room we live in, with other people around. I like to just think about
a project with other people, a paper and a pencil, no computers. This
was not really possible at the start of the project, as I needed to
read a lot of code to understand the codebase, and then to
read/write emails.&lt;/p&gt;
&lt;p&gt;I finally managed to work in an office, so that's a good point for
home/office separation.&lt;/p&gt;
&lt;p&gt;I hadn't expected there would be such a high number of emails to read
in order to follow what's going on in the python world. Being a part of
the community takes some time spent reading and writing emails,
especially for those (like me) who aren't so comfortable with
English (but this has brought me some English fu!).&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="thanks"&gt;
&lt;h2&gt;Thanks!&lt;/h2&gt;
&lt;p&gt;A big thanks to &lt;a class="reference external" href="http://www.graine-libre.fr/"&gt;Graine Libre&lt;/a&gt; and
&lt;a class="reference external" href="http://www.makina-corpus.com/"&gt;Makina Corpus&lt;/a&gt;, who offered
to let me come into their offices from time to time and share their
cheerfulness! Many thanks too to the Google Summer of Code program
for setting up such an initiative. If you're a student and you're
interested in FOSS, don't hesitate for a second: it's a really good
opportunity to work on interesting projects!&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>Introducing the distutils2 index crawlers</title><link href="http://blog.notmyidea.org/introducing-the-distutils2-index-crawlers.html" rel="alternate"></link><updated>2010-07-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-07-06:/introducing-the-distutils2-index-crawlers.html/</id><summary type="html">&lt;p&gt;I've been working on distutils2 for about a month now, even though I've been a
bit busy (as I had some classes and exams to work on).&lt;/p&gt;
&lt;p&gt;I'll try to sum up my general feelings here, and the work I've done
so far. If you're interested, you can also find my weekly
summaries in
&lt;a class="reference external" href="http://wiki.notmyidea.org/distutils2_schedule"&gt;a dedicated wiki page&lt;/a&gt;.&lt;/p&gt;
&lt;div class="section" id="general-feelings"&gt;
&lt;h2&gt;General feelings&lt;/h2&gt;
&lt;p&gt;First, and it's a really important point, the GSoC is going very
well, for me as for the other students, at least from my perspective.
It's a pleasure to work with such enthusiastic people, as this makes
the overall atmosphere very pleasant.&lt;/p&gt;
&lt;p&gt;To begin with, I spent time reading the existing codebase, and
understanding what we're going to do and the rationale behind
it.&lt;/p&gt;
&lt;p&gt;It's really clear to me now: what we're building is the
foundation of a packaging infrastructure in python. The fact is
that many projects coexist, each with its own good
concepts. Distutils2 tries to take the interesting parts of each,
and to provide them in the python standard library, following the
recently written PEPs about packaging.&lt;/p&gt;
&lt;p&gt;With distutils2, it will be simpler to make &amp;quot;things&amp;quot; compatible. So
if you think about a new way to deal with distributions and
packaging in python, you can use the Distutils2 APIs to do so.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="tasks"&gt;
&lt;h2&gt;Tasks&lt;/h2&gt;
&lt;p&gt;My main task while working on distutils2 is to provide an
installation and an uninstallation command, as described in PEP
376. For this, I first need to get information about the existing
distributions (their version, name, metadata, dependencies,
etc.).&lt;/p&gt;
&lt;p&gt;The main index, which you probably know and use, is PyPI. You can access
it at &lt;a class="reference external" href="http://pypi.python.org"&gt;http://pypi.python.org&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="pypi-index-crawling"&gt;
&lt;h2&gt;PyPI index crawling&lt;/h2&gt;
&lt;p&gt;There are two ways to get this information from PyPI: using the
simple API, or via XML-RPC calls.&lt;/p&gt;
&lt;p&gt;A goal was to use the version specifiers defined
in &lt;a class="reference external" href="http://www.python.org/dev/peps/pep-0345/"&gt;PEP 345&lt;/a&gt; and to
provide a way to sort the grabbed distributions depending on our
needs, to pick the version we want or need.&lt;/p&gt;
&lt;div class="section" id="using-the-simple-api"&gt;
&lt;h3&gt;Using the simple API&lt;/h3&gt;
&lt;p&gt;The simple API is composed of HTML pages you can access at
&lt;a class="reference external" href="http://pypi.python.org/simple/"&gt;http://pypi.python.org/simple/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Distribute and Setuptools already provide a crawler for that, but
it is tied to their internal mechanisms, and I found that the code
was not as clear as I wanted. That's why I preferred to pick up
the good ideas and some implementation details, while rethinking
the overall architecture.&lt;/p&gt;
&lt;p&gt;The rules are simple: each project has a dedicated page, which
allows us to get information about:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;the distribution download locations (for some versions)&lt;/li&gt;
&lt;li&gt;homepage links&lt;/li&gt;
&lt;li&gt;some other useful information, such as the bug tracker address, for
instance.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you want to find all the distributions of the &amp;quot;EggsAndSpam&amp;quot;
project, you could do the following (don't pay too much attention to
the names here, as the API will probably change a bit):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SimpleIndex&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;EggsAndSpam&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;EggsAndSpam&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EggsAndSpam&lt;/span&gt; &lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EggsAndSpam&lt;/span&gt; &lt;span class="mf"&gt;1.3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We could also use version specifiers:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;EggsAndSpam (&amp;lt;=1.2)&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;EggsAndSpam&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;EggsAndSpam&lt;/span&gt; &lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Internally, here is what happens:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;it processes the
&lt;a class="reference external" href="http://pypi.python.org/simple/FooBar/"&gt;http://pypi.python.org/simple/FooBar/&lt;/a&gt;
page, searching for download URLs.&lt;/li&gt;
&lt;li&gt;for each distribution download URL found, it creates an object
containing information about the project name, the version and the
URL where the archive is located.&lt;/li&gt;
&lt;li&gt;it sorts the distributions found, using version numbers. The
default behavior here is to prefer source distributions (over
binary ones), and to rely on the latest &amp;quot;final&amp;quot; distribution (rather
than beta, alpha, etc. ones).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, nothing hard or difficult here.&lt;/p&gt;
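&lt;p&gt;The last two steps above can be sketched roughly as follows. This is not the distutils2 implementation: the Release tuple, parse_release, sort_releases, the example URLs and the deliberately naive filename pattern are all assumptions made for illustration, and the link extraction step is left out (the URLs are given directly).&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import re
from collections import namedtuple

# Hypothetical container for what the crawler knows about one archive.
Release = namedtuple("Release", "name version is_final is_source url")

# Naive pattern: "Name-1.2.tar.gz", "Name-1.3b1.zip", "Name-1.1-py2.6.egg".
FILENAME = re.compile(
    r"^([A-Za-z]+)-(\d+(?:\.\d+)*)((?:a|b|rc)\d+)?(?:-py[\d.]+)?\.(tar\.gz|zip|egg)$"
)


def parse_release(url):
    """Build a Release from a download URL, or return None if unrecognised."""
    filename = url.rsplit("/", 1)[-1]
    match = FILENAME.match(filename)
    if match is None:
        return None
    name, number, prerelease, extension = match.groups()
    version = tuple(int(part) for part in number.split("."))
    return Release(name, version, prerelease is None, extension != "egg", url)


def sort_releases(urls):
    """Sort releases so that the preferred one comes first."""
    releases = [r for r in map(parse_release, urls) if r is not None]
    # Final releases first, then highest version, then source over binary.
    return sorted(
        releases,
        key=lambda r: (r.is_final, r.version, r.is_source),
        reverse=True,
    )


# Made-up URLs standing in for the links found on a project page.
urls = [
    "http://example.org/EggsAndSpam-1.1.tar.gz",
    "http://example.org/EggsAndSpam-1.2-py2.6.egg",
    "http://example.org/EggsAndSpam-1.2.tar.gz",
    "http://example.org/EggsAndSpam-1.3b1.tar.gz",
]
best = sort_releases(urls)[0]
print(best.url)
```
&lt;/pre&gt;
&lt;p&gt;Putting the "final" flag first in the sort key is what makes a 1.2 final release win over a 1.3 beta in this sketch.&lt;/p&gt;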
&lt;p&gt;We provide a bunch of other features, like relying on the new PyPI
mirroring infrastructure or filtering the distributions found by some
criteria. If you're curious, please browse the
&lt;a class="reference external" href="http://distutils2.notmyidea.org/"&gt;distutils2 documentation&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="using-xml-rpc"&gt;
&lt;h3&gt;Using xml-rpc&lt;/h3&gt;
&lt;p&gt;We can also make some XML-RPC calls to retrieve information from
PyPI. It's a much more reliable way to get information from
the index (as it's the index itself that provides the information),
but it costs processing time on the remote PyPI server.&lt;/p&gt;
&lt;p&gt;For now, querying PyPI through XML-RPC is not available in
Distutils2, as I'm still working on it. The main pieces are already
present (I'll reuse some work I've done for the SimpleIndex
querying, and
&lt;a class="reference external" href="http://github.com/ametaireau/pypiclient"&gt;some code already set up&lt;/a&gt;);
what I need to do is provide an XML-RPC PyPI mock server, and
that's what I'm currently working on.&lt;/p&gt;
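&lt;p&gt;As an illustration only, a mock XML-RPC index of this kind could look like the sketch below, written with the Python 3 standard library. It is not the distutils2 mock: FAKE_INDEX and start_mock_server are hypothetical, and package_releases only mimics the name of the PyPI XML-RPC call, answering from made-up data.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Made-up index content served by the mock.
FAKE_INDEX = {"EggsAndSpam": ["1.1", "1.2", "1.3"]}


def package_releases(package_name):
    """Answer like an index would, but from the fake data above."""
    return FAKE_INDEX.get(package_name, [])


def start_mock_server():
    """Start the mock on a free local port and return (server, url)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(package_releases)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    host, port = server.server_address
    return server, "http://%s:%d/" % (host, port)


server, url = start_mock_server()
client = ServerProxy(url)
releases = client.package_releases("EggsAndSpam")
print(releases)
server.shutdown()
server.server_close()
```
&lt;/pre&gt;
&lt;p&gt;Tests can then point the client at the mock's URL instead of the real index, so they never touch the network.&lt;/p&gt;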
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="processes"&gt;
&lt;h2&gt;Processes&lt;/h2&gt;
&lt;p&gt;For now, I'm trying to follow the &amp;quot;documentation, then test, then
code&amp;quot; path, and that seems to be really needed while working with a
community. Code is harder to read and understand than
documentation, and documentation is easier to change.&lt;/p&gt;
&lt;p&gt;While writing the simple index crawler, I should have done this
to avoid some changes to the API, and some lost time.&lt;/p&gt;
&lt;p&gt;Also, I've set up
&lt;a class="reference external" href="http://wiki.notmyidea.org/distutils2_schedule"&gt;a schedule&lt;/a&gt;, and
the goal is to be sure everything will be ready in time, for the
end of the summer. (And now, I need to learn to follow schedules
...)&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>Sprinting on distutils2 in Tours</title><link href="http://blog.notmyidea.org/sprinting-on-distutils2-in-tours.html" rel="alternate"></link><updated>2010-07-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-07-06:/sprinting-on-distutils2-in-tours.html/</id><summary type="html">&lt;p&gt;Yesterday, as I was traveling to Tours, I took some time to
visit Éric, another student who's working on distutils2 this
summer as part of the GSoC. Basically, it was to have a drink,
talk a bit about distutils2, our respective tasks and general
feelings, and to put a face to a pseudonym. I really enjoyed this
time, because Éric knows a lot about mercurial and python
good practices, and I'm eager to learn about those. So, we
discussed things and didn't write much code, but we have some
things to propose about documentation, and I also share
here some snippets of the conversations we had.&lt;/p&gt;
&lt;div class="section" id="documentation"&gt;
&lt;h2&gt;Documentation&lt;/h2&gt;
&lt;p&gt;While writing the PyPI simple index crawler documentation, I
realized that we're missing some structure, or a how-to, for the
documentation. Yep, you read that right. We lack documentation on how to
make documentation. Heh. We're missing some rules to follow, and
this leads to a not-so-structured final documentation. We probably
target three kinds of audiences, and we can split the documentation
accordingly:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;strong&gt;Packagers&lt;/strong&gt; who want to distribute their software.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;End users&lt;/strong&gt; who need to understand how to use end user
commands, like the installer/uninstaller.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Packaging coders&lt;/strong&gt; who &lt;em&gt;use&lt;/em&gt; distutils2 as a base for
building a package manager.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We also need to discuss a pattern to follow while writing
documentation. How many parts do we need? Where should the API
description go? Etc. That may not seem so important, but I
guess the readers would appreciate having the same structure
throughout the distutils2 documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="mercurial"&gt;
&lt;h2&gt;Mercurial&lt;/h2&gt;
&lt;p&gt;I'm really &lt;em&gt;not&lt;/em&gt; a mercurial power user. I use it on a daily basis,
but I lack basic knowledge about it. Big thanks to Éric for sharing
yours with me, you're a great help. We talked about some
mercurial extensions that seem to make life simpler when
used the right way. I've not used them so far, so consider this
a personal note.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;hg histedit, to edit the history&lt;/li&gt;
&lt;li&gt;hg crecord, to select the changes to commit&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We spent some time reviewing a merge I made on Sunday,
re-merging it, and committing the changes as a new changeset. Awesome.
These things make me say I &lt;strong&gt;need&lt;/strong&gt; to read
&lt;a class="reference external" href="http://hgbook.red-bean.com/read/"&gt;the hg book&lt;/a&gt;, and I will as
soon as I get some spare time: mercurial seems to be simply great.
So ... Great. I'm a powerful merger now!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="on-using-tools"&gt;
&lt;h2&gt;On using tools&lt;/h2&gt;
&lt;p&gt;Because we are &lt;em&gt;also&lt;/em&gt; &lt;em&gt;hackers&lt;/em&gt;, we shared a bit about the way we
code, the tools we use, etc. Both of us use vim, and I
discovered vimdiff and hgtk, which will completely change the way I
navigate the mercurial history. We aren't &amp;quot;power users&amp;quot;, so we
learned vim tips from each other. You can find
&lt;a class="reference external" href="http://github.com/ametaireau/dotfiles"&gt;my dotfiles on github&lt;/a&gt;,
if that helps. They're not perfect, and not intended to be,
because they change all the time as I learn. Don't hesitate to have a
look, and to propose enhancements if you have any!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="on-being-pythonic"&gt;
&lt;h2&gt;On being pythonic&lt;/h2&gt;
&lt;p&gt;My background as a longtime Java user has been a disservice so far, as the
paradigms are not the same when coding in python. It's hard to find the
most pythonic way to do things, and sometimes hard to unlearn my way of
thinking about software engineering. Well, it seems that the only
solution is to read code, and to re-read &amp;quot;import this&amp;quot; from time to
time!
&lt;a class="reference external" href="http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html"&gt;Coding like a pythonista&lt;/a&gt;
seems to be a must-read, so I know what to do.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;It was really great. Next time, we'll need to focus a bit more on
distutils2, and to have a bullet list of things to do, but days
like this one are opportunities to seize! We'll probably do
another sprint in a few weeks, stay tuned!&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>Use Restructured Text (ReST) to power your presentations</title><link href="http://blog.notmyidea.org/use-restructured-text-rest-to-power-your-presentations.html" rel="alternate"></link><updated>2010-06-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-06-25:/use-restructured-text-rest-to-power-your-presentations.html/</id><summary type="html">&lt;p&gt;On Wednesday, some friends and I gave a presentation about the
CouchDB database to
&lt;a class="reference external" href="http://www.toulibre.org"&gt;the Toulouse local LUG&lt;/a&gt;. Thanks a lot
to everyone who attended, it was a pleasure to talk
about this topic with you. Too bad the season is over now and I'm leaving
Toulouse next year.&lt;/p&gt;
&lt;p&gt;During our brainstorming about the topic, we
used some paper, and we wanted to make the presentation in the simplest
way. The first thing that came to my mind was using
&lt;a class="reference external" href="http://docutils.sourceforge.net/rst.html"&gt;restructured text&lt;/a&gt;, so
I wrote a simple file containing our different bullet points. In
fact, there is almost nothing left to do after that to have a working
presentation.&lt;/p&gt;
&lt;p&gt;So far, I've used
&lt;a class="reference external" href="http://code.google.com/p/rst2pdf/"&gt;the rst2pdf program&lt;/a&gt; and a
simple template to generate the output. It's probably simple to get
similar results using latex + beamer, and I'll try this next time, but
as I'm not familiar with the latex syntax, restructured text was a
great option.&lt;/p&gt;
&lt;p&gt;Here are
&lt;a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/couchdb/couchdb.pdf"&gt;the final PDF output&lt;/a&gt;,
&lt;a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/couchdb/couchdb.rst"&gt;the ReST source&lt;/a&gt;,
&lt;a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/slides.style"&gt;the theme used&lt;/a&gt;,
and the command line to generate the PDF:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
rst2pdf couchdb.rst -b1 -s ../slides.style
&lt;/pre&gt;
</summary></entry><entry><title>first week working on distutils2</title><link href="http://blog.notmyidea.org/first-week-working-on-distutils2.html" rel="alternate"></link><updated>2010-06-04T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-06-04:/first-week-working-on-distutils2.html/</id><summary type="html">&lt;p&gt;As I've been working on
&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;Distutils2&lt;/a&gt; during the past
week, as part of the
&lt;a class="reference external" href="http://code.google.com/intl/fr/soc/"&gt;GSOC&lt;/a&gt; program, here is a
short summary of what I've done so far.&lt;/p&gt;
&lt;p&gt;As my courses are not over yet, I've not worked as much as I
wanted, and this will continue until the end of June. My main
tasks are about making installation and uninstallation commands, to
have a simple way to install distributions via
&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;Distutils2&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To do this, we need to rely on information provided by the Python
Package Index (&lt;a class="reference external" href="http://pypi.python.org/"&gt;PyPI&lt;/a&gt;), and there are at
least two ways to retrieve information from it: XML-RPC and the
&amp;quot;simple&amp;quot; API.&lt;/p&gt;
&lt;p&gt;So, I've been working on porting some
&lt;a class="reference external" href="http://bitbucket.org/tarek/distribute/"&gt;Distribute&lt;/a&gt; related
stuff to &lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;Distutils2&lt;/a&gt;, cutting
out all the non-distutils things, as we do not want to depend on
Distribute's internals. My main work has been about reading the
whole code, writing tests for it and making those tests
possible.&lt;/p&gt;
&lt;p&gt;In fact, there was a need for a mock PyPI server, and, after
reading and getting familiar with the distutils behaviors and code,
I took some time to improve the work
&lt;a class="reference external" href="http://bitbucket.org/konrad"&gt;Konrad&lt;/a&gt; did on this mock.&lt;/p&gt;
&lt;div class="section" id="a-pypi-server-mock"&gt;
&lt;h2&gt;A PyPI Server mock&lt;/h2&gt;
&lt;p&gt;The mock is embedded in a thread, to make it available during the
tests in a non-blocking way. We first used
&lt;a class="reference external" href="http://wsgi.org"&gt;WSGI&lt;/a&gt; and
&lt;a class="reference external" href="http://docs.python.org/library/wsgiref.html"&gt;wsgiref&lt;/a&gt; in order
to control what to serve, and to log the requests made to the server,
but finally realised that
&lt;a class="reference external" href="http://docs.python.org/library/wsgiref.html"&gt;wsgiref&lt;/a&gt; is not
python 2.4 compatible (and we &lt;em&gt;need&lt;/em&gt; to be python 2.4 compatible in
Distutils2).&lt;/p&gt;
&lt;p&gt;So, we switched to
&lt;a class="reference external" href="http://docs.python.org/library/basehttpserver.html"&gt;BaseHTTPServer&lt;/a&gt;
and
&lt;a class="reference external" href="http://docs.python.org/library/simplehttpserver.html"&gt;SimpleHTTPServer&lt;/a&gt;,
and updated our tests accordingly. It's been an opportunity to
realize that &lt;a class="reference external" href="http://wsgi.org"&gt;WSGI&lt;/a&gt; has been a great step
forward for making HTTP servers, and exposes a much simpler way
to talk HTTP!&lt;/p&gt;
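&lt;p&gt;To give an idea of the approach, here is a minimal sketch of an HTTP mock running in a background thread. It is not the distutils2 test code: it targets Python 3, where BaseHTTPServer and SimpleHTTPServer have been merged into http.server, and MockHandler, PAGES and start_mock are hypothetical names with made-up content.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up pages the mock should serve, keyed by request path.
PAGES = {"/simple/EggsAndSpam/": b"EggsAndSpam-1.2.tar.gz"}


class MockHandler(BaseHTTPRequestHandler):
    """Serve canned pages and remember which paths were requested."""

    def do_GET(self):
        self.server.requests.append(self.path)
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the test output quiet.
        pass


def start_mock():
    """Run the mock in a background thread; return (server, port)."""
    server = HTTPServer(("127.0.0.1", 0), MockHandler)
    server.requests = []
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server, server.server_address[1]


server, port = start_mock()
connection = HTTPConnection("127.0.0.1", port)
connection.request("GET", "/simple/EggsAndSpam/")
content = connection.getresponse().read()
connection.close()
requested = list(server.requests)
server.shutdown()
server.server_close()
print(content, requested)
```
&lt;/pre&gt;
&lt;p&gt;Binding to port 0 lets the operating system pick a free port, and the recorded request list is what a test can assert against.&lt;/p&gt;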
&lt;p&gt;You can find
&lt;a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/changesets"&gt;the modifications I made&lt;/a&gt;,
and the
&lt;a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/src/tip/docs/source/test_framework.rst"&gt;related docs&lt;/a&gt;
about this on
&lt;a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/"&gt;my bitbucket distutils2 clone&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-pypi-simple-api"&gt;
&lt;h2&gt;The PyPI Simple API&lt;/h2&gt;
&lt;p&gt;So, back to the main problem: making a python library to access
and request information stored on PyPI, via the simple API. As I
said, I've just grabbed the work done in
&lt;a class="reference external" href="http://bitbucket.org/tarek/distribute/"&gt;Distribute&lt;/a&gt;, and played
a bit with it, in order to see what the different use cases are, and
started to write the related tests.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-work-to-come"&gt;
&lt;h2&gt;The work to come&lt;/h2&gt;
&lt;p&gt;So, once all use cases are covered with tests, I'll rewrite the
grabbed code a bit, and do some software design work (to not expose
everything as private methods, have a clear API, and other things like
this), then update the tests accordingly and write documentation
to make this clear.&lt;/p&gt;
&lt;p&gt;The next step is to write a little client, as I've
&lt;a class="reference external" href="http://github.com/ametaireau/pypiclient"&gt;already started here&lt;/a&gt;.
I'll keep you updated!&lt;/p&gt;
&lt;/div&gt;
</summary></entry><entry><title>A Distutils2 GSoC</title><link href="http://blog.notmyidea.org/a-distutils2-gsoc.html" rel="alternate"></link><updated>2010-05-01T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-05-01:/a-distutils2-gsoc.html/</id><summary type="html">&lt;p&gt;WOW. I've been accepted to be a part of the
&lt;a class="reference external" href="http://code.google.com/intl/fr/soc/"&gt;Google Summer Of Code&lt;/a&gt;
program, and will work on &lt;a class="reference external" href="http://python.org/"&gt;python&lt;/a&gt;
&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt;, with
&lt;a class="reference external" href="http://pygsoc.wordpress.com/"&gt;a&lt;/a&gt;
&lt;a class="reference external" href="http://konryd.blogspot.com/"&gt;lot&lt;/a&gt; &lt;a class="reference external" href="http://ziade.org/"&gt;of&lt;/a&gt;
(interesting!) &lt;a class="reference external" href="http://zubin71.wordpress.com/"&gt;people&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
So, it's about building the successor of Distutils, i.e. &amp;quot;the
python package manager&amp;quot;. Today, there are too many ways to package a
python application (pip, setuptools, distribute, distutils, etc.),
so there is a huge effort to make in order to make all this
packaging stuff interoperable, as pointed out by
&lt;a class="reference external" href="http://www.python.org/dev/peps/pep-0376/"&gt;PEP 376&lt;/a&gt;.&lt;/blockquote&gt;
&lt;p&gt;In more detail, I'm going to work on the Installer / Uninstaller
features of Distutils2, and on a PyPI XML-RPC client for distutils2.
Here are the already defined tasks:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Implement Distutils2 APIs described in PEP 376.&lt;/li&gt;
&lt;li&gt;Add the uninstall command.&lt;/li&gt;
&lt;li&gt;Think about a basic installer / uninstaller script (with deps),
similar to pip/easy_install.&lt;/li&gt;
&lt;li&gt;In a pypi subpackage:&lt;/li&gt;
&lt;li&gt;Integrate a module similar to setuptools' package_index.&lt;/li&gt;
&lt;li&gt;PyPI XML-RPC client for distutils 2:
&lt;a class="reference external" href="http://bugs.python.org/issue8190"&gt;http://bugs.python.org/issue8190&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As I'm relatively new to python, I'll need some extra work in order
to apply all the good practices, among other things that can make a
developer's life joyful. I'll post my progress here each week,
along with my thoughts about python and especially the python packaging world.&lt;/p&gt;
</summary></entry><entry><title>Python ? go !</title><link href="http://blog.notmyidea.org/python-go.html" rel="alternate"></link><updated>2009-12-17T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2009-12-17:/python-go.html/</id><summary type="html">&lt;p&gt;I've now been working for a little over a month on a
project in &lt;a class="reference external" href="http://www.djangoproject.org"&gt;django&lt;/a&gt;, and so,
necessarily, I'm learning &lt;a class="reference external" href="http://python.org/"&gt;Python&lt;/a&gt;. I
take undisguised pleasure in discovering this language (and in
using it), and it never ceases to surprise me. The first words that
come to mind about Python are &amp;quot;logical&amp;quot; and
&amp;quot;simple&amp;quot;. And yet powerful nonetheless. In fact, I never miss
an opportunity to do a bit of &lt;em&gt;evangelism&lt;/em&gt; with the
few people who are willing to listen to me.&lt;/p&gt;
&lt;div class="section" id="the-zen-of-python"&gt;
&lt;h2&gt;The Zen of Python&lt;/h2&gt;
&lt;p&gt;Before anything else, I think it's worth quoting Tim Peters and
&lt;a class="reference external" href="http://www.python.org/dev/peps/pep-0020/"&gt;PEP 20&lt;/a&gt;, which
make a very good introduction to the language and take the
form of an &lt;em&gt;easter egg&lt;/em&gt; built into python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I have the vague impression that this is what I've always tried to
do in PHP, and particularly in
&lt;a class="reference external" href="http://www.spiral-project.org"&gt;the Spiral framework&lt;/a&gt;, but by
adding these concepts in a layer on top of the language. Here, it is
the very &lt;em&gt;spirit&lt;/em&gt; of python, which means
that most python libraries follow these concepts. Isn't
life beautiful?&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="comment-commencer-et-par-ou"&gt;
&lt;h2&gt;How to get started, and where?&lt;/h2&gt;
&lt;p&gt;For my part, I started by reading a few interesting books and
articles, which make a good introduction
to the subject (the list is of course not exhaustive, and your
comments are welcome):&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="http://diveintopython.adrahon.org/"&gt;Dive into python&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://www.swaroopch.com/notes/Python_fr:Table_des_Matières"&gt;A byte of python&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://www.amazon.fr/Python-Petit-guide-lusage-développeur/dp/2100508830"&gt;Python: petit guide à l'usage du développeur agile&lt;/a&gt;
by &lt;a class="reference external" href="http://tarekziade.wordpress.com/"&gt;Tarek Ziadé&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://docs.python.org/index.html"&gt;The official python documentation&lt;/a&gt;,
of course!&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://video.pycon.fr/videos/pycon-fr-2009/"&gt;The PyCon FR 2009 videos&lt;/a&gt;!&lt;/li&gt;
&lt;li&gt;A bit of time, and an open python console :)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I also try to share as much as possible the resources I
find from time to time, whether
&lt;a class="reference external" href="http://www.twitter.com/ametaireau"&gt;via twitter&lt;/a&gt; or
&lt;a class="reference external" href="http://delicious.com/ametaireau"&gt;via my delicious account&lt;/a&gt;.
Go have a look at
&lt;a class="reference external" href="http://delicious.com/ametaireau/python"&gt;the python tag&lt;/a&gt; on my
profile, maybe you'll find some interesting things, who
knows!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="un-python-sexy"&gt;
&lt;h2&gt;A sexy python&lt;/h2&gt;
&lt;p&gt;A few features that should make your mouth
water:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="http://docs.python.org/library/stdtypes.html#comparisons"&gt;Chaining comparison operators&lt;/a&gt;
is possible (a &amp;lt; b &amp;lt; c in a condition)&lt;/li&gt;
&lt;li&gt;Multiple value assignment (you can write a,b,c
= 1,2,3 for example)&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://docs.python.org/tutorial/datastructures.html"&gt;Lists&lt;/a&gt;
are simple to manipulate!&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://docs.python.org/tutorial/datastructures.html#list-comprehensions"&gt;List comprehensions&lt;/a&gt;,
or how to do complex operations on lists in a
simple way.&lt;/li&gt;
&lt;li&gt;
&lt;a class="reference external" href="http://docs.python.org/library/doctest.html?highlight=doctest"&gt;Doctests&lt;/a&gt;:
or how to write tests directly in the documentation of your
classes, while documenting them with real examples.&lt;/li&gt;
&lt;li&gt;
&lt;a class="reference external" href="http://www.python.org/doc/essays/metaclasses/meta-vladimir.txt"&gt;Metaclasses&lt;/a&gt;,
or how to control the way classes are built&lt;/li&gt;
&lt;li&gt;Python is
&lt;a class="reference external" href="http://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic%20language%20and%20also%20a%20strongly%20typed%20language"&gt;a strongly, dynamically typed language&lt;/a&gt;:
that is what annoyed me with PHP, which is a weakly, dynamically
typed language.&lt;/li&gt;
&lt;/ul&gt;
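&lt;p&gt;Here is a short sketch touching most of the features listed above. All the names (double, Registry, Plugin) are made up for the illustration, and the chained comparison is written with "greater than" only to keep the markup of this page simple.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import doctest

# Multiple assignment, then a chained comparison in a single expression.
a, b, c = 1, 2, 3
in_order = c > b > a

# List comprehension: squares of the even numbers only.
squares = [n * n for n in range(6) if n % 2 == 0]


def double(value):
    """Return twice the value.

    >>> double(21)
    42
    """
    return value * 2


class Registry(type):
    """Metaclass that records the name of every class created with it."""

    classes = []

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        Registry.classes.append(name)


class Plugin(metaclass=Registry):
    pass


# Strong typing: str and int are never silently mixed.
try:
    "1" + 1
    mixed = True
except TypeError:
    mixed = False

# Run the example embedded in the docstring of double().
test = doctest.DocTestParser().get_doctest(
    double.__doc__, {"double": double}, "double", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(in_order, squares, Registry.classes, mixed, results)
```
&lt;/pre&gt;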
&lt;p&gt;You can also go and watch
&lt;a class="reference external" href="http://video.pycon.fr/videos/free/53/"&gt;the workshop given by Victor Stinner during PyCon FR 09&lt;/a&gt;.
Have fun!&lt;/p&gt;
&lt;/div&gt;
</summary></entry></feed>