mirror of
https://github.com/almet/notmyidea.git
synced 2025-04-29 03:52:38 +02:00
1167 lines
No EOL
95 KiB
XML
<?xml version="1.0" encoding="utf-8"?>
|
||
<feed xmlns="http://www.w3.org/2005/Atom"><title>Alexis' log</title><link href="http://blog.notmyidea.org" rel="alternate"></link><link href="http://blog.notmyidea.org/feeds/dev.atom.xml" rel="self"></link><id>http://blog.notmyidea.org</id><updated>2011-07-25T00:00:00+02:00</updated><entry><title>Pelican, 9 months later</title><link href="http://blog.notmyidea.org/pelican-9-months-later.html" rel="alternate"></link><updated>2011-07-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-07-25:/pelican-9-months-later.html/</id><summary type="html"><p>Back in October, I released <a class="reference external" href="http://docs.notmyidea.org/alexis/pelican">pelican</a>,
|
||
a little piece of code I wrote to power this weblog. I had simple needs: I wanted
|
||
to be able to use my text editor of choice (vim), a vcs (mercurial) and
|
||
restructured text. I started to write a really simple blog engine
|
||
in something like a hundred python lines and released it on github.</p>
|
||
<p>And people started contributing. I wasn't at all expecting to see people
|
||
interested in such a little piece of code, but it turned out that they were.
|
||
I refactored the code twice to help it evolve and eventually,
in 9 months, got 49 forks, 139 issues and 73 pull requests.</p>
|
||
<p><strong>Which is clearly awesome.</strong></p>
|
||
<p>I pulled in features such as translations, tag
clouds, integration with services such as twitter or piwik, and import
from dotclear and rss, fixed
a number of mistakes and improved the codebase a lot. This proved that
there are plenty of people willing to make better software just for
the fun of it.</p>
|
||
<p>Thank you, guys, you're why I like open source so much.</p>
|
||
</summary><category term="pelican"></category><category term="python"></category><category term="open source"></category><category term="nice story"></category></entry><entry><title>Using JPype to bridge python and Java</title><link href="http://blog.notmyidea.org/using-jpype-to-bridge-python-and-java.html" rel="alternate"></link><updated>2011-06-11T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-06-11:/using-jpype-to-bridge-python-and-java.html/</id><summary type="html"><p>Java provides some interesting libraries that have no exact equivalent in
|
||
python. In my case, the awesome boilerpipe library allows me to remove
|
||
uninteresting parts of HTML pages, like menus, footers and other &quot;boilerplate&quot;
|
||
contents.</p>
|
||
<p>Boilerpipe is written in Java. That leaves two solutions: using java from python or
reimplementing boilerpipe in python. I will let you guess which one I chose, meh.</p>
|
||
<p>JPype lets you bridge a python project with java libraries. It takes a different
approach from Jython: rather than reimplementing python in Java, both
languages interface at the VM level. This means you need to start a JVM
from your python script, but it does the job and stays fully compatible with
CPython and its C extensions.</p>
|
||
<div class="section" id="first-steps-with-jpype">
|
||
<h2>First steps with JPype</h2>
|
||
<p>Once JPype is installed (you'll have to hack a few files to make it integrate
seamlessly with your system) you can access java classes by doing something
like this:</p>
|
||
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">jpype</span>
|
||
<span class="n">jpype</span><span class="o">.</span><span class="n">startJVM</span><span class="p">(</span><span class="n">jpype</span><span class="o">.</span><span class="n">getDefaultJVMPath</span><span class="p">())</span>
|
||
|
||
<span class="c"># you can then access the basic java functions</span>
|
||
<span class="n">jpype</span><span class="o">.</span><span class="n">java</span><span class="o">.</span><span class="n">lang</span><span class="o">.</span><span class="n">System</span><span class="o">.</span><span class="n">out</span><span class="o">.</span><span class="n">println</span><span class="p">(</span><span class="s">&quot;hello world&quot;</span><span class="p">)</span>
|
||
|
||
<span class="c"># and you have to shut down the VM at the end</span>
|
||
<span class="n">jpype</span><span class="o">.</span><span class="n">shutdownJVM</span><span class="p">()</span>
|
||
</pre></div>
|
||
<p>Okay, now we have a hello world, but what we want is somewhat more complex.
We want to interact with java classes, so we will have to load them.</p>
|
||
</div>
|
||
<div class="section" id="interfacing-with-boilerpipe">
|
||
<h2>Interfacing with Boilerpipe</h2>
|
||
<p>To install boilerpipe, you just have to run an ant script:</p>
|
||
<pre class="literal-block">
|
||
$ cd boilerpipe
|
||
$ ant
|
||
</pre>
|
||
<p>Here is a simple example of how to use boilerpipe in Java, from their sources:</p>
|
||
<div class="highlight"><pre><span class="kn">package</span> <span class="n">de</span><span class="o">.</span><span class="na">l3s</span><span class="o">.</span><span class="na">boilerpipe</span><span class="o">.</span><span class="na">demo</span><span class="o">;</span>
|
||
<span class="kn">import</span> <span class="nn">java.net.URL</span><span class="o">;</span>
|
||
<span class="kn">import</span> <span class="nn">de.l3s.boilerpipe.extractors.ArticleExtractor</span><span class="o">;</span>
|
||
|
||
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Oneliner</span> <span class="o">{</span>
|
||
<span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="kd">final</span> <span class="n">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="n">Exception</span> <span class="o">{</span>
|
||
<span class="kd">final</span> <span class="n">URL</span> <span class="n">url</span> <span class="o">=</span> <span class="k">new</span> <span class="n">URL</span><span class="o">(</span><span class="s">&quot;http://notmyidea.org&quot;</span><span class="o">);</span>
|
||
<span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">ArticleExtractor</span><span class="o">.</span><span class="na">INSTANCE</span><span class="o">.</span><span class="na">getText</span><span class="o">(</span><span class="n">url</span><span class="o">));</span>
|
||
<span class="o">}</span>
|
||
<span class="o">}</span>
|
||
</pre></div>
|
||
<p>To run it:</p>
|
||
<div class="highlight"><pre><span class="nv">$ </span>javac -cp dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar src/demo/de/l3s/boilerpipe/demo/Oneliner.java
|
||
<span class="nv">$ </span>java -cp src/demo:dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar de.l3s.boilerpipe.demo.Oneliner
|
||
</pre></div>
|
||
<p>Yes, this is kind of ugly, sorry for your eyes.
Let's try something similar, but from python:</p>
|
||
<div class="highlight"><pre><span class="kn">import</span> <span class="nn">jpype</span>
|
||
|
||
<span class="c"># start the JVM with the right classpath</span>
|
||
<span class="n">classpath</span> <span class="o">=</span> <span class="s">&quot;dist/boilerpipe-1.1-dev.jar:lib/nekohtml-1.9.13.jar:lib/xerces-2.9.1.jar&quot;</span>
|
||
<span class="n">jpype</span><span class="o">.</span><span class="n">startJVM</span><span class="p">(</span><span class="n">jpype</span><span class="o">.</span><span class="n">getDefaultJVMPath</span><span class="p">(),</span> <span class="s">&quot;-Djava.class.path=</span><span class="si">%s</span><span class="s">&quot;</span> <span class="o">%</span> <span class="n">classpath</span><span class="p">)</span>
|
||
|
||
<span class="c"># get the Java classes we want to use</span>
|
||
<span class="n">DefaultExtractor</span> <span class="o">=</span> <span class="n">jpype</span><span class="o">.</span><span class="n">JPackage</span><span class="p">(</span><span class="s">&quot;de&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">l3s</span><span class="o">.</span><span class="n">boilerpipe</span><span class="o">.</span><span class="n">extractors</span><span class="o">.</span><span class="n">DefaultExtractor</span>
|
||
|
||
<span class="c"># call them !</span>
|
||
<span class="nb">print</span><span class="p">(</span><span class="n">DefaultExtractor</span><span class="o">.</span><span class="n">INSTANCE</span><span class="o">.</span><span class="n">getText</span><span class="p">(</span><span class="n">jpype</span><span class="o">.</span><span class="n">java</span><span class="o">.</span><span class="n">net</span><span class="o">.</span><span class="n">URL</span><span class="p">(</span><span class="s">&quot;http://blog.notmyidea.org&quot;</span><span class="p">)))</span>
|
||
</pre></div>
|
||
<p>And you get what you want.</p>
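<p>One caveat with the snippet above: the classpath is joined with <code>:</code>, which is the separator on Unix-like systems only. Here is a minimal sketch, assuming the same three jars, of building the option with <code>os.pathsep</code> instead. The helper name <code>jvm_classpath_option</code> is mine and is not part of JPype.</p>

```python
import os


def jvm_classpath_option(jars):
    """Build the -Djava.class.path option from a list of jar paths."""
    # os.pathsep is ":" on Unix-like systems and ";" on Windows
    return "-Djava.class.path=" + os.pathsep.join(jars)


jars = [
    "dist/boilerpipe-1.1-dev.jar",
    "lib/nekohtml-1.9.13.jar",
    "lib/xerces-2.9.1.jar",
]
option = jvm_classpath_option(jars)
print(option)
```

<p>The resulting string can be handed to <code>jpype.startJVM</code> in place of the hand-written <code>-Djava.class.path</code> argument.</p>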
|
||
<p>I must say I didn't think it would work so easily. This will allow me to
extract text content from URLs and remove the <em>boilerplate</em> text easily
for infuse (my master's thesis project), without having to write java code, nice!</p>
|
||
</div>
|
||
</summary><category term="python"></category><category term="java"></category></entry><entry><title>A helping hand for my dissertation!</title><link href="http://blog.notmyidea.org/un-coup-de-main-pour-mon-memoire.html" rel="alternate"></link><updated>2011-05-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-05-25:/un-coup-de-main-pour-mon-memoire.html/</id><summary type="html"><p>That's it, the end is near. THE END. The end of my studies, and the beginning of everything else.
In the meantime I'm working on my master's dissertation and I could use a little
help.</p>
<p>My dissertation is about recommender systems. For those who know
last.fm, I'm doing something similar, but for websites: based on
what you visit every day and how you visit it (at what
times, from which location, etc.), I want to suggest links
that may interest you, relying on the opinions of people whose
profiles are similar to yours.</p>
<p>The project is far from finished, but the first step is to collect
browsing data, ideally a lot of browsing data. So if
you can lend me a hand I will be eternally
grateful (for those pretending not to understand, read &quot;drinks
are on me&quot;).</p>
<p>I've built a small website (in English) that sums up the concept and
lets you sign up and download a firefox plugin that will send me
information about the sites you visit (if you usually use
chrome, you could consider switching to firefox4 for the next two
months to help me out). The plugin can be disabled
with a single click if you want to keep your private life private ;-)</p>
<p>The site is over here: <a class="reference external" href="http://infuse.notmyidea.org">http://infuse.notmyidea.org</a>. Once you have downloaded the plugin
and created your account, enter your credentials in the
plugin, and that's all!</p>
<p>Thanks in advance for your generosity! I will probably collect data over the next 2
months and then analyse it properly.</p>
<p>Thank you for your help!</p>
|
||
</summary></entry><entry><title>Analyse users' browsing context to build up a web recommender</title><link href="http://blog.notmyidea.org/analyse-users-browsing-context-to-build-up-a-web-recommender.html" rel="alternate"></link><updated>2011-04-01T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-04-01:/analyse-users-browsing-context-to-build-up-a-web-recommender.html/</id><summary type="html"><p>No, this is not an April Fools' joke ;)</p>
|
||
<p>Wow, it's been a long time. My year in Oxford is going really well. I realized
a few days ago that the end of the year is approaching really quickly.
Exams are coming in a month or so and then I'll be working full time on my dissertation topic.</p>
<p>When I learned we would have about 6 months to work on something, I first thought
about doing something packaging-related, but finally decided to start something
new. After all, this is a good time to learn.</p>
|
||
<p>I have been impressed by the <a class="reference external" href="http://last.fm">last.fm</a>
recommender system for a long time. They have been <em>scrobbling</em> the music I listen to for something
like 5 years now and the recommendations they make are really nice and
accurate (I discovered <strong>a lot</strong> of great artists by listening to the
&quot;neighbour radio&quot;). By the way, <a class="reference external" href="http://lastfm.com/user/akounet/">here is</a>
my lastfm account.</p>
<p>So I decided to work on recommender systems, to better understand what they are
about.</p>
<p>Recommender systems are usually used to increase the sales of products
(like Amazon.com does), which is not really what I'm looking for (those who
know me a bit know I'm kind of sick of all this consumerism going on).</p>
|
||
<p>Actually, the simplest thing I thought of was the web: I browse it almost
every day and new content appears each time. I've stopped following <a class="reference external" href="https://bitbucket.org/bruno/aspirator/">my feed
reader</a> because of the
information overload, and drastically reduced the number of people I follow <a class="reference external" href="http://twitter.com/ametaireau/">on
twitter</a>.</p>
|
||
<p>Too much information kills the information.</p>
|
||
<p>You have probably guessed my dissertation topic: a recommender system for
the web. Well, such recommender systems already exist, so I will try to add contextual
information to them: you're probably not interested in the same topics at different
times of the day, or depending on the computer you're using. We can also
probably make good use of the way you browse to group the content
you're browsing (or even use the great firefox4 tab group feature).</p>
<p>There are also serious concerns to address about users' privacy.</p>
<p>Here is my proposal (copy/pasted from the one I had to write for my master's):</p>
|
||
<div class="section" id="introduction-and-rationale">
|
||
<h2>Introduction and rationale</h2>
|
||
<p>Nowadays, people surf the web more and more often. New web pages are created
each day, so the amount of information to retrieve grows as time
passes. Users browse the web in different contexts, from finding cooking
recipes to reading technical articles.</p>
<p>A lot of people share the same interest in various topics, and the quantity of
information is such that it's really hard to triage it efficiently without
spending hours doing so, firstly because of the huge quantity of information
but also because triage is specific to each person. However, this
triage can be made easier by fetching the browsing information of all
individuals and putting it in perspective.</p>
|
||
<p>Machine learning is a branch of Artificial Intelligence (AI) which deals with how
|
||
a program can learn from data. Recommendation systems are a particular
|
||
application area of machine learning which is able to recommend things (links
|
||
in our case) to the users, given a particular database containing the previous
|
||
choices users have made.</p>
|
||
<p>This browsing information is currently available in browsers. Even if it is not
in a very usable format, it is possible to transform it into something useful.
This information gold mine is just waiting to be used. However, it is not as simple as
it may seem at first: it is important to take into account the context
the user is in while browsing links. For instance, it's more likely that during
the day, a computer scientist will browse computing-related links, and that during
the evening, they will browse cooking recipes or something else.</p>
<p>Page contents are also interesting to analyse, because they are what people
browse and what actually contains the most interesting part of the information.
The raw data extracted from the browsing can then be translated into
something more useful (namely tags, type of resource, visit frequency,
navigation context etc.)</p>
|
||
<p>The goal of this dissertation is to create a recommender system for web links,
|
||
including this context information.</p>
|
||
<p>At the end of the dissertation, different pieces of software will be provided,
|
||
from raw data collection from the browser to a recommendation system.</p>
|
||
</div>
|
||
<div class="section" id="background-review">
|
||
<h2>Background Review</h2>
|
||
<p>This dissertation is mainly about data extraction, analysis and recommendation
systems. Two different research areas can be isolated: data preprocessing and
information filtering.</p>
<p>The first step in order to make recommendations is to gather some data. The
more data we have available, the better (T. Segaran, 2007). This data can
be retrieved in various ways; one of them is to get it directly from users'
browsers.</p>
|
||
<div class="section" id="data-preparation-and-extraction">
|
||
<h3>Data preparation and extraction</h3>
|
||
<p>The data gathered from browsers is basically URLs and additional information
about the context of the navigation. There is clearly a need to extract more
information about the meaning of the data the user is browsing, starting with the
content of the web pages.</p>
<p>Because the information provided on the current Web is not meant to be read by
machines (T. Berners Lee, 2001), tools are needed to extract meaning from
web pages. The information needs to be preprocessed before being stored in a
machine-readable format that makes recommendations possible (Choochart et Al, 2004).</p>
<p>Data preparation is composed of two steps: cleaning and structuring
(Castellano et Al, 2007). Raw data can contain a lot of unneeded text
(such as menus, headers etc.) and needs to be cleaned before being stored.
Multiple techniques can be used here; they belong to boilerplate removal and
full text extraction (Kohlschütter et Al, 2010).</p>
<p>Then comes structuring the information: category and type of content (news, blog, wiki)
can be extracted from raw data. This kind of information is not clearly defined
by HTML pages, so tools are needed to recognise it.</p>
|
||
<p>Some context-related information can also be inferred for each resource. It can range
from the visit frequency to the navigation group the user was in while
browsing. It is also possible to determine whether the user &quot;liked&quot; a resource, and
to derive a mark for it, which can be used by information filtering at a later
step (T. Segaran, 2007).</p>
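<p>To make the idea of deriving a mark from browsing behaviour a bit more concrete, here is a minimal sketch that maps a visit count to a mark between 0 and 5. The logarithmic scale, the function name <code>implicit_mark</code> and the <code>saturation</code> threshold are all assumptions of mine for illustration. They come neither from the proposal nor from the cited book.</p>

```python
import math


def implicit_mark(visit_count, max_mark=5.0, saturation=20):
    """Map a visit count to a mark between 0 and max_mark.

    The scale is logarithmic: the first few visits matter more than
    the hundredth one, and the mark is capped once `saturation`
    visits are reached (both choices are illustrative assumptions).
    """
    visits = max(visit_count, 0)
    scaled = math.log1p(visits) / math.log1p(saturation)
    return round(min(scaled, 1.0) * max_mark, 2)


for count in (0, 1, 5, 20, 100):
    print(count, implicit_mark(count))
```

<p>A real system would probably mix in other signals as well (time spent on the page, bookmarks and so on), but the shape of the transformation stays the same.</p>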
|
||
<p>At this stage, structuring the data is required. Storing this kind of
information in an RDBMS can be a bit tedious and requires complex queries to get
the data back in a usable format. Graph databases can play a major role in
simplifying information storage and querying.</p>
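<p>As an illustration of why a graph model fits this kind of data, here is a toy in-memory graph with typed edges. It is only a sketch: the class <code>TinyGraph</code>, the relation names <code>VISITED</code> and <code>TAGGED</code> and the sample nodes are hypothetical, and a real graph database offers far more than this.</p>

```python
from collections import defaultdict


class TinyGraph:
    """A toy in-memory graph: nodes are strings, edges are typed."""

    def __init__(self):
        # maps (source node, relation) to the set of target nodes
        self._edges = defaultdict(set)

    def add_edge(self, source, relation, target):
        self._edges[(source, relation)].add(target)

    def neighbours(self, node, relation):
        # return a copy so callers cannot mutate the internal state
        return set(self._edges.get((node, relation), set()))


graph = TinyGraph()
graph.add_edge("user:alice", "VISITED", "http://blog.notmyidea.org")
graph.add_edge("user:alice", "VISITED", "http://last.fm")
graph.add_edge("http://blog.notmyidea.org", "TAGGED", "python")
graph.add_edge("http://last.fm", "TAGGED", "music")

# one hop from the user to pages, a second hop from pages to tags
tags = set()
for page in graph.neighbours("user:alice", "VISITED"):
    tags |= graph.neighbours(page, "TAGGED")
print(sorted(tags))
```

<p>The "which tags does this user read about" question is a two-hop traversal here, whereas a relational schema would need a join across several tables.</p>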
|
||
</div>
|
||
<div class="section" id="information-filtering">
|
||
<h3>Information filtering</h3>
|
||
<p>To filter the information, three techniques can be used (Balabanovic et
|
||
Al, 1997):</p>
|
||
<ul class="simple">
|
||
<li>The content-based approach states that if a user has liked something in the
past, they are more likely to like similar things in the future. So it's about
establishing a profile for the user and comparing new items against it.</li>
<li>The collaborative approach will rather recommend items that other similar users
have liked. This approach considers only the relationship between users, and
not the profile of the user we are making recommendations to.</li>
<li>The hybrid approach, which appeared recently, combines both of the previous
approaches, giving recommendations when items score high against the user's
profile, or if a similar user already liked them.</li>
|
||
</ul>
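<p>The collaborative approach described above can be sketched in a few lines. This is a minimal user-based version that assumes marks are stored as plain dictionaries and that users are compared with cosine similarity. The data, the function names and the choice of similarity measure are illustrative assumptions, not the method of the cited papers.</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {item: mark} dictionaries."""
    shared = set(a).intersection(b)
    dot = sum(a[item] * b[item] for item in shared)
    if not dot:
        return 0.0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(target, marks, top_n=3):
    """Rank items the target has not seen, weighted by user similarity."""
    scores = {}
    for other, other_marks in marks.items():
        if other == target:
            continue
        similarity = cosine_similarity(marks[target], other_marks)
        if not similarity:
            continue  # users with nothing in common bring no signal
        for item, mark in other_marks.items():
            if item not in marks[target]:
                scores[item] = scores.get(item, 0.0) + similarity * mark
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


marks = {
    "alice": {"python.org": 5, "last.fm": 4},
    "bob": {"python.org": 4, "last.fm": 5, "pypi.org": 5},
    "carol": {"cooking.example": 5, "gardening.example": 4},
}
print(recommend("alice", marks))
```

<p>Here alice and bob share two well-marked sites, so bob's extra site is suggested to alice, while carol has nothing in common with her and is ignored. A content-based variant would compare items against alice's own profile instead of comparing users.</p>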
|
||
<p>Grouping is also something to consider at this stage (G. Myatt, 2007).
Because we are dealing with a huge amount of data, it can be useful to detect groups
of data that fit together. Data clustering is able to find such groups (T.
Segaran, 2007).</p>
|
||
<p>References:</p>
|
||
<ul class="simple">
|
||
<li>Balabanović, M., &amp; Shoham, Y. (1997). Fab: content-based, collaborative
|
||
recommendation. Communications of the ACM, 40(3), 66–72. ACM.
|
||
Retrieved March 1, 2011, from <a class="reference external" href="http://portal.acm.org/citation.cfm?id=245108.245124">http://portal.acm.org/citation.cfm?id=245108.245124</a>.</li>
|
||
<li>Berners-Lee, T., Hendler, J., &amp; Lassila, O. (2001).
|
||
The semantic web: Scientific american. Scientific American, 284(5), 34–43.
|
||
Retrieved November 21, 2010, from <a class="reference external" href="http://www.citeulike.org/group/222/article/1176986">http://www.citeulike.org/group/222/article/1176986</a>.</li>
|
||
<li>Castellano, G., Fanelli, A., &amp; Torsello, M. (2007).
|
||
LODAP: a LOg DAta Preprocessor for mining Web browsing patterns. Proceedings of the 6th Conference on 6th WSEAS Int. Conf. on Artificial Intelligence, Knowledge Engineering and Data Bases-Volume 6 (p. 12–17). World Scientific and Engineering Academy and Society (WSEAS). Retrieved March 8, 2011, from <a class="reference external" href="http://portal.acm.org/citation.cfm?id=1348485.1348488">http://portal.acm.org/citation.cfm?id=1348485.1348488</a>.</li>
|
||
<li>Kohlschütter, C., Fankhauser, P., &amp; Nejdl, W. (2010). Boilerplate detection using shallow text features. Proceedings of the third ACM international conference on Web search and data mining (p. 441–450). ACM. Retrieved March 8, 2011, from <a class="reference external" href="http://portal.acm.org/citation.cfm?id=1718542">http://portal.acm.org/citation.cfm?id=1718542</a>.</li>
|
||
<li>Myatt, G. J. (2007). Making Sense of Data: A Practical Guide to Exploratory
|
||
Data Analysis and Data Mining.</li>
|
||
<li>Segaran, T. (2007). Collective Intelligence.</li>
|
||
</ul>
|
||
</div>
|
||
</div>
|
||
<div class="section" id="privacy">
|
||
<h2>Privacy</h2>
|
||
<p>The first thing that comes to people's minds when it comes to processing their
browsing data is privacy. People don't want to be stalked. That's perfectly
understandable, and I don't either.</p>
<p>But such a system doesn't have to deal with people's identities. It's entirely
possible to process completely anonymous data, and that's probably what I'm
going to do.</p>
<p>By the way, if you have interesting thoughts about that, or if you know of
projects that seem related, fire away in the comments!</p>
|
||
</div>
|
||
<div class="section" id="what-s-the-plan">
|
||
<h2>What's the plan ?</h2>
|
||
<p>There are a lot of different things to explore, especially because I'm
a complete novice in that field.</p>
|
||
<ul class="simple">
|
||
<li>I want to develop a firefox plugin to extract the browsing information
(I still need to work out exactly which kind of information to retrieve). The
idea is to provide some <em>raw</em> browsing data, and then to transform it and
store it in the best possible way.</li>
<li>Analyse how to store the information in a graph database. What are the
different methods to store this data and to visualize the relationships
between different pieces of data? How can I define the different contexts,
and add that information to the db?</li>
<li>Process the data using well-known recommendation algorithms. Compare the
results and assess their value.</li>
|
||
</ul>
|
||
<p>There is plenty of stuff I want to try during this experimentation:</p>
|
||
<ul class="simple">
|
||
<li>I want to try using Gephi to visualize the connections between the links
and the contexts</li>
<li>Try using graph databases such as Neo4j</li>
<li>Have a deeper look at tools such as scikit.learn (a machine learning
toolkit in python)</li>
<li>Analyse web pages in order to categorize them, and process their
contents as well, to do some keyword-based classification.</li>
|
||
</ul>
|
||
<p>Lots of work ahead, yay!</p>
|
||
</div>
|
||
</summary><category term="recommendations"></category><category term="browsers"></category><category term="users"></category></entry><entry><title>Wrap up of the distutils2 paris' sprint</title><link href="http://blog.notmyidea.org/wrap-up-of-the-distutils2-paris-sprint.html" rel="alternate"></link><updated>2011-02-08T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-02-08:/wrap-up-of-the-distutils2-paris-sprint.html/</id><summary type="html"><p>Finally, thanks to a bunch of people that helped me to pay my train and bus
|
||
tickets, I've made it to paris for the distutils2 sprint.</p>
|
||
<p>A bit more than 10 people came to the sprint, and it was
very productive. Here's a taste of what we've been working on:</p>
|
||
<ul class="simple">
|
||
<li>the <cite>datafiles</cite>, a way to specify and to handle the installation of files which
|
||
are not python-related (pictures, manpages and so on).</li>
|
||
<li><cite>mkgcfg</cite>, a tool to help you to create a setup.cfg in minutes (and with funny
|
||
examples)</li>
|
||
<li>converters from setup.py scripts. We now have a piece of code which
reads your current <cite>setup.py</cite> file and fills in some fields in the <cite>setup.cfg</cite>
for you.</li>
<li>a compatibility layer for distutils1, so it can read the <cite>setup.cfg</cite> you will
write for distutils2 :-)</li>
<li>the uninstaller, so it's now possible to uninstall what has been installed
by distutils2 (see PEP 376)</li>
|
||
<li>the installer, and the setuptools compatibility layer, which will allow you
|
||
to rely on setuptools' based distributions (and there are plenty of them!)</li>
|
||
<li>The compilers, so they are more flexible than they were. Since that's an
obscure part of the code for distutils2 committers (it comes directly from the
distutils1 ages), having some people who understood the issues involved was
a must.</li>
|
||
</ul>
|
||
<p>Some people have also tried to port their packaging from distutils1 to
|
||
distutils2. They have spotted a number of bugs and made some improvements
|
||
to the code, to make it more friendly to use.</p>
|
||
<p>I'm really pleased to see how newcomers went through the code, and started
hacking so fast. I must say it wasn't the case when we started to work on
distutils1 so that's a very good point: people can now hack on the code quicker
than they could before.</p>
<p>Some of the features here are not <em>completely</em> finished yet, but they are in the
pipeline, and will be ready for a release (hopefully) at the end of the week.</p>
|
||
<p>Big thanks to logilab for hosting (and sponsoring my train ticket) and
|
||
providing us food, and to bearstech for providing some money for breakfast and
|
||
bears^Wbeers.</p>
|
||
<p>Again, a big thanks to all the people who gave me money to pay for the transport,
I really wasn't expecting such a thing to happen :-)</p>
|
||
</summary></entry><entry><title>PyPI on CouchDB</title><link href="http://blog.notmyidea.org/pypi-on-couchdb.html" rel="alternate"></link><updated>2011-01-20T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-01-20:/pypi-on-couchdb.html/</id><summary type="html"><p>For now, there are two ways to retrieve data from PyPI (the Python Package
Index). You can rely either on XML/RPC or on the &quot;simple&quot; API. The simple
API is not as simple to use as the name suggests, and has several
drawbacks.</p>
<p>Basically, if you want to use information coming from the simple API, you will
have to parse web pages manually and extract information using some black
voodoo magic. Sadly, magic has a price, and it's sometimes impossible to get
exactly the information you want from this index. That's the technique
currently used by distutils2, setuptools and pip.</p>
<p>On the other side, while XML/RPC works fine, it requires extra work
from the python servers each time you request something, which can lead to
some outages from time to time. Also, it's important to point out that, even if
PyPI has a mirroring infrastructure, it's only for the so-called <em>simple</em> API,
and not for XML/RPC.</p>
|
||
<div class="section" id="couchdb">
|
||
<h2>CouchDB</h2>
|
||
<p>Here comes CouchDB. CouchDB is a document-oriented database that
knows how to speak REST and JSON. It's easy to use, and provides
a replication mechanism out of the box.</p>
|
||
</div>
|
||
<div class="section" id="so-what">
|
||
<h2>So, what ?</h2>
|
||
<p>Hmm, I'm sure you got it. I've written a piece of software to link information from
PyPI to a CouchDB instance. Then you can replicate the whole PyPI index with only
one HTTP request on the CouchDB server. You can also access the information
from the index directly using a REST API, speaking json. Handy.</p>
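<p>For the curious, that single HTTP request is a <code>POST</code> on CouchDB's <code>/_replicate</code> endpoint, with a JSON body naming a source and a target. Here is a minimal sketch that only builds the request without sending it. The local instance at <code>http://localhost:5984</code>, the target database name <code>pypi</code> and the helper name are assumptions of mine.</p>

```python
import json
import urllib.request


def build_replication_request(local_couch, source_db, target_db):
    """Prepare (without sending) a POST to CouchDB's /_replicate endpoint."""
    body = json.dumps({
        "source": source_db,
        "target": target_db,
        "create_target": True,  # create the local database if missing
    }).encode("utf-8")
    return urllib.request.Request(
        local_couch.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_replication_request(
    "http://localhost:5984",  # hypothetical local CouchDB instance
    "http://couchdb.notmyidea.org/pypi",
    "pypi",
)
print(request.get_method(), request.full_url)
# to actually trigger the replication: urllib.request.urlopen(request)
```

<p>Keeping the network call out of the helper makes it easy to inspect the request first, which is handy since a full replication can take a while.</p>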
|
||
<p>So PyPIonCouch uses the PyPI XML/RPC API to get data from PyPI, and
generates records in the CouchDB instance.</p>
<p>The final goal is to avoid relying on this &quot;simple&quot; API, and rely on a REST
interface instead. I have set up a couchdb server on my server, which is
available at <a class="reference external" href="http://couchdb.notmyidea.org/_utils/database.html?pypi">http://couchdb.notmyidea.org/_utils/database.html?pypi</a>.</p>
|
||
<p>There is not a lot to
|
||
see there for now, but I've done the first import from PyPI yesterday and all
|
||
went fine: it's possible to access the metadata of all PyPI projects via a REST
|
||
interface. Next step is to write a client for this REST interface in
|
||
distutils2.</p>
|
||
</div>
|
||
<div class="section" id="example">
|
||
<h2>Example</h2>
|
||
<p>For now, you can use pypioncouch via the command line, or via the python API.</p>
|
||
<div class="section" id="using-the-command-line">
|
||
<h3>Using the command line</h3>
|
||
<p>You can do something like this for a full import. This <strong>will</strong> take a long time,
because it fetches all the projects on pypi and imports their metadata:</p>
|
||
<pre class="literal-block">
|
||
$ pypioncouch --fullimport http://your.couchdb.instance/
|
||
</pre>
|
||
<p>If you already have the data on your couchdb instance, you can just update it
with the latest information from pypi. <strong>However, I recommend simply replicating
the principal node, hosted at http://couchdb.notmyidea.org/pypi/</strong>, to avoid
duplicating nodes:</p>
|
||
<pre class="literal-block">
|
||
$ pypioncouch --update http://your.couchdb.instance/
|
||
</pre>
|
||
&lt;p&gt;The principal node is updated once a day for now. I'll see whether that's
enough, and adjust over time.&lt;/p&gt;
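&lt;p&gt;Replicating the principal node boils down to one POST on CouchDB's _replicate endpoint. Here is a minimal Python 3 sketch that only builds the request, without sending it. The function name is mine, the local database is assumed to be called pypi and to exist already, and a real server may also require credentials:&lt;/p&gt;

```python
import json
import urllib.request


def build_replication_request(local_couchdb, source_db, target_db="pypi"):
    """Build (but do not send) the POST asking CouchDB to replicate a database.

    local_couchdb: base URL of the CouchDB server that will pull the data.
    source_db: URL of the remote database to copy from.
    target_db: name of the local database to copy into (assumed to exist).
    """
    body = json.dumps({"source": source_db, "target": target_db})
    return urllib.request.Request(
        url=local_couchdb.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_replication_request(
    "http://your.couchdb.instance/",
    "http://couchdb.notmyidea.org/pypi/",
)
# Sending is left out on purpose; it would be:
# urllib.request.urlopen(request)
```

&lt;p&gt;Keeping the request construction separate from the network call makes it easy to inspect the URL and JSON body before anything is sent.&lt;/p&gt;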
|
||
</div>
|
||
<div class="section" id="using-the-python-api">
|
||
<h3>Using the python API</h3>
|
||
<p>You can also use the python API to interact with pypioncouch:</p>
|
||
<pre class="literal-block">
|
||
&gt;&gt;&gt; from pypioncouch import XmlRpcImporter, import_all, update
|
||
&amp;gt;&amp;gt;&amp;gt; import_all()
|
||
&gt;&gt;&gt; update()
|
||
</pre>
|
||
</div>
|
||
</div>
|
||
<div class="section" id="what-s-next">
|
||
<h2>What's next ?</h2>
|
||
<p>I want to make a couchapp, in order to navigate PyPI easily. Here are some of
|
||
the features I want to propose:</p>
|
||
<ul class="simple">
|
||
<li>List all the available projects</li>
|
||
<li>List all the projects, filtered by specifiers</li>
|
||
<li>List all the projects by author/maintainer</li>
|
||
<li>List all the projects by keywords</li>
|
||
<li>Page for each project.</li>
|
||
&lt;li&gt;Provide an equivalent of the PyPI &amp;quot;Simple&amp;quot; API. Even though I want to replace it, I
think it would make mirrors really easy to set up, thanks to the out-of-the-box
CouchDB replication.&lt;/li&gt;
|
||
</ul>
|
||
<p>I also still need to polish the import mechanism, so I can directly store in
|
||
couchdb:</p>
|
||
<ul class="simple">
|
||
<li>The OPML files for each project</li>
|
||
&lt;li&gt;The upload_time in a CouchDB-friendly format (a list of ints)&lt;/li&gt;
&lt;li&gt;The tags as lists (currently it's only a string separated by spaces)&lt;/li&gt;
|
||
</ul>
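&lt;p&gt;The last two points can be sketched in a few lines of Python 3. This is only an illustration, not the pypioncouch code: the function name and the &amp;quot;keywords&amp;quot; field name are assumptions, and upload_time is assumed to already be a datetime object:&lt;/p&gt;

```python
from datetime import datetime


def normalize_release(metadata):
    """Return a copy of a release's metadata in a CouchDB-friendly shape."""
    doc = dict(metadata)
    # Tags arrive as one space-separated string; store them as a list.
    doc["keywords"] = (doc.get("keywords") or "").split()
    # Store the upload time as a list of ints, which sorts well in views.
    uploaded = doc.get("upload_time")
    if isinstance(uploaded, datetime):
        doc["upload_time"] = [uploaded.year, uploaded.month, uploaded.day,
                              uploaded.hour, uploaded.minute, uploaded.second]
    return doc


release = {
    "name": "EggsAndSpam",
    "keywords": "eggs spam breakfast",
    "upload_time": datetime(2011, 1, 20, 14, 30, 5),
}
doc = normalize_release(release)
```

&lt;p&gt;Working on a copy keeps the metadata fetched from PyPI untouched, which helps when comparing it with what ends up in the database.&lt;/p&gt;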
|
||
&lt;p&gt;The work I've done so far is available at
&lt;a class="reference external" href="https://bitbucket.org/ametaireau/pypioncouch/"&gt;https://bitbucket.org/ametaireau/pypioncouch/&lt;/a&gt;. Keep in mind that it's still
a work in progress, and everything can break at any time. However, any feedback
will be appreciated!&lt;/p&gt;
|
||
</div>
|
||
</summary></entry><entry><title>Help me to go to the distutils2 paris' sprint</title><link href="http://blog.notmyidea.org/help-me-to-go-to-the-distutils2-paris-sprint.html" rel="alternate"></link><updated>2011-01-15T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2011-01-15:/help-me-to-go-to-the-distutils2-paris-sprint.html/</id><summary type="html"><p><strong>Edit: Thanks to logilab and some amazing people, I can make it to paris for the
|
||
sprint. Many thanks to them for the support!</strong></p>
|
||
&lt;p&gt;There will be a distutils2 sprint from the 27th to the 30th of January, thanks
to Logilab, which will host the event.&lt;/p&gt;
|
||
&lt;p&gt;You can find more information about the sprint on the wiki page of the event
|
||
(<a class="reference external" href="http://wiki.python.org/moin/Distutils/SprintParis">http://wiki.python.org/moin/Distutils/SprintParis</a>).</p>
|
||
&lt;p&gt;I really want to go there, but I'm unfortunately stuck in the UK for money reasons.
The cheapest round trip I've found costs about £80, which I can't afford.
Following some advice on #distutils, I've set up a ChipIn account for that, so
if some people want to help me get there, they can give me some
money that way.&lt;/p&gt;
|
||
<p>I'll probably work on the installer (to support old distutils and
|
||
setuptools distributions) and on the uninstaller (depending on the first
|
||
task). If I can't make it to Paris, I'll hang around on IRC to help
when needed.&lt;/p&gt;
|
||
<p>If you want to contribute some money to help me go there, feel free to use this
|
||
chipin page: <a class="reference external" href="http://ametaireau.chipin.com/distutils2-sprint-in-paris">http://ametaireau.chipin.com/distutils2-sprint-in-paris</a></p>
|
||
<p>Thanks for your support !</p>
|
||
</summary></entry><entry><title>How to reboot your bebox using the CLI</title><link href="http://blog.notmyidea.org/how-to-reboot-your-bebox-using-the-cli.html" rel="alternate"></link><updated>2010-10-21T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-21:/how-to-reboot-your-bebox-using-the-cli.html/</id><summary type="html">&lt;p&gt;I have an internet connection which, for some obscure reason, tends to be very
slow from time to time. After rebooting the box (yes, that's a drastic solution),
everything seems to work fine again.&lt;/p&gt;
|
||
<div class="section" id="edit-using-grep">
|
||
<h2>EDIT : Using grep</h2>
|
||
&lt;p&gt;After a bit of thought, it's also really easy to do directly with the
command line tools curl, grep and tail (but much harder to read).&lt;/p&gt;
|
||
<div class="highlight"><pre>curl -X POST -u joel:joel http://bebox.config/cgi/b/info/restart/<span class="se">\?</span>be<span class="se">\=</span>0<span class="se">\&amp;</span>l0<span class="se">\=</span>1<span class="se">\&amp;</span>l1<span class="se">\=</span>0<span class="se">\&amp;</span>tid<span class="se">\=</span>RESTART -d <span class="s2">&quot;0=17&amp;2=`curl -u joel:joel http://bebox.config/cgi/b/info/restart/\?be\=0\&amp;l0\=1\&amp;l1\=0\&amp;tid\=RESTART | grep -o &quot;</span><span class="nv">name</span><span class="o">=</span><span class="s1">&#39;2&#39;</span> <span class="nv">value</span><span class="o">=</span><span class="err">&#39;</span><span class="o">[</span>0-9<span class="o">]</span><span class="se">\+</span><span class="s2">&quot; | grep -o &quot;</span><span class="o">[</span>0-9<span class="o">]</span><span class="se">\+</span><span class="s2">&quot; | tail -n 1`&amp;1&quot;</span>
|
||
</pre></div>
|
||
</div>
|
||
<div class="section" id="the-python-version">
|
||
<h2>The Python version</h2>
|
||
<p>Well, that's not the optimal solution, that's a bit &quot;gruik&quot;, but it works.</p>
|
||
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib2&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;

&lt;span class="n"&gt;REBOOT_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;/b/info/restart/?be=0&amp;amp;l0=1&amp;amp;l1=0&amp;amp;tid=RESTART&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;BOX_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;http://bebox.config/cgi&amp;#39;&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;open_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c"&gt;# Install a global opener that answers HTTP basic auth challenges&lt;/span&gt;
    &lt;span class="n"&gt;passman&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPPasswordMgrWithDefaultRealm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;passman&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_password&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;authhandler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HTTPBasicAuthHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;passman&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;opener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;build_opener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;authhandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;install_opener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;opener&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reboot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c"&gt;# Fetch the restart page, extract its token, then post it back&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;open_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;name\=&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;2&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39; value=&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;([0-9]+)&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;0=17&amp;amp;2=&lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;1&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&amp;quot;&amp;quot;Reboot your bebox !&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;username&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;username&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;password&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;password&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;# Optional third positional argument, falling back to BOX_URL&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;boxurl&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;?&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BOX_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;Base box url. Default is &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;BOX_URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="c"&gt;# Plain concatenation: urljoin() would drop the /cgi prefix of the base url&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;boxurl&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rstrip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;REBOOT_URL&lt;/span&gt;
    &lt;span class="n"&gt;reboot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
|
||
</pre></div>
|
||
</div>
|
||
</summary></entry><entry><title>Dynamically change your gnome desktop wallpaper</title><link href="http://blog.notmyidea.org/dynamically-change-your-gnome-desktop-wallpaper.html" rel="alternate"></link><updated>2010-10-11T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-11:/dynamically-change-your-gnome-desktop-wallpaper.html/</id><summary type="html">&lt;p&gt;In GNOME, you can use an XML file to get a dynamic wallpaper.
It's not so easy, though: you can't just say &amp;quot;use the pictures in this folder&amp;quot;.&lt;/p&gt;
|
||
&lt;p&gt;You can have a look at the git repository if you want: &lt;a class="reference external" href="http://github.com/ametaireau/gnome-background-generator"&gt;http://github.com/ametaireau/gnome-background-generator&lt;/a&gt;&lt;/p&gt;
|
||
&lt;p&gt;Some time ago, I wrote a little Python script to make that easier, and you can now
use it too. It's named &amp;quot;gnome-background-generator&amp;quot;, and you can install it via
pip for instance.&lt;/p&gt;
|
||
<div class="highlight"><pre>$ pip install gnome-background-generator
|
||
</pre></div>
|
||
&lt;p&gt;Then you just have to use it this way:&lt;/p&gt;
|
||
<div class="highlight"><pre>$ gnome-background-generator -p ~/Images/walls -s
|
||
/home/alexis/Images/walls/dynamic-wallpaper.xml generated
|
||
</pre></div>
|
||
&lt;p&gt;Here is an extract of the &lt;cite&gt;--help&lt;/cite&gt;:&lt;/p&gt;
|
||
&lt;div class="highlight"&gt;&lt;pre&gt;$ gnome-background-generator --help
usage: gnome-background-generator [-h] [-p PATH] [-o OUTPUT]
                                  [-t TRANSITION_TIME] [-d DISPLAY_TIME] [-s]
                                  [-b]

A simple command line tool to generate an XML file to use for gnome
wallpapers, to have dynamic walls

optional arguments:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to look for the pictures. If no output is
                        specified, will be used too for outputing the dynamic-
                        wallpaper.xml file. Default value is the current
                        directory (.)
  -o OUTPUT, --output OUTPUT
                        Output filename. If no filename is specified, a
                        dynamic-wallpaper.xml file will be generated in the
                        path containing the pictures. You can also use &amp;quot;-&amp;quot; to
                        display the xml in the stdout.
  -t TRANSITION_TIME, --transition-time TRANSITION_TIME
                        Time (in seconds) transitions must last (default value
                        is 2 seconds)
  -d DISPLAY_TIME, --display-time DISPLAY_TIME
                        Time (in seconds) a picture must be displayed. Default
                        value is 900 (15mn)
  -s, --set-background  &amp;#39;&amp;#39;&amp;#39;try to set the background using gnome-appearance-
                        properties
  -b, --debug
|
||
</pre></div>
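&lt;p&gt;For the curious, here is a rough Python 3 sketch of the kind of XML such a file contains. It is not the code of gnome-background-generator: the function name and the picture paths are made up, the element names follow the GNOME slideshow format as I understand it, and the defaults simply mirror the help above (900 seconds of display, 2 seconds of transition):&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_slideshow(pictures, display_time=900, transition_time=2):
    """Build a GNOME-style slideshow XML tree cycling through pictures."""
    root = ET.Element("background")
    # The slideshow needs a reference start time; the date is arbitrary.
    start = ET.SubElement(root, "starttime")
    for tag, value in (("year", 2010), ("month", 1), ("day", 1),
                       ("hour", 0), ("minute", 0), ("second", 0)):
        ET.SubElement(start, tag).text = str(value)
    for index, picture in enumerate(pictures):
        # After the last picture, loop back to the first one.
        following = pictures[(index + 1) % len(pictures)]
        static = ET.SubElement(root, "static")
        ET.SubElement(static, "duration").text = "%.1f" % display_time
        ET.SubElement(static, "file").text = picture
        transition = ET.SubElement(root, "transition")
        ET.SubElement(transition, "duration").text = "%.1f" % transition_time
        ET.SubElement(transition, "from").text = picture
        ET.SubElement(transition, "to").text = following
    return root


tree = build_slideshow(["/home/alexis/Images/walls/a.jpg",
                        "/home/alexis/Images/walls/b.jpg"])
xml_text = ET.tostring(tree, encoding="unicode")
```

&lt;p&gt;Each picture gets one static block followed by one transition to the next picture, which is what makes the wallpaper cycle.&lt;/p&gt;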
|
||
</summary></entry><entry><title>Pelican, a simple static blog generator in python</title><link href="http://blog.notmyidea.org/pelican-a-simple-static-blog-generator-in-python.html" rel="alternate"></link><updated>2010-10-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-10-06:/pelican-a-simple-static-blog-generator-in-python.html/</id><summary type="html">&lt;p&gt;These days, I've written a little Python application to fit my blogging needs.
I'm an occasional blogger, a vim lover, I like reStructuredText and DVCSes, so
I've made a little tool that makes good use of all that.&lt;/p&gt;
|
||
&lt;p&gt;&lt;a class="reference external" href="http://alexis.notmyidea.org/pelican/"&gt;Pelican&lt;/a&gt; (an anagram of &amp;quot;calepin&amp;quot;, French for notebook) is just a simple tool to generate your blog as static
files, letting you use your editor of choice (vim!). It's easy to extend,
and has template support (via jinja2).&lt;/p&gt;
|
||
&lt;p&gt;I've made it to fit &lt;em&gt;my&lt;/em&gt; needs. I hope it will fit yours, but maybe it won't, and
it has not been designed to fit everyone's needs.&lt;/p&gt;
|
||
&lt;p&gt;Need an example? You're looking at it! This weblog is generated with
pelican, including the Atom feeds.&lt;/p&gt;
|
||
&lt;p&gt;I've released it under the AGPL, since I want all the modifications to benefit
all the users.&lt;/p&gt;
|
||
<p>You can find a mercurial repository to fork at <a class="reference external" href="http://hg.lolnet.org/pelican/">http://hg.lolnet.org/pelican/</a>,
|
||
feel free to hack it !</p>
|
||
&lt;p&gt;If you just want to get started, use your installer of choice (pip, easy_install, …)
and then have a look at the help (&lt;cite&gt;pelican --help&lt;/cite&gt;).&lt;/p&gt;
|
||
<div class="highlight"><pre><span class="nv">$ </span>pip install pelican
|
||
</pre></div>
|
||
<div class="section" id="usage">
|
||
<h2>Usage</h2>
|
||
<p>Here's a sample usage of pelican</p>
|
||
<div class="highlight"><pre><span class="nv">$ </span>pelican .
|
||
writing /home/alexis/projets/notmyidea.org/output/index.html
|
||
writing /home/alexis/projets/notmyidea.org/output/tags.html
|
||
writing /home/alexis/projets/notmyidea.org/output/categories.html
|
||
writing /home/alexis/projets/notmyidea.org/output/archives.html
|
||
writing /home/alexis/projets/notmyidea.org/output/category/python.html
|
||
writing /home/alexis/projets/notmyidea.org/output/pelican-a-simple-static-blog-generator-in-python.html
|
||
Done !
|
||
</pre></div>
|
||
&lt;p&gt;You can also use the &lt;cite&gt;--help&lt;/cite&gt; option of the command line to get more
information:&lt;/p&gt;
|
||
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;pelican --help
usage: pelican &lt;span class="o"&gt;[&lt;/span&gt;-h&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-t TEMPLATES&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-o OUTPUT&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-m MARKUP&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-s SETTINGS&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;-b&lt;span class="o"&gt;]&lt;/span&gt;
               path

A tool to generate a static blog, with restructured text input files.

positional arguments:
  path                  Path where to find the content files &lt;span class="o"&gt;(&lt;/span&gt;default is
                        &lt;span class="s2"&gt;&amp;quot;content&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;.

optional arguments:
  -h, --help            show this &lt;span class="nb"&gt;help &lt;/span&gt;message and &lt;span class="nb"&gt;exit&lt;/span&gt;
  -t TEMPLATES, --templates-path TEMPLATES
                        Path where to find the templates. If not specified,
                        will uses the ones included with pelican.
  -o OUTPUT, --output OUTPUT
                        Where to output the generated files. If not specified,
                        a directory will be created, named &lt;span class="s2"&gt;&amp;quot;output&amp;quot;&lt;/span&gt; in the
                        current path.
  -m MARKUP, --markup MARKUP
                        the markup language to use. Currently only
                        ReSTreucturedtext is available.
  -s SETTINGS, --settings SETTINGS
                        the settings of the application. Default to None.
  -b, --debug
|
||
</pre></div>
|
||
<p>Enjoy :)</p>
|
||
</div>
|
||
</summary></entry><entry><title>An amazing summer of code working on distutils2</title><link href="http://blog.notmyidea.org/an-amazing-summer-of-code-working-on-distutils2.html" rel="alternate"></link><updated>2010-08-16T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-08-16:/an-amazing-summer-of-code-working-on-distutils2.html/</id><summary type="html"><p>The <a class="reference external" href="http://code.google.com/soc/">Google Summer of Code</a> I've
|
||
spent working on <a class="reference external" href="http://hg.python.org/distutils2/">distutils2</a>
|
||
is over. It was a really amazing experience, for many reasons.</p>
|
||
<p>First of all, we had a very good team, we were 5 students working
|
||
on distutils2: <a class="reference external" href="http://zubin71.wordpress.com">Zubin</a>,
|
||
<a class="reference external" href="http://wokslog.wordpress.com/">Éric</a>,
|
||
<a class="reference external" href="http://gsoc.djolonga.com/">Josip</a>,
|
||
<a class="reference external" href="http://konryd.blogspot.com/">Konrad</a> and me. In addition,
|
||
&lt;a class="reference external" href="http://mouadino.blogspot.com/"&gt;Mouad&lt;/a&gt; worked on the PyPI
testing infrastructure. You can find what each person did
on
&lt;a class="reference external" href="http://bitbucket.org/tarek/distutils2/wiki/GSoC_2010_teams"&gt;the wiki page of distutils2&lt;/a&gt;.&lt;/p&gt;
|
||
&lt;p&gt;We were in contact with each other really often, helping one another when
possible (in #distutils), and were continuously aware of the state
of each participant's work. This, in my opinion, put us
in good shape.&lt;/p&gt;
|
||
&lt;p&gt;Then, I've learned a lot. Python packaging was completely new to me
when the GSoC started, and I was pretty unfamiliar with
Python good practices too, as I only started learning
Python in late 2009.&lt;/p&gt;
|
||
&lt;p&gt;I've recently looked at some Python code I wrote just three months
ago, and I was amazed at how many improvements I could think of for
it. I guess this is a good indicator of the path I've traveled
since I wrote it.&lt;/p&gt;
|
||
&lt;p&gt;This summer was awesome because I've learned about Python good
practices, gained some strong
&lt;a class="reference external" href="http://mercurial.selenic.com/"&gt;mercurial&lt;/a&gt; knowledge, and
seen a little of how the Python community works.&lt;/p&gt;
|
||
&lt;p&gt;Then, I would like to say a big thanks to all the mentors who were
around when needed, on IRC or via mail, and especially my
mentor for this summer, &lt;a class="reference external" href="http://tarek.ziade.org"&gt;Tarek Ziadé&lt;/a&gt;.&lt;/p&gt;
|
||
&lt;p&gt;Thanks a lot for your motivation, your leadership and your
cheerfulness, even with a newborn and a new job!&lt;/p&gt;
|
||
<div class="section" id="why">
|
||
<h2>Why ?</h2>
|
||
&lt;p&gt;I wanted to work on Python packaging because, as time passed, we
ended up with a rather complex set of tools in this field. Each one wanted
to add features to distutils, but not in a standard way.&lt;/p&gt;
|
||
&lt;p&gt;Now, we have PEPs that describe the formats we agreed on (see PEP
345), and we wanted a tool that users can base their
code on: that's &lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt;.&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="my-job">
|
||
<h2>My job</h2>
|
||
&lt;p&gt;I had to provide a simple way to crawl the PyPI indexes,
and write some installation / uninstallation scripts.&lt;/p&gt;
|
||
<p>All the work done is available in
|
||
<a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/">my bitbucket repository</a>.</p>
|
||
<div class="section" id="crawling-the-pypi-indexes">
|
||
<h3>Crawling the PyPI indexes</h3>
|
||
&lt;p&gt;There are two ways of requesting information from the indexes:
using the &amp;quot;simple&amp;quot; index, which is a kind of REST index, and using
XML-RPC.&lt;/p&gt;
|
||
&lt;p&gt;I've written both implementations, and a high level API to query
the two of them. Basically, this supports the mirroring infrastructure
defined in PEP 381. So far, the work I've done is going to be used in
pip (they've basically copied and pasted the code, but this will change as
soon as we get something completely stable for distutils2), and
that's good news, as it was the main reason I did
it.&lt;/p&gt;
|
||
&lt;p&gt;I've tried to have a unified API for the clients, to switch from
one implementation to another easily. I'm already thinking of
adding other crawlers to this, and it was made to be
extensible.&lt;/p&gt;
|
||
&lt;p&gt;If you want more information about the crawlers/PyPI
clients, please refer to the distutils2 documentation, especially
&lt;a class="reference external" href="http://distutils2.notmyidea.org/library/distutils2.index.html"&gt;the pages about indexes&lt;/a&gt;.&lt;/p&gt;
|
||
&lt;p&gt;You can find the changes I made for this in the
&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt; source code.&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="installation-uninstallation-scripts">
|
||
<h3>Installation / Uninstallation scripts</h3>
|
||
&lt;p&gt;The next step was to think about an installation script, and an
uninstaller. I've not done the uninstaller yet, but it's a small
part, as it's basically removing some files from the system, so
I'll probably do it in the near future.&lt;/p&gt;
|
||
&lt;p&gt;&lt;a class="reference external" href="http://hg.python.org/distutils2/"&gt;distutils2&lt;/a&gt; provides a way to
install distributions, and to handle dependencies between releases.
For now, this only supports the latest version of the
metadata (1.2, see PEP 345), but I'm working on a
compatibility layer for the old metadata, and for the information
provided via pip requires.txt, for instance.&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="extra-work">
|
||
<h3>Extra work</h3>
|
||
&lt;p&gt;Also, I've done some extra work. This includes:&lt;/p&gt;
|
||
<ul class="simple">
|
||
<li>working on the PEP 345, and having some discussion about it
|
||
(about the names of some fields).</li>
|
||
&lt;li&gt;writing a PyPI server mock, useful for tests. You can find more
information about it in the
&lt;a class="reference external" href="http://distutils.notmyidea.org"&gt;documentation&lt;/a&gt;.&lt;/li&gt;
|
||
</ul>
|
||
</div>
|
||
</div>
|
||
<div class="section" id="futures-plans">
|
||
&lt;h2&gt;Future plans&lt;/h2&gt;
|
||
&lt;p&gt;As I said, I've enjoyed working on distutils2, and the people I've
met here are really pleasant to work with. So I &lt;em&gt;want&lt;/em&gt; to continue
contributing to Python, and especially to Python packaging, because
there are still a lot of things to do in this area to get
something really usable.&lt;/p&gt;
|
||
&lt;p&gt;I'm not fully satisfied with the work I've done, so I'll probably
tweak it a bit: the installer part is not yet completely finished,
and I want to add support for a real
&lt;a class="reference external" href="http://en.wikipedia.org/wiki/Representational_State_Transfer"&gt;REST&lt;/a&gt;
index in the future.&lt;/p&gt;
|
||
&lt;p&gt;We'll probably talk about this again in the coming months, but we
definitely need a real
&lt;a class="reference external" href="http://en.wikipedia.org/wiki/Representational_State_Transfer"&gt;REST&lt;/a&gt;
API for &lt;a class="reference external" href="http://pypi.python.org"&gt;PyPI&lt;/a&gt;, as the &amp;quot;simple&amp;quot; index
&lt;em&gt;is&lt;/em&gt; an ugly hack, in my opinion. I'll work on a serious
proposal about this, maybe involving
&lt;a class="reference external" href="http://couchdb.org"&gt;CouchDB&lt;/a&gt;, as it seems to be a good option
for what we want here.&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="issues">
|
||
<h2>Issues</h2>
|
||
&lt;p&gt;I've encountered some issues during this summer. The main one is
that it's hard to work remotely, especially when working in the same room
you live in, with other people around. I like to just think about a project
with other people, with paper and a pencil, no computers. This was
not really possible at the start of the project, as I needed to
read a lot of code to understand the codebase, and then to
read/write emails.&lt;/p&gt;
|
||
<p>I've finally managed to work in an office, so good point for
|
||
home/office separation.</p>
|
||
&lt;p&gt;I had not expected such a high number of emails to read
in order to follow what's going on in the Python world. Being a part of
the community takes some time spent reading and writing emails,
especially for those (like me) who aren't so comfortable with
English (but this has brought me some English fu!).&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="thanks">
|
||
<h2>Thanks !</h2>
|
||
&lt;p&gt;A big thanks to &lt;a class="reference external" href="http://www.graine-libre.fr/"&gt;Graine Libre&lt;/a&gt; and
&lt;a class="reference external" href="http://www.makina-corpus.com/"&gt;Makina Corpus&lt;/a&gt;, which offered
to let me come into their offices from time to time, to share their
cheerfulness! Many thanks too to the Google Summer of Code program
for setting up such an initiative. If you're a student and you're
interested in FOSS, don't hesitate for a second: it's a really good
opportunity to work on interesting projects!&lt;/p&gt;
|
||
</div>
|
||
</summary></entry><entry><title>Introducing the distutils2 index crawlers</title><link href="http://blog.notmyidea.org/introducing-the-distutils2-index-crawlers.html" rel="alternate"></link><updated>2010-07-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-07-06:/introducing-the-distutils2-index-crawlers.html/</id><summary type="html">&lt;p&gt;I've been working on distutils2 for about a month now, even if I've been a
bit busy (as I had some classes and exams to work on).&lt;/p&gt;
|
||
&lt;p&gt;I'll try to sum up my general feelings here, and the work I've done
so far. You can also find, if you're interested, my weekly
summaries on
&lt;a class="reference external" href="http://wiki.notmyidea.org/distutils2_schedule"&gt;a dedicated wiki page&lt;/a&gt;.&lt;/p&gt;
|
||
<div class="section" id="general-feelings">
|
||
<h2>General feelings</h2>
|
||
&lt;p&gt;First, and it's a really important point, the GSoC is going very
well, for me as for the other students, at least from my perspective.
It's a pleasure to work with such enthusiastic people, as this makes
the overall atmosphere very pleasant.&lt;/p&gt;
|
||
&lt;p&gt;First of all, I've spent time reading the existing codebase, to
understand what we're going to do, and the rationale for doing
so.&lt;/p&gt;
|
||
&lt;p&gt;It's really clear to me now: what we're building is the
foundation of a packaging infrastructure in Python. The fact is
that many projects coexist, and each comes with its own good
concepts. Distutils2 tries to take the interesting parts of all of them,
and to provide them in the Python standard library, respecting the
recently written PEPs about packaging.&lt;/p&gt;
|
||
<p>With distutils2, it will be simpler to make &quot;things&quot; compatible. So
|
||
if you think about a new way to deal with distributions and
|
||
packaging in python, you can use the Distutils2 APIs to do so.</p>
|
||
</div>
|
||
<div class="section" id="tasks">
|
||
<h2>Tasks</h2>
|
||
&lt;p&gt;My main task while working on distutils2 is to provide an
installation and an uninstallation command, as described in PEP
376. For this, I first need to get information about the existing
distributions (their version, name, metadata, dependencies,
etc.).&lt;/p&gt;
|
||
&lt;p&gt;The main index, which you probably know and use, is PyPI. You can access
it at &lt;a class="reference external" href="http://pypi.python.org"&gt;http://pypi.python.org&lt;/a&gt;.&lt;/p&gt;
|
||
</div>
|
||
<div class="section" id="pypi-index-crawling">
|
||
<h2>PyPI index crawling</h2>
|
||
&lt;p&gt;There are two ways to get this information from PyPI: using the
simple API, or via XML-RPC calls.&lt;/p&gt;
|
||
&lt;p&gt;One goal was to use the version specifiers defined
in &lt;a class="reference external" href="http://www.python.org/dev/peps/pep-0345/"&gt;PEP 345&lt;/a&gt;, and to
provide a way to sort the retrieved distributions according to our
needs, to pick the version we want/need.&lt;/p&gt;
|
||
<div class="section" id="using-the-simple-api">
|
||
<h3>Using the simple API</h3>
|
||
<p>The simple API is composed of HTML pages you can access at
|
||
<a class="reference external" href="http://pypi.python.org/simple/">http://pypi.python.org/simple/</a>.</p>
|
||
&lt;p&gt;Distribute and Setuptools already provide a crawler for that, but
it is tied to their internal mechanisms, and I found that the code
was not as clear as I wanted. That's why I've preferred to pick up
the good ideas, and some implementation details, while rethinking
the global architecture.&lt;/p&gt;
|
||
&lt;p&gt;The rules are simple: each project has a dedicated page, which
allows us to get information about:&lt;/p&gt;
|
||
<ul class="simple">
|
||
<li>the distribution download locations (for some versions)</li>
|
||
<li>homepage links</li>
|
||
<li>some other useful information, such as the bug tracker address, for
|
||
instance.</li>
|
||
</ul>
|
||
<p>If you want to find all the distributions of the &quot;EggsAndSpam&quot;
|
||
project, you could do the following (do not pay too much attention to
|
||
the names here, as the API will probably change a bit):</p>
|
||
<div class="highlight"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">index</span> <span class="o">=</span> <span class="n">SimpleIndex</span><span class="p">()</span>
|
||
<span class="o">&gt;&gt;&gt;</span> <span class="n">index</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s">&quot;EggsAndSpam&quot;</span><span class="p">)</span>
|
||
<span class="p">[</span><span class="n">EggsAndSpam</span> <span class="mf">1.1</span><span class="p">,</span> <span class="n">EggsAndSpam</span> <span class="mf">1.2</span><span class="p">,</span> <span class="n">EggsAndSpam</span> <span class="mf">1.3</span><span class="p">]</span>
|
||
</pre></div>
|
||
<p>We could also use version specifiers:</p>
|
||
<div class="highlight"><pre><span class="o">&gt;&gt;&gt;</span> <span class="n">index</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s">&quot;EggsAndSpam (&lt;=1.2)&quot;</span><span class="p">)</span>
|
||
<span class="p">[</span><span class="n">EggsAndSpam</span> <span class="mf">1.1</span><span class="p">,</span> <span class="n">EggsAndSpam</span> <span class="mf">1.2</span><span class="p">]</span>
|
||
</pre></div>
|
||
<p>Internally, what's done here is the following:</p>
|
||
<ul class="simple">
|
||
<li>it processes the
|
||
<a class="reference external" href="http://pypi.python.org/simple/FooBar/">http://pypi.python.org/simple/FooBar/</a>
|
||
page, searching for download URLs.</li>
|
||
<li>for each found distribution download URL, it creates an object,
|
||
containing information about the project name, the version and the
URL where the archive lives.</li>
|
||
<li>it sorts the found distributions, using version numbers. The
|
||
default behavior here is to prefer source distributions (over
|
||
binary ones), and to rely on the latest &quot;final&quot; distribution (rather
|
||
than beta, alpha etc. ones)</li>
|
||
</ul>
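<p>The last two steps above can be sketched in a few lines of Python. This is only an illustration, not the distutils2 code: the Dist record, the helper names and the example.org URLs are made up for the example, and the version handling is much cruder than what PEP 345 describes.</p>

```python
import re
from collections import namedtuple

# Hypothetical record type, invented for this sketch.
Dist = namedtuple("Dist", "name version url is_source is_final")

SOURCE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".zip")
BINARY_EXTENSIONS = (".egg", ".exe")


def parse_download_url(url):
    """Build a Dist from a URL ending in e.g. EggsAndSpam-1.2.tar.gz."""
    filename = url.rsplit("/", 1)[-1]
    for ext in SOURCE_EXTENSIONS + BINARY_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[:-len(ext)]
            break
    else:
        return None  # not an archive we know how to handle
    match = re.match(r"([A-Za-z0-9_.]+)-(\d[A-Za-z0-9.]*)", stem)
    if match is None:
        return None
    name, version = match.groups()
    # A "final" version is made of digits and dots only (no a1, b2, rc1...).
    is_final = re.fullmatch(r"\d+(\.\d+)*", version) is not None
    return Dist(name, version, url, ext in SOURCE_EXTENSIONS, is_final)


def release_number(version):
    """Turn the leading numeric part of '1.3b1' into (1, 3)."""
    numeric = re.match(r"\d+(\.\d+)*", version).group()
    return tuple(int(part) for part in numeric.split("."))


def sort_distributions(dists):
    """Best candidate first: final, then newest, then source archive."""
    return sorted(
        dists,
        key=lambda d: (d.is_final, release_number(d.version), d.is_source),
        reverse=True,
    )


urls = [
    "http://example.org/packages/EggsAndSpam-1.1.tar.gz",
    "http://example.org/packages/EggsAndSpam-1.3b1.tar.gz",
    "http://example.org/packages/EggsAndSpam-1.2-py2.6.egg",
    "http://example.org/packages/EggsAndSpam-1.2.tar.gz",
]
for dist in sort_distributions([parse_download_url(u) for u in urls]):
    print(dist.name, dist.version, "source" if dist.is_source else "binary")
```

<p>Sorting on a tuple keeps the preference order explicit: final releases come first, then the highest version, then source archives.</p>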
|
||
<p>So, nothing hard or difficult here.</p>
|
||
<p>We provide a bunch of other features, like relying on the new PyPI
mirroring infrastructure or filtering the found distributions by some
criteria. If you're curious, please browse the
|
||
<a class="reference external" href="http://distutils2.notmyidea.org/">distutils2 documentation</a>.</p>
|
||
</div>
|
||
<div class="section" id="using-xml-rpc">
|
||
<h3>Using xml-rpc</h3>
|
||
<p>We can also make XML-RPC calls to retrieve information from
PyPI. It's a much more reliable way to get information from
the index (as the index itself provides the information),
but it costs processing time on the remote PyPI server.</p>
|
||
<p>For now, querying PyPI through XML-RPC is not available in
Distutils2, as I'm still working on it. The main pieces are already
present (I'll reuse some work I've done for the SimpleIndex
querying, and
<a class="reference external" href="http://github.com/ametaireau/pypiclient">some code already set up</a>).
What I need to do is provide an XML-RPC PyPI mock server, and
that's what I'm currently working on.</p>
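<p>To give an idea of what such a mock looks like, here is a minimal sketch written with the modern Python 3 standard library. It is not the distutils2 mock: the canned RELEASES data and the start_mock_index helper are made up, and the stub only imitates the package_releases method that PyPI's XML-RPC interface exposed, with a simplified one-argument signature.</p>

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Canned data served by the stub (hypothetical project and versions).
RELEASES = {"EggsAndSpam": ["1.3", "1.2", "1.1"]}


def package_releases(name):
    """Return the versions known for a project, or an empty list."""
    return RELEASES.get(name, [])


def start_mock_index():
    """Start the stub on a free local port and return (server, url)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(package_releases)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    host, port = server.server_address
    return server, "http://%s:%d/" % (host, port)


server, url = start_mock_index()
client = ServerProxy(url)
versions = client.package_releases("EggsAndSpam")
print(versions)
server.shutdown()
server.server_close()
```

<p>Because the stub listens on port 0, the operating system picks a free port, so several tests can each start their own server without clashing.</p>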
|
||
</div>
|
||
</div>
|
||
<div class="section" id="processes">
|
||
<h2>Processes</h2>
|
||
<p>For now, I'm trying to follow the &quot;documentation, then test, then
|
||
code&quot; path, and that seems to be really needed while working with a
|
||
community. Code is hard to read and understand compared to
documentation, and documentation is easier to change.</p>
|
||
<p>While writing the simple index crawler, I should have done this:
it would have avoided some API changes and some lost time.</p>
|
||
<p>Also, I've set up
|
||
<a class="reference external" href="http://wiki.notmyidea.org/distutils2_schedule">a schedule</a>, and
|
||
the goal is to be sure everything will be ready in time, for the
|
||
end of the summer. (And now, I need to learn to follow schedules
|
||
...)</p>
|
||
</div>
|
||
</summary></entry><entry><title>Sprinting on distutils2 in Tours</title><link href="http://blog.notmyidea.org/sprinting-on-distutils2-in-tours.html" rel="alternate"></link><updated>2010-07-06T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-07-06:/sprinting-on-distutils2-in-tours.html/</id><summary type="html"><p>Yesterday, as I was traveling to Tours, I took some time to
|
||
visit Éric, another student who's working on distutils2 this
|
||
summer, as part of the GSoC. Basically, it was to have a drink,
discuss distutils2 a bit, our respective tasks and general
feelings, and to put a face to a pseudonym. I really enjoyed this
time, because Éric knows a lot about mercurial and python
good practices, and I'm eager to learn about those. So, we
discussed things and did not write much code, but we have some
things to propose about documentation, and I also share
here some snippets of the conversations we had.</p>
|
||
<div class="section" id="documentation">
|
||
<h2>Documentation</h2>
|
||
<p>While writing the PyPI simple index crawler documentation, I
realized that we are missing some structure, or a how-to, for the
documentation. Yep, you read that right. We lack documentation on how to
write documentation. Heh. We're missing some rules to follow, and
this leads to a not-so-structured final documentation. We probably
target three types of audience, and we can split the documentation
accordingly:</p>
|
||
<ul class="simple">
|
||
<li><strong>Packagers</strong> who want to distribute their software.</li>
|
||
<li><strong>End users</strong> who need to understand how to use end user
|
||
commands, like the installer/uninstaller</li>
|
||
<li><strong>Packaging coders</strong> who <em>use</em> distutils2, as a base for
|
||
building a package manager.</li>
|
||
</ul>
|
||
<p>We also need to discuss a pattern to follow while writing
documentation. How many parts do we need? Where do we put the API
description? And so on. That may not seem so important, but I
guess readers would appreciate having the same structure throughout
the distutils2 documentation.</p>
|
||
</div>
|
||
<div class="section" id="mercurial">
|
||
<h2>Mercurial</h2>
|
||
<p>I'm really <em>not</em> a mercurial power user. I use it on a daily basis,
but I lack basic knowledge about it. Big thanks, Éric, for sharing
yours with me, you're a great help. We talked about some
mercurial extensions that seem to make life simpler when
used the right way. I've not used them so far, so consider this
a personal note.</p>
|
||
<ul class="simple">
|
||
<li>hg histedit, to edit the history</li>
|
||
<li>hg crecord, to select the changes to commit</li>
|
||
</ul>
|
||
<p>We spent some time reviewing a merge I made on Sunday,
re-merging it, and committing the changes as a new changeset. Awesome.
These things make me say I <strong>need</strong> to read
<a class="reference external" href="http://hgbook.red-bean.com/read/">the hg book</a>, and I will as
soon as I get some spare time: mercurial seems to be simply great.
So ... Great. I'm a powerful merger now!</p>
|
||
</div>
|
||
<div class="section" id="on-using-tools">
|
||
<h2>On using tools</h2>
|
||
<p>Because we are <em>also</em> <em>hackers</em>, we shared a bit about how we
code, the tools we use, etc. Both of us use vim, and I
discovered vimdiff and hgtk, which will completely change the way I
navigate the mercurial history. We aren't &quot;power users&quot;, so we
learned vim tips from each other. You can find
<a class="reference external" href="http://github.com/ametaireau/dotfiles">my dotfiles on github</a>,
if that can help. They're not perfect, and not intended to be,
because they change all the time as I learn. Don't hesitate to have a
look, and to propose enhancements if you have any!</p>
|
||
</div>
|
||
<div class="section" id="on-being-pythonic">
|
||
<h2>On being pythonic</h2>
|
||
<p>My background as a long-time Java user has not served me well so far, as the
paradigms are not the same when coding in python. It's hard to find the
most pythonic way to do things, and sometimes hard to unlearn my way of
thinking about software engineering. Well, it seems that the only
solution is to read code, and to re-read import this from time to
time!
<a class="reference external" href="http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html">Coding like a pythonista</a>
seems to be a must-read, so I know what to do.</p>
|
||
</div>
|
||
<div class="section" id="conclusion">
|
||
<h2>Conclusion</h2>
|
||
<p>It was really great. Next time, we'll need to focus a bit more on
|
||
distutils2, and to have a bullet list of things to do, but days
|
||
like this one are opportunities to seize! We'll probably do
|
||
another sprint in a few weeks, stay tuned !</p>
|
||
</div>
|
||
</summary></entry><entry><title>Use Restructured Text (ReST) to power your presentations</title><link href="http://blog.notmyidea.org/use-restructured-text-rest-to-power-your-presentations.html" rel="alternate"></link><updated>2010-06-25T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-06-25:/use-restructured-text-rest-to-power-your-presentations.html/</id><summary type="html"><p>On Wednesday, some friends and I gave a presentation about the
CouchDB database to
<a class="reference external" href="http://www.toulibre.org">the Toulouse local LUG</a>. Thanks a lot
to everyone who attended, it was a pleasure to talk
about this topic with you. Too bad the season is over now and I'm leaving
Toulouse next year.</p>
|
||
<p>During our brainstorming about the topic, we
used some paper, and we wanted to make the presentation in the simplest
way. The first thing that came to my mind was using
<a class="reference external" href="http://docutils.sourceforge.net/rst.html">restructured text</a>, so
I wrote a simple file containing our different bullet points. In
fact, there is almost nothing more to do after that to get a working
presentation.</p>
|
||
<p>So far, I've used
|
||
<a class="reference external" href="http://code.google.com/p/rst2pdf/">the rst2pdf program</a>, and a
|
||
simple template, to generate output. It's probably simple to have
|
||
similar results using latex + beamer, I'll try this next time, but
|
||
as I'm not familiar with latex syntax, restructured text was a
|
||
great option.</p>
|
||
<p>Here are
|
||
<a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/couchdb/couchdb.pdf">the final PDF output</a>,
|
||
<a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/couchdb/couchdb.rst">the ReST source</a>,
|
||
<a class="reference external" href="http://files.lolnet.org/alexis/rst-presentations/slides.style">the theme used</a>,
|
||
and the command line to generate the PDF:</p>
|
||
<pre class="literal-block">
|
||
rst2pdf couchdb.rst -b1 -s ../slides.style
|
||
</pre>
|
||
</summary></entry><entry><title>first week working on distutils2</title><link href="http://blog.notmyidea.org/first-week-working-on-distutils2.html" rel="alternate"></link><updated>2010-06-04T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-06-04:/first-week-working-on-distutils2.html/</id><summary type="html"><p>As I've been working on
|
||
<a class="reference external" href="http://hg.python.org/distutils2/">Distutils2</a> during the past
|
||
week, taking part in the
|
||
<a class="reference external" href="http://code.google.com/intl/fr/soc/">GSOC</a> program, here is a
|
||
short summary of what I've done so far.</p>
|
||
<p>As my courses are not over yet, I've not worked as much as I
|
||
wanted, and this will continue until the end of June. My main
|
||
tasks are about making installation and uninstallation commands, to
|
||
have a simple way to install distributions via
|
||
<a class="reference external" href="http://hg.python.org/distutils2/">Distutils2</a>.</p>
|
||
<p>To do this, we need to rely on information provided by the Python
Package Index (<a class="reference external" href="http://pypi.python.org/">PyPI</a>), and there are at
least two ways to retrieve information from it: XML-RPC and the
|
||
&quot;simple&quot; API.</p>
|
||
<p>So, I've been working on porting some
|
||
<a class="reference external" href="http://bitbucket.org/tarek/distribute/">Distribute</a> related
|
||
stuff to <a class="reference external" href="http://hg.python.org/distutils2/">Distutils2</a>, cutting
|
||
out everything unrelated to distutils, as we do not want to depend on
Distribute's internals. My main work has been reading the
whole code, writing tests for it and making those tests
possible.</p>
|
||
<p>In fact, we needed a mock PyPI server, and, after
reading and getting familiar with the distutils behaviors and code,
I've taken some time to improve the work
<a class="reference external" href="http://bitbucket.org/konrad">Konrad</a> did on this mock.</p>
|
||
<div class="section" id="a-pypi-server-mock">
|
||
<h2>A PyPI Server mock</h2>
|
||
<p>The mock is embedded in a thread, to make it available during the
tests in a non-blocking way. We first used
|
||
<a class="reference external" href="http://wsgi.org">WSGI</a> and
|
||
<a class="reference external" href="http://docs.python.org/library/wsgiref.html">wsgiref</a> in order
|
||
to control what to serve, and to log the requests made to the server,
|
||
but finally realised that
|
||
<a class="reference external" href="http://docs.python.org/library/wsgiref.html">wsgiref</a> is not
|
||
python 2.4 compatible (and we <em>need</em> to be python 2.4 compatible in
|
||
Distutils2).</p>
|
||
<p>So, we switched to
|
||
<a class="reference external" href="http://docs.python.org/library/basehttpserver.html">BaseHTTPServer</a>
|
||
and
|
||
<a class="reference external" href="http://docs.python.org/library/simplehttpserver.html">SimpleHTTPServer</a>,
|
||
and updated our tests accordingly. It's been an opportunity to
|
||
realize that <a class="reference external" href="http://wsgi.org">WSGI</a> has been a great step
|
||
forward for making HTTP servers, and exposes a much simpler way
to talk HTTP!</p>
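<p>The idea of a mock server running in a thread and logging the requests it receives can be sketched as follows. This is not the distutils2 test framework: the MockIndexServer class and the canned page are invented for the example, and it uses http.server, the Python 3 home of BaseHTTPServer, rather than the python 2.4 modules mentioned above.</p>

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockIndexServer:
    """Serve canned pages from a background thread and log request paths."""

    def __init__(self, pages):
        self.requests = []
        mock = self

        class Handler(BaseHTTPRequestHandler):
            def do_GET(self):
                mock.requests.append(self.path)
                body = pages.get(self.path)
                if body is None:
                    self.send_error(404)
                    return
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)

            def log_message(self, *args):
                pass  # keep the test output quiet

        # Port 0 asks the operating system for any free port.
        self.httpd = HTTPServer(("127.0.0.1", 0), Handler)
        self.url = "http://127.0.0.1:%d" % self.httpd.server_address[1]

    def start(self):
        threading.Thread(target=self.httpd.serve_forever, daemon=True).start()

    def stop(self):
        self.httpd.shutdown()
        self.httpd.server_close()


server = MockIndexServer({"/simple/EggsAndSpam/": b"EggsAndSpam-1.2.tar.gz"})
server.start()
# Bypass any proxy configured in the environment for this local call.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(server.url + "/simple/EggsAndSpam/") as response:
    body = response.read()
server.stop()
print(body, server.requests)
```

<p>The test code keeps running while the server answers in its own thread, and the requests list lets a test check which pages the crawler actually asked for.</p>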
|
||
<p>You can find
|
||
<a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/changesets">the modifications I made</a>,
|
||
and the
|
||
<a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/src/tip/docs/source/test_framework.rst">related docs</a>
|
||
about this on
|
||
<a class="reference external" href="http://bitbucket.org/ametaireau/distutils2/">my bitbucket distutils2 clone</a>.</p>
|
||
</div>
|
||
<div class="section" id="the-pypi-simple-api">
|
||
<h2>The PyPI Simple API</h2>
|
||
<p>So, back to the main problem: making a python library to access
and query information stored on PyPI, via the simple API. As I
said, I've just grabbed the work done in
<a class="reference external" href="http://bitbucket.org/tarek/distribute/">Distribute</a>, played
with it a bit to see what the different use cases are, and
started to write the related tests.</p>
|
||
</div>
|
||
<div class="section" id="the-work-to-come">
|
||
<h2>The work to come</h2>
|
||
<p>So, once all use cases are covered with tests, I'll rewrite the
grabbed code a bit, and do some software design work (not exposing
everything as private methods, having a clear API, and other things like
this), then update the tests accordingly and write documentation
to make this clear.</p>
|
||
<p>The next step is to write a little client, as I've
<a class="reference external" href="http://github.com/ametaireau/pypiclient">already started here</a>.
I'll keep you updated!</p>
|
||
</div>
|
||
</summary></entry><entry><title>A Distutils2 GSoC</title><link href="http://blog.notmyidea.org/a-distutils2-gsoc.html" rel="alternate"></link><updated>2010-05-01T00:00:00+02:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2010-05-01:/a-distutils2-gsoc.html/</id><summary type="html"><p>WOW. I've been accepted to be a part of the
|
||
<a class="reference external" href="http://code.google.com/intl/fr/soc/">Google Summer Of Code</a>
|
||
program, and will work on <a class="reference external" href="http://python.org/">python</a>
|
||
<a class="reference external" href="http://hg.python.org/distutils2/">distutils2</a>, with
|
||
<a class="reference external" href="http://pygsoc.wordpress.com/">a</a>
|
||
<a class="reference external" href="http://konryd.blogspot.com/">lot</a> <a class="reference external" href="http://ziade.org/">of</a>
|
||
(interesting!) <a class="reference external" href="http://zubin71.wordpress.com/">people</a>.</p>
|
||
<blockquote>
|
||
So, it's about building the successor of Distutils, i.e. &quot;the
python package manager&quot;. Today, there are too&nbsp;many ways to package a
|
||
python application (pip, setuptools, distribute, distutils, etc.)
|
||
so&nbsp;there is a huge effort to make in order to make all this
|
||
packaging stuff interoperable, as pointed out by
|
||
the&nbsp;<a class="reference external" href="http://www.python.org/dev/peps/pep-0376/">PEP 376</a>.</blockquote>
|
||
<p>In more detail, I'm going to work on the Installer / Uninstaller
|
||
features of Distutils2, and on a PyPI XML-RPC client for distutils2.
|
||
Here are the already defined tasks:</p>
|
||
<ul class="simple">
|
||
<li>Implement Distutils2 APIs described in PEP 376.</li>
|
||
<li>Add the uninstall command.</li>
|
||
<li>think about a basic installer / uninstaller script. (with deps)
|
||
-- similar to pip/easy_install</li>
|
||
<li>in a pypi subpackage;</li>
|
||
<li>Integrate a module similar to setuptools' package_index</li>
|
||
<li>PyPI XML-RPC client for distutils 2:
|
||
<a class="reference external" href="http://bugs.python.org/issue8190">http://bugs.python.org/issue8190</a></li>
|
||
</ul>
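<p>To make the first task more concrete: PEP 376 describes a database of installed distributions and ways to query it. As a rough illustration only, and not the Distutils2 API, today's standard library module importlib.metadata (which did not exist when this was written) can list what is installed. The installed_distributions helper name is made up for this sketch.</p>

```python
from importlib import metadata


def installed_distributions():
    """Return sorted (name, version) pairs for the installed distributions."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.version
        if name and version:  # skip broken metadata directories
            found.add((name, version))
    return sorted(found)


for name, version in installed_distributions()[:5]:
    print(name, version)
```

<p>An uninstall command needs exactly this kind of information, plus the list of files each distribution installed.</p>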
|
||
<p>As I'm relatively new to python, I'll need some extra work in order
to apply all the good practices, among other things that can make a
developer's life joyful. I'll post my progress here each week,
along with my thoughts about python and especially the python packaging world.</p>
|
||
</summary></entry><entry><title>Python ? go !</title><link href="http://blog.notmyidea.org/python-go.html" rel="alternate"></link><updated>2009-12-17T00:00:00+01:00</updated><author><name>Alexis Métaireau</name></author><id>tag:blog.notmyidea.org,2009-12-17:/python-go.html/</id><summary type="html"><p>I have now been working on a
<a class="reference external" href="http://www.djangoproject.org">django</a> project for a little over a month, and,
necessarily, I am learning <a class="reference external" href="http://python.org/">Python</a>. I
take undisguised pleasure in discovering this language (and in
using it), and it never ceases to surprise me. The first words that
come to mind about Python are &quot;logical&quot; and
&quot;simple&quot;. And yet powerful all the same. I never miss
an opportunity to do a bit of <em>evangelism</em> to the
few people willing to listen to me.</p>
|
||
<div class="section" id="the-zen-of-python">
|
||
<h2>The Zen of Python</h2>
|
||
<p>Before anything else, I think it is worth quoting Tim Peters and
<a class="reference external" href="http://www.python.org/dev/peps/pep-0020/">PEP 20</a>, which
make a very good introduction to the language and take the
form of an <em>easter egg</em> shipped with python:</p>
|
||
<div class="highlight"><pre>&gt;&gt;&gt; import this
|
||
The Zen of Python, by Tim Peters
|
||
|
||
Beautiful is better than ugly.
|
||
Explicit is better than implicit.
|
||
Simple is better than complex.
|
||
Complex is better than complicated.
|
||
Flat is better than nested.
|
||
Sparse is better than dense.
|
||
Readability counts.
|
||
Special cases aren&#39;t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you&#39;re Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it&#39;s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let&#39;s do more of those!
|
||
</pre></div>
|
||
<p>I have the vague feeling that this is what I have always tried to
do in PHP, and particularly in
<a class="reference external" href="http://www.spiral-project.org">the Spiral framework</a>, but by
adding these concepts in a layer on top of the language. Here, it is
the very <em>spirit</em> of python, which means
that most python libraries follow these concepts. Isn't
life beautiful?</p>
|
||
</div>
|
||
<div class="section" id="comment-commencer-et-par-ou">
|
||
<h2>How to start, and where?</h2>
|
||
<p>For my part, I started by reading a few interesting books and
articles, which make a good introduction to the
subject (the list is obviously not exhaustive, and your
comments are welcome):</p>
|
||
<ul class="simple">
|
||
<li><a class="reference external" href="http://diveintopython.adrahon.org/">Dive into python</a></li>
|
||
<li><a class="reference external" href="http://www.swaroopch.com/notes/Python_fr:Table_des_Matières">A byte of python</a></li>
|
||
<li><a class="reference external" href="http://www.amazon.fr/Python-Petit-guide-lusage-développeur/dp/2100508830">Python: petit guide à l'usage du développeur agile</a>
|
||
by <a class="reference external" href="http://tarekziade.wordpress.com/">Tarek Ziadé</a></li>
<li><a class="reference external" href="http://docs.python.org/index.html">The official python documentation</a>,
of course!</li>
<li><a class="reference external" href="http://video.pycon.fr/videos/pycon-fr-2009/">The pyconfr 2009 videos</a>!</li>
<li>A bit of time, and an open python console :)</li>
|
||
</ul>
|
||
<p>I also try to share as much as possible the resources I
find from time to time, whether
<a class="reference external" href="http://www.twitter.com/ametaireau">via twitter</a> or
<a class="reference external" href="http://delicious.com/ametaireau">via my delicious account</a>.
Go have a look at
<a class="reference external" href="http://delicious.com/ametaireau/python">the python tag</a> on my
profile, maybe you will find interesting things, who
knows!</p>
|
||
</div>
|
||
<div class="section" id="un-python-sexy">
|
||
<h2>A sexy python</h2>
|
||
<p>A few features that should make your mouth
water:</p>
|
||
<ul class="simple">
|
||
<li><a class="reference external" href="http://docs.python.org/library/stdtypes.html#comparisons">Chaining comparison operators</a>
is possible (a &lt; b &lt; c in a condition)</li>
<li>Multiple assignment (you can write a,b,c
= 1,2,3 for instance)</li>
<li><a class="reference external" href="http://docs.python.org/tutorial/datastructures.html">Lists</a>
are simple to manipulate!</li>
<li><a class="reference external" href="http://docs.python.org/tutorial/datastructures.html#list-comprehensions">List comprehensions</a>,
or how to perform complex operations on lists in a
simple way.</li>
<li><a class="reference external" href="http://docs.python.org/library/doctest.html?highlight=doctest">Doctests</a>,
or how to write tests directly in the documentation of your
classes, while documenting them with real examples.</li>
<li><a class="reference external" href="http://www.python.org/doc/essays/metaclasses/meta-vladimir.txt">Metaclasses</a>,
or how to control the way classes are built</li>
<li>Python is
<a class="reference external" href="http://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic%20language%20and%20also%20a%20strongly%20typed%20language">a strongly, dynamically typed language</a>:
weak typing is what annoyed me with PHP, which is a weakly,
dynamically typed language.</li>
|
||
</ul>
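<p>A few of the features listed above, shown as a short sketch. The variable names are made up for the example and carry no special meaning.</p>

```python
# Multiple assignment: three names bound in one statement.
a, b, c = 1, 2, 3

# List comprehension: squares of the odd values only.
odd_squares = [n * n for n in [a, b, c] if n % 2 == 1]

# Strong typing: mixing str and int raises instead of converting silently.
try:
    "1" + 1
    mixed = "no error"
except TypeError:
    mixed = "TypeError"

# Dynamic typing: the same name can be rebound to a value of another type.
value = 42
value = "forty-two"

print(odd_squares, mixed, value)
```

<p>The try block is the point of the last bullet: PHP would quietly convert one of the operands, whereas python refuses to guess.</p>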
|
||
<p>You can also go and watch
<a class="reference external" href="http://video.pycon.fr/videos/free/53/">the workshop given by Victor Stinner during Pyconfr 09</a>.
Have fun!</p>
|
||
</div>
|
||
</summary></entry></feed> |