I’ve realised that one of the plugins I use to make this blog is not working
correctly. I use the
more_categories plugin to:
- add subcategories
- assign multiple categories to articles.
Subcategories aren’t working, and Pelican thinks each article just has categories that contain forward slashes.
In his “Powerful Python” emails, Aaron Maxwell recommends looking at the source code for popular Python libraries to see how really good Python is written, and how talented developers write code and solve problems.
This is a good opportunity to look at the code that powers the plugin and see if I can:
- Understand the source code
- Locate the source of the problem
- Fix the problem
I don’t know whether Pelican’s code is amazingly good quality or not, and I get the feeling it could do with more developer resources. But I’ve got a real reason and motivation to look at the underlying code, so I’m going to give it a shot.
The documentation is sparse, which doesn’t help. I get the impression that whoever wrote it feels that Pelican is simple and that it’s obvious what’s going on [1]. It’s not obvious to me.
Every plugin has to have a
register() function. Here it is for more_categories:
    def register():
        signals.article_generator_context.connect(get_categories)
        signals.article_generator_finalized.connect(create_categories)
I understand the idea of signals from Django, and generators are discussed a bit in the documentation. So what else is happening…
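To pin down what I think connect() is doing, here is a minimal sketch of the signal idea. The Signal class below is my own stand-in, not Pelican’s code (I believe Pelican’s real signals come from the blinker library), and the receiver body and the placeholder generator are made up for illustration.

```python
# Minimal stand-in for a signal object. This is NOT Pelican's implementation,
# only an illustration of the connect/send idea.
class Signal:
    def __init__(self, name):
        self.name = name
        self._receivers = []

    def connect(self, receiver):
        """Remember a callable to run when the signal is sent."""
        self._receivers.append(receiver)

    def send(self, *args, **kwargs):
        """Call every connected receiver and collect the return values."""
        return [receiver(*args, **kwargs) for receiver in self._receivers]


# Hypothetical signal and receiver, loosely mirroring the plugin's names.
article_generator_context = Signal("article_generator_context")
calls = []


def get_categories(generator, metadata):
    # Record which article the receiver was called for.
    calls.append(metadata["title"])


def register():
    article_generator_context.connect(get_categories)


register()

# The host application would fire the signal, here once per article.
for title in ("First post", "Second post"):
    article_generator_context.send("generator-placeholder", {"title": title})

print(calls)
```

If this picture is right, register() only wires things up, and the plugin functions run later, whenever the host application sends the matching signal.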
As I write down my understanding of the plugin, I’m aware that my understanding is definitely incomplete and probably wrong. I hope that as I progress I will see the mistakes in what I’ve already written.
get_categories() is called first, and it takes two arguments, generator and
metadata. The entire function is 3 lines so here it is:
    def get_categories(generator, metadata):
        categories = text_type(metadata.get('category')).split(',')
        metadata['categories'] = [Category(name, generator.settings) for name in categories]
        metadata['category'] = metadata['categories'][0]
It looks like it gets the category from the metadata for each article.
Presumably by the time this function is called the articles have already been
parsed and a
metadata object has already been created and populated with
metadata about the articles, including categories.
The first line of
get_categories() splits up the categories if multiple
categories are listed.
metadata must be a dictionary, and there must be a
metadata dict for each article, otherwise you couldn’t just get the value
associated with the dictionary key and then split the string on commas.
This means that this function is called once for each article.
I don’t know what
text_type does yet. Maybe it ensures that the output is
always a string. It’s imported from
six, which I remember seeing as a
dependency of some other packages.
.. Having checked
six, it looks like I was
right: text_type represents Unicode textual data in both Python 2 and Python 3.
Pelican was originally written in Python 2, I guess.
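To check my reading of that first line, here is a small sketch that runs the same expression on a few made-up metadata dicts. I’m using plain str in place of text_type, which I’m assuming is equivalent on Python 3, and the article data is invented.

```python
# Hypothetical metadata dicts, one per article, as a parser might produce them.
articles = [
    {"title": "One category", "category": "python"},
    {"title": "Two categories", "category": "python, blogging/pelican"},
    {"title": "No category"},
]


def split_categories(metadata):
    # str() stands in for six.text_type (assumed to be str on Python 3).
    return str(metadata.get("category")).split(",")


results = [split_categories(metadata) for metadata in articles]
for metadata, names in zip(articles, results):
    print(metadata["title"], "->", names)
```

Two things stand out in this sketch: the split keeps the space after the comma, so something later has to strip it, and an article with no category ends up with the literal string "None" because str(None) is "None".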
The next step is to write a new key-value pair to the metadata dictionary for each
article. This plugin adds functionality to Pelican by enabling categories
and not just a
category for each article. It seems clear that adding a
categories key to the metadata dict is an obvious way to do this. The value of the
categories key is a
list where each item is an instance of the
Category class. This class is instantiated using two arguments: name, which
is one of the strings from the previous line, and
generator.settings, which I don’t
understand yet.
.. Printing the contents of
generator.settings shows that it’s a dictionary of
all the settings. Easily assumed and good to confirm.
I’ll dig into the
Category class in a moment, but first let’s quickly cover
the last line of the function. The
category entry of the article’s metadata
is simply updated with the first item in the categories list
(metadata['categories'] must be a list because it can be indexed).
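Putting the three lines together, here is a sketch of what I think happens to one article’s metadata. SimpleCategory is a hypothetical stand-in for the plugin’s Category class, the settings dict is made up, and I pass the settings directly instead of a generator object.

```python
class SimpleCategory:
    """Hypothetical stand-in for the plugin's Category class."""

    def __init__(self, name, settings):
        self.name = name.strip()
        self.settings = settings

    def __repr__(self):
        return f"SimpleCategory({self.name!r})"


def get_categories(settings, metadata):
    # Same three steps as the plugin function, with str in place of
    # text_type and a plain dict in place of generator.settings.
    names = str(metadata.get("category")).split(",")
    metadata["categories"] = [SimpleCategory(n, settings) for n in names]
    metadata["category"] = metadata["categories"][0]


settings = {"SITENAME": "Example"}  # made-up settings
metadata = {"title": "A post", "category": "python, blogging/pelican"}
get_categories(settings, metadata)

print(metadata["categories"])
print(metadata["category"])
```

So after the call, the article keeps a single category (the first one listed) alongside the full categories list.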
The Category class is the only class defined by the plugin (which is only 96 lines of code). It has 6 methods, 5 of which are decorated, and it has no constants.
The decorators include @property and @_name.setter.
URLWrapper is imported from
pelican.urlwrappers and I don’t know what that
does beyond “wrapping URLs”.
Decorators are functions that take methods or functions as inputs. Using the
property decorator along with
setter decorators lets a class have a property assigned
to it whilst ensuring that arbitrary conditions or logic are upheld. If the
@property decorator is over a method called
foo, then there would need to be a
foo.setter on a method somewhere in the class.
That doesn’t seem entirely right though, because in our
Category class we have a
@property decorator over a
_name method, and also a
@_name.setter decorator over another method called
_name. But the other methods with
@property decorators (such as
ancestors) do not have any associated setter
decorators or methods.
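A quick experiment suggests the setter is optional. The toy class below has nothing to do with Pelican; it only shows one property with a setter and one without, where the one without turns out to be read-only.

```python
class Temperature:
    """Toy class, unrelated to Pelican, to test how @property behaves."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # The setter is where validation or other logic can live.
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # No fahrenheit.setter exists, so this property is read-only.
        return self._celsius * 9 / 5 + 32


t = Temperature(20)
t.celsius = 25
print(t.fahrenheit)

try:
    t.fahrenheit = 100
except AttributeError:
    read_only = True
else:
    read_only = False
print(read_only)
```

That would explain the methods that have @property but no setter: they are computed values that can be read like attributes but not assigned to.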
The setter for
_name seems to create parent categories if the string contains slashes:
    @_name.setter
    def _name(self, val):
        if '/' in val:
            parentname, val = val.rsplit('/', 1)
            self.parent = self.__class__(parentname, self.settings)
        else:
            self.parent = None
        self.shortname = val.strip()
self.parent becomes an instance of the Category class, instantiated with
parentname and self.settings. This is recursive to
however many levels of subcategories are specified.
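To convince myself about the recursion, here is a sketch that copies the setter into a stand-alone class. NestedCategory is hypothetical: it leaves out the URLWrapper base class, and both the __init__ and the _name getter are my own guesses at the minimum needed to make the setter run.

```python
class NestedCategory:
    """Hypothetical stand-in for Category, without the URLWrapper base."""

    def __init__(self, name, settings):
        self.settings = settings  # must exist before the setter runs
        self._name = name         # assignment triggers the setter below

    @property
    def _name(self):
        # Assumed getter: rebuild the full path from the parent chain.
        if self.parent:
            return self.parent._name + "/" + self.shortname
        return self.shortname

    @_name.setter
    def _name(self, val):
        if "/" in val:
            parentname, val = val.rsplit("/", 1)
            # Creating the parent runs this setter again on the shorter path.
            self.parent = self.__class__(parentname, self.settings)
        else:
            self.parent = None
        self.shortname = val.strip()


def chain(category):
    """Collect shortnames from the top-level ancestor down to the category."""
    names = []
    while category is not None:
        names.append(category.shortname)
        category = category.parent
    return names[::-1]


leaf = NestedCategory("projects/blog/pelican", settings={})
print(chain(leaf))
print(leaf._name)
```

Each rsplit peels off the last path segment, so a three-level name produces a chain of three objects linked through parent.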
The ancestors and
as_dict methods seem more confusing. ancestors is not
called or mentioned within the class definition, but it is called from the
create_categories function, which is called after the get_categories
function returns. I don’t understand why it needs an
@property decorator though.
The class inherits from
URLWrapper so that is probably the next best place to
look… Indeed, looking at the definition of
URLWrapper shows that the
as_dict method is overriding the definition in the base class.
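For my own reference, this is the general shape of overriding a method while still reusing the base class version. BaseWrapper and ChildWrapper are hypothetical classes, not Pelican’s URLWrapper or the plugin’s Category, and I haven’t confirmed that the plugin’s override calls the parent method like this.

```python
class BaseWrapper:
    """Hypothetical base class; not Pelican's URLWrapper."""

    def __init__(self, name):
        self.name = name

    def as_dict(self):
        return {"name": self.name}


class ChildWrapper(BaseWrapper):
    def __init__(self, name, shortname):
        super().__init__(name)
        self.shortname = shortname

    def as_dict(self):
        # Start from the base class result, then add the extra field.
        data = super().as_dict()
        data["shortname"] = self.shortname
        return data


wrapper = ChildWrapper("blog/pelican", "pelican")
print(wrapper.as_dict())
```

Any code that calls as_dict() on the subclass gets the extended dictionary without needing to know that the method was overridden.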
[1] I guess it’s the “curse of knowledge”.