Although I’ve used many Web-based applications that employ tagging, I’ve yet to create an application of my own with this feature. But now, I have two potential projects on the horizon that could benefit from tagging, and I’m thinking about how to best implement this, both in the database and user interface layers.
So, I thought I would explore how to implement tagging in my own applications, from scratch, and write about it on this blog. I won’t pretend I have all the answers, or even necessarily any good answers, but I will simply be trying to think through how to approach implementing tagging, and I welcome any constructive feedback on those thoughts.
I also see this as an opportunity to further explore Django, a rapid Web development framework that has captured my imagination, even though I have barely scratched the surface of its capabilities. I have also been sipping on the REST Kool-Aid recently, and would therefore also like to try my hand at implementing tagging in Django RESTfully. Finally, I want to delve more deeply into AJAX programming and learn about how to create effective Selenium tests.
This might seem like a lot of new stuff to learn and implement all at once (think: high risk of failure, or at least confusion), but I believe the problem domain I’m tackling is small and well defined, and I’ll be taking it piece by piece, in a divide and conquer strategy that should also effectively mitigate the risks. I’m thinking of exploring the implementation in several posts over the coming weeks, starting with database design and setting up the initial project, moving on to designing the controller layer RESTfully, and then on to implementing simple, then more advanced, user interface functions.
Furthermore, I’ll be starting the implementation very simply, perhaps even naively, and only adding complexity as needed, to try to really get at the essence of implementing tagging. I also hope that those who have more experience with the tools and approaches I use will let me know if I start to travel too far down the wrong road. I’ll keep the posts and source code up to date to reflect this feedback.
Ultimately, I believe that to truly understand a particular tool or design approach, you have to implement it meaningfully. Only then can you evaluate it on its own merits and compare it to alternatives. This is my goal here.
When I started learning to program applications professionally, I was taught to begin designing the database first (assuming a database was used, which it typically was). This, or object modeling, is still how I approach designing new projects, after a sufficient requirements gathering phase. The data model is the layer upon which all others are built, so weaknesses in its design tend to ripple out into the application. And in cases like this, in which I am exploring concepts that are new to me, boiling the problem domain down into simple types of data and their relationships provides a gentler starting point. You may prefer to do object modeling first instead, which essentially accomplishes the same goals.
So, let’s begin with the things that we want to tag, which might be Web sites (represented by their addresses and associated titles), images, or any other discrete thing you can imagine. I don’t want to get hung up on what these particular things are, so let’s just call these things “items”. We will also have users who are doing the actual tagging, and we want to track who is adding tags to what items so that later we can display just the tags that a particular user has entered, for example.
Then of course, there are the tags themselves. Let’s start with the obvious: items can be assigned many tags, and likewise, users can assign many tags. So, we can think of this as a many-to-many relationship between items and users that is bridged by tags:
[items] –< [tags] >– [users]
This is my attempt to render an entity-relationship diagram (ERD) with Crow’s Feet notation in ASCII, so read this as:
[“one” table] –< [“many” table]
(Let me know if this is not easily readable, and I will break down and create images of real ERDs.)
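To make this first model concrete, here is a minimal sketch in plain Python rather than SQL. The `Tag` tuple, its field names, and the sample rows are all hypothetical and only stand in for rows of the bridging table:

```python
from collections import namedtuple

# One row per tagging event: the tag text lives directly on the bridging row.
# The field names and sample values are hypothetical, for illustration only.
Tag = namedtuple("Tag", ["item_id", "user_id", "name"])

tags = [
    Tag(item_id=1, user_id=1, name="python"),
    Tag(item_id=1, user_id=2, name="python"),  # same text stored a second time
    Tag(item_id=2, user_id=1, name="django"),
]


def items_with_tag(rows, name):
    """Return the ids of items carrying the given tag text, without duplicates."""
    return sorted({row.item_id for row in rows if row.name == name})


def distinct_tag_names(rows):
    """Collect the unique tag names; the model itself does not enforce uniqueness."""
    return sorted({row.name for row in rows})
```

Note that finding items by tag means comparing tag text on every row, and the same text can be stored many times over.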
This start is OK, but to use tags effectively in our application, the tags themselves should be stored as unique values in a table so we can do things like easily see all items assigned a particular tag. To make this type of querying easier, we can normalize the database further by recognizing that a particular tag may be assigned by many users and likewise, a particular tag may be assigned to many items. So, there is really a many-to-many relationship between items and tags and a many-to-many relationship between users and tags:
[items] –< [items_tags] >– [tags] –< [users_tags] >– [users]
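As a rough sketch of this normalized model, again in plain Python with hypothetical sample data, each tag is now stored once and the two join tables hold only pairs of ids:

```python
# Each tag is stored exactly once; the ids and names are hypothetical.
tags = {1: "python", 2: "django"}

# The two join tables, as sets of id pairs.
items_tags = {(1, 1), (2, 1), (2, 2)}  # (item_id, tag_id)
users_tags = {(1, 1), (2, 2)}          # (user_id, tag_id)


def items_for_tag(tag_id):
    """All items that carry a given tag, looked up by id rather than by text."""
    return sorted(item_id for item_id, t in items_tags if t == tag_id)


def tags_for_user(user_id):
    """Names of all tags a given user has used."""
    return sorted(tags[t] for u, t in users_tags if u == user_id)
```

Both lookups are simple filters on a single join table, which is the kind of querying this normalization is meant to make easy.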
Now, (I think) I’m on to something. But it’s still not quite right, because in this model, I have broken the association between users and items, so that, for example, there is no real way to tell when a particular user assigned a particular tag to a particular item. In addition to knowing when an item was tagged by a user, there may be other metadata I want to capture around this event. So, a further refinement might look like this:
                [tags]
                  |
                  ^
[items] –< [tag_assignment] >– [users]
This looks more straightforward.
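Before moving to Django, this refined model can be tried out with Python’s built-in sqlite3 module. This is only a sketch under my own assumptions: the column names, the `username` column, and the sample rows are hypothetical, and the real project uses MySQL through Django rather than hand-written SQL.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, item_name TEXT NOT NULL);
    CREATE TABLE tags  (id INTEGER PRIMARY KEY, tag_name TEXT NOT NULL UNIQUE);
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    -- One row per tagging event: who tagged what, with which tag, and when.
    CREATE TABLE tag_assignment (
        id          INTEGER PRIMARY KEY,
        item_id     INTEGER NOT NULL REFERENCES items(id),
        tag_id      INTEGER NOT NULL REFERENCES tags(id),
        user_id     INTEGER NOT NULL REFERENCES users(id),
        assigned_on TEXT NOT NULL
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO items (id, item_name) VALUES (?, ?)",
                 [(1, "Django home page"), (2, "Python home page")])
conn.executemany("INSERT INTO tags (id, tag_name) VALUES (?, ?)",
                 [(1, "python"), (2, "framework")])
conn.executemany("INSERT INTO users (id, username) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO tag_assignment (item_id, tag_id, user_id, assigned_on) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "2007-06-01 10:00:00"),
     (1, 2, 1, "2007-06-01 10:05:00"),
     (2, 1, 2, "2007-06-02 09:00:00")])


def assignments_by_user(conn, username):
    """Every (item, tag, timestamp) triple recorded for one user, oldest first."""
    return conn.execute(
        "SELECT i.item_name, t.tag_name, a.assigned_on "
        "FROM tag_assignment a "
        "JOIN items i ON i.id = a.item_id "
        "JOIN tags t ON t.id = a.tag_id "
        "JOIN users u ON u.id = a.user_id "
        "WHERE u.username = ? ORDER BY a.assigned_on",
        (username,)).fetchall()


def items_with_tag(conn, tag_name):
    """Names of all items that anyone has assigned the given tag."""
    rows = conn.execute(
        "SELECT DISTINCT i.item_name "
        "FROM tag_assignment a "
        "JOIN items i ON i.id = a.item_id "
        "JOIN tags t ON t.id = a.tag_id "
        "WHERE t.tag_name = ? ORDER BY i.item_name",
        (tag_name,)).fetchall()
    return [row[0] for row in rows]
```

Because each `tag_assignment` row ties one user, one item, and one tag together, the “who tagged what, and when” question becomes a single join, and other metadata about the event could be added as further columns on that table.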
I probably haven’t blown anyone’s mind with this, and I definitely won’t with the next section either, when I’ll simply set the stage for continuing this exploration in a Django project. Hopefully, by the end, we’ll have a solid foundation to build upon in later posts.
let’s get this party started
After designing the data model layer, I feel like I have enough basic understanding of the problem space to plunge into Django. At this point, I’ll assume you know Python, have some basic familiarity with Django (take a few minutes and look at the Django tutorial) and have Django installed as well as a locally running MySQL server instance. Basically, this section will be a condensed version of part one of the Django tutorial, customized for this particular project.
Within Eclipse, I created a new Python project through PyDev called tagging_exploration, and on the command line, I created a Django project within that directory, also called tagging_exploration for simplicity:
django-admin.py startproject tagging_exploration
Don’t get too hung up on the two project names. The Eclipse project name simply allows Eclipse to manage all our code, while the Django project is Django’s way of organizing closely related applications. In this case, we’ll start with only one application in our project, called “tagging” which will be our focus. So, on the command line, go into the Django “tagging_exploration” project and run this:
python manage.py startapp tagging
This creates an application directory called “tagging” where we’ll be coding. To let Django know about this application, modify the settings.py file in the tagging_exploration Django project directory so it looks like this:
INSTALLED_APPS = (
    'tagging_exploration.tagging',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.admin',
)
Essentially, I added our application first in the list. The next four applications are there by default, and I added the famed django.contrib.admin application at the end, simply because it’s so useful, and eventually, we can use it to verify that our basic setup is working. And since it comes with Django, we get it for free.
Now, within the tagging application directory is a models.py module within which we can begin to define our data model:
from django.db import models
from django.contrib.auth.models import User

# Create your models here.

class Tags(models.Model):
    tag_name = models.CharField(maxlength=255)

class Items(models.Model):
    item_name = models.CharField(maxlength=255)
    added_on = models.DateTimeField(core=True)

class Users(models.Model):
    user = models.OneToOneField(User)

class Tag_Assignment(models.Model):
    assigned_on = models.DateTimeField(core=True)
    item = models.ForeignKey(Items)
    tag = models.ForeignKey(Tags)
    user = models.ForeignKey(Users)

This is about as bare as a Django model can be, but it’s a good starting point. With this, we can let Django create the underlying database for us, and later on, Django will also manage the object relational mapping for us in the application code using these classes as a guide. Consult the Django model documentation for a complete explanation of what I have done here as well as instructions for customizing how the admin interface functions.
Otherwise, all I’ve done is try to faithfully implement the ERD above. Again, the “items” can be anything for the purposes of this demonstration, so I simply gave them a name field. The only thing that may look odd is how the Users class is defined: I am importing and extending the User model from the bundled django.contrib.auth to automatically gain access to that application’s functionality within our tagging application. For a more detailed explanation of this approach read “Django tips: extending the User model” from the B-List.
Moving on, we’ll need to create an initially empty database, also called tagging_exploration, along with appropriate user privileges to allow Django access. So, on the command line, connect to your MySQL instance as root and run this:
use mysql;
drop database if exists tagging_exploration;
create database tagging_exploration;
GRANT ALL ON tagging_exploration.* TO 'tag_user'@'localhost' IDENTIFIED BY 'tag_pass';
exit;
I granted “all” permissions because Django will be creating fields and indexes in tables among other things, and therefore needs fairly liberal privileges. To inform Django about this database, open the settings.py file in the tagging_exploration Django project, and modify the following lines accordingly:
DATABASE_ENGINE = 'mysql'
DATABASE_NAME = 'tagging_exploration'
DATABASE_USER = 'tag_user'
DATABASE_PASSWORD = 'tag_pass'
Also, add this line to settings.py to inform Django about our custom Users class:
AUTH_PROFILE_MODULE = 'tagging.Users'
We can now create the database tables for our application by running the following from the tagging_exploration Django project directory:
python manage.py syncdb
It will ask you to create a superuser account for the django.contrib.auth application. Provide a username and password and remember it. You’ll need this information to access the administrative interface.
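If you want to double-check the result in MySQL, it helps to know which table names to look for. The sketch below assumes Django’s default naming convention of the application label, an underscore, and the lowercased model class name; the helper function is my own illustration, not part of Django, so treat the output as a guess to compare against what `show tables;` actually reports.

```python
def default_table_name(app_label, model_class_name):
    """Build '<app label>_<lowercased class name>', the assumed default table name."""
    return "%s_%s" % (app_label, model_class_name.lower())


# The four model classes defined in models.py above.
MODEL_NAMES = ["Tags", "Items", "Users", "Tag_Assignment"]

expected_tables = [default_table_name("tagging", name) for name in MODEL_NAMES]
```

The bundled applications (auth, sessions, and so on) create their own tables as well, so expect to see more than these four.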
This is a good stopping point for this post, but it would be nice to actually see something for our effort. Let’s peek into the Django admin interface and kick the tires on our data model. First, open the urls.py module in the Django project directory and uncomment the following line of code as instructed:
# Uncomment this for admin:
(r'^admin/', include('django.contrib.admin.urls')),
Let’s also elaborate our model code to give the admin application hints about how we want it to handle our models. For example, to tell Django that we want to use the admin interface for a particular class, simply add an empty “Admin” inner class to an existing model class, like this:
class Tags(models.Model):
    tag_name = models.CharField(maxlength=255)

    class Admin:
        pass

This is enough to provide the default functionality in the admin interface for Tags. But options can also be added, as explained in Django’s model documentation, to further control how the admin interface functions. Here is a more interesting models.py for the admin interface that will expose Tags and Items:
from django.db import models
from django.contrib.auth.models import User

# Create your models here.

class Tags(models.Model):
    tag_name = models.CharField(maxlength=255)

    def __str__(self):
        return self.tag_name

    class Admin:
        list_display = ('tag_name',)
        search_fields = ['tag_name']

    class Meta:
        verbose_name_plural = 'Tags'

class Items(models.Model):
    item_name = models.CharField(maxlength=255)
    added_on = models.DateTimeField(core=True)

    def __str__(self):
        return self.item_name

    class Admin:
        list_display = ('item_name', 'added_on')
        search_fields = ['item_name']

    class Meta:
        verbose_name_plural = 'Items'
        ordering = ['item_name']

class Users(models.Model):
    user = models.OneToOneField(User)

class Tag_Assignment(models.Model):
    assigned_on = models.DateTimeField(core=True, auto_now_add=True)
    item = models.ForeignKey(Items)
    tag = models.ForeignKey(Tags)
    user = models.ForeignKey(Users)

Now, from the project directory, run this from the command line:
python manage.py runserver
By default, this will launch a standalone Django development server on port 8000, but you can add a custom port number to the end of the command to run it elsewhere. Otherwise, open your browser to http://127.0.0.1:8000/admin, log in with the superuser account you created above and begin poking around the data model we created.
In the next post, I am going to begin implementing some simple use cases, trying to design URLs in Django that are RESTful, and probably at least exploring some simple generic views.
- The source code for this post.
- Wikipedia: Entity-relationship diagrams
- Django installation instructions
- Django tutorial
- Django model/admin customization documentation
- Django tips: extending the User model from the B-List