Although it may not look like it at first, this is intended to be the second in a series about implementing a tagging application in Django. Part of my motivation was to learn more about Django and to become more familiar with technologies like Selenium, and architectural approaches, like REST. In the first installment, I explored how to design the data model and set up the initial Django project accordingly. But now I feel that before I can really dive into coding the application, I have to step back and think about how I might do this RESTfully. So, this post will be mostly concerned with trying to understand how to design a RESTful Web application, and more specifically, how to do so within Django. I will still be doing this within the context of the tagging application, so you may want to familiarize yourself with the first installment in the series:
- Implementing Tagging in a Django Application
REST again
If you’ve read my blog in the past, and you probably haven’t, you’ll know that I have been trying earnestly to understand how to effectively apply REST, both in Web applications and Web Services. I also believe that you can’t claim to know something, and judge it fairly, until you have used it in a significant project. That is what I intend to begin in this post.
I started doing Web programming in its early infancy, eventually taking up Java Servlet based development before Servlet containers were standardized. In those days, the common (best?) practice emerged to use some kind of action based controller approach, complete with URLs ending in “.do” and often funneled through common dispatching Servlets. This was the common approach used in Struts applications, for example.
This seemed to make sense too. We were trying to impose object oriented semantics on the Web, and when someone made a request to the server, they were trying to do something, weren’t they? If I have a login form, I need something to process it: a login action. But today, a lot of smart people are talking about REST, and I’m trying to listen and learn. It’s really amazing to me that in 2007, after years of developing Web applications, many of us are re-examining the most fundamental aspects of how to design those applications well.
Unfortunately, learning to apply REST to Web application development is not as easy as it would seem. To my way of thinking, a primary advantage of REST is that Web applications may be structured more logically and become more clearly documented. But all those smart REST folks seem to focus almost exclusively on positioning REST as a better alternative to SOAP based Web Services and not focus nearly as much on helping us understand how to design our human facing Web applications more RESTfully. There’s a lot of theory being talked about, which is important too, to help us understand the philosophical differences between REST and action based architectures, but there is not nearly as much practical advice available on how to implement a RESTful Web application.
The bottom line is that it’s impossible to implement pure REST on the Web today because widely available browsers don’t support PUT and DELETE in Web forms (although you could implement these through an AJAX approach), and there is no single workaround for this situation that seems to be unanimously supported in the REST community. But I think there is still value in moving toward a RESTful approach, and I also think there are some common threads in the discussions taking place that can be teased out into a coherent approach.
First, of the two HTTP methods that browser forms support, GET and POST, REST proponents give GET special consideration. GET is fundamentally different from the other three methods (POST, PUT and DELETE), which have in common that they change the state of resources on the server. So GET is reserved solely for read-only requests (“safe” in HTTP terminology; “idempotent” is a related but different property, meaning that repeating a request has the same effect as making it once), and POST is “overloaded” to handle everything else. Given the current state of Web browser support for REST, this seems like a reasonable and logical compromise.
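To make the difference between the four methods a little more concrete, here is a minimal sketch that uses an in-memory dictionary as a stand-in for server state. The `store` dictionary and the four function names are my own inventions for illustration; no framework is involved, and real resources would of course live in a database.

```python
import itertools

# Hypothetical in-memory "server state"; not part of the tagging application.
store = {}
_ids = itertools.count(1)

def get(key):
    """Safe: reads state and never changes it."""
    return store.get(key)

def put(key, value):
    """Idempotent: repeating the same call leaves the same state."""
    store[key] = value

def delete(key):
    """Idempotent: deleting twice has the same effect as deleting once."""
    store.pop(key, None)

def post(value):
    """Neither safe nor idempotent: every call creates a new resource."""
    key = next(_ids)
    store[key] = value
    return key
```

Calling `put` or `delete` twice with the same arguments leaves the store exactly as one call would, while two identical `post` calls create two entries. That is the property that makes a double-submitted POST form riskier than a repeated GET.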
But then how do you overload POST exactly? An approach that can be seen on sites like del.icio.us, is to add the appropriate action to the URL. So URLs for Users might look something like this:
    /users/
    /users/add
    /user/<username>
    /user/<username>/update
    /user/<username>/delete
Because POST and PUT are somewhat vague and ambiguous terms, I opted for simpler language like “create” and “update” respectively, and taking a hint from del.icio.us, I ultimately used “add,” instead of “create,” which is simpler still.
This basic approach communicates intention clearly, especially in a framework like Django which so beautifully and succinctly designs URL templates and maps them to the appropriate controllers. One can imagine that just by peeking at the urls.py module in a Django application, someone could get a bird’s eye view of the intended behavior of the entire application. In other words: instant documentation.
But a fundamental tenet of REST is that URLs are supposed to point to distinct resources, not actions, and so it seems that the generally accepted alternative is to add an extra hidden field to your form that carries the method information, and on the server, we return to action dispatching, although now it’s only for three actions instead of an arbitrary number. While this would seem to go against my notions of instant documentation and instead hide the method information, the fact is that if REST were fully implemented on the Web today, the method information would be even more hidden to the casual observer. As long as you understand REST’s resource orientation, the structure of the application should still be understandable. So this approach does seem to genuinely move us closer to RESTful Web application development.
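As a rough sketch of the hidden field idea, the server can work out the intended method from the real HTTP method plus the submitted form fields. The function name `effective_method` is hypothetical, and the field name `_action` is simply the one I use later in this post; other people use other names. The form fields are modelled as a plain dictionary so the sketch runs without any framework.

```python
# Methods that a POST is allowed to stand in for.
ALLOWED_OVERRIDES = {"PUT", "DELETE"}

def effective_method(http_method, form_fields, field_name="_action"):
    """Return the method the client intended.

    Only POST may be overloaded; a GET always stays a GET, so a link
    can never trigger an update or a delete.
    """
    method = http_method.upper()
    if method != "POST":
        return method
    override = form_fields.get(field_name, "").upper()
    if override in ALLOWED_OVERRIDES:
        return override
    return "POST"
```

For example, `effective_method("POST", {"_action": "delete"})` yields `"DELETE"`, while an unknown or missing `_action` value falls back to a plain `"POST"`.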
Another component of the REST approach that is relevant to this discussion is that REST applications are supposed to have “opaque” URLs. There seems to be quite a bit of disagreement among REST proponents about how important this is and what exactly this means, but the explanation that makes the most sense to me is the one that says that REST URLs should be logical in nature, not implementation specific, which I take to mean that URLs shouldn’t be driven by or indicative of the underlying server-side technology. So, this:
/users.php
…should just be:
/user/
That makes sense, the URLs are certainly more robust that way. PHP may not be the best example of this, but most server environments I have used allow you to flexibly construct URLs, and of these, Django may be the most flexible.
Furthermore, because our ultimate database primary keys shouldn’t have any business meaning, these make for good opaque URLs, such as for getting a particular “Item”:
/item/12345
Although, it’s also reasonable to make an exception to this approach for things like user accounts. Username fields should be unique, and using them instead of primary keys is more user friendly:
/user/<username>
And this exception makes sense for tags too, as tag names are also unique in our design. So, in general, I tend to like Bill Venners’ idea that human facing URLs should be human friendly, so I would opt for using unique names instead of primary keys if they exist, as others like del.icio.us have done.
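Here is a small sketch of how the two URL styles might be told apart, using only Python’s `re` module rather than Django’s URL machinery. The exact patterns, in particular the characters I allow in a username, are assumptions made for the example.

```python
import re

# Opaque numeric key for items, human friendly unique name for users.
ITEM_URL = re.compile(r"^/item/(?P<id>\d+)/?$")
USER_URL = re.compile(r"^/user/(?P<username>[\w.-]+)/?$")

def parse(path):
    """Return (resource, key) for a recognized path, or None."""
    match = ITEM_URL.match(path)
    if match:
        return ("item", int(match.group("id")))
    match = USER_URL.match(path)
    if match:
        return ("user", match.group("username"))
    return None
```

With these patterns `/item/12345` resolves to an item by primary key and `/user/alice` resolves to a user by name, while an implementation specific path like `/users.php` matches nothing.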
So, URLs should point to resources, and resources are nouns (as opposed to actions which are verbs). So what are valid resources? Looking around for nouns, here are some examples I can think of:
- domain objects
- attributes of domain objects
- forms used to add and edit information
- representations of domain objects, including machine-readable formats like RSS
The former two are really all about our model layer, while the latter two are really about our view layer. In this way, I think I may be starting to understand something that REST advocates say that I’ve had a hard time accepting in the past: that there is essentially no difference between Web applications and Web Services in a RESTful approach. This seems counter-intuitive because humans and computers think and act quite differently. But with the constraint of four available CRUD operations in our controller layer that can be performed on our domain objects, what actions can be done on what objects in our application is simplified and standardized in a way that makes our application immediately more generalizable than an action based approach.
Then, if I understand this proposed benefit of REST, the particular representation/view presented to the “user” might change to fit the needs of that type of user: standard HTML for a browser to parse and present to a human, and possibly a microformat for some faceless software to parse and use without human intervention.
With this as my basis for continuing, I eventually found a blog post by Charlie Savage that up to this point, is the fullest and clearest explanation I have found for how one might structure a Web application RESTfully. (That is, right up to the point where he seems to suggest putting “action=put” in the URL of a GET request.) Charlie used RoR as his backdrop, but the following table from his blog was a revelation to me, and I think it captures the essence of how to design a RESTful controller layer:
    Resource    GET     POST    PUT     DELETE
    Items       list    create  -       -
    each Item   show    -       update  destroy
    creator     new     -       -       -
    editor      edit    -       -       -
Where his example talked of “products” I substituted “Items” to continue our intentionally generic example. He even had the foresight to call forms by names like “creator” instead of something like “addItemForm.” This again makes the URL much nicer looking by being more logically oriented than implementation oriented, much more deserving of being called a resource. And this is definitely much better than other RoR examples I have seen in which, for example, the URL for an edit form for an item looks like: http://mysite.com/item/1;edit . Here “edit” is supposed to refer to a form, but it reads like an action. (Although I have next to no knowledge of RoR, and maybe this actually makes perfect sense to RoR developers, or I may have misunderstood the example.)
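One way I find it helpful to read the table is as a lookup from (resource, method) to an action name. The sketch below is only my own restatement of the table as a Python dictionary; the names `ACTIONS` and `action_for` are hypothetical and do not come from Charlie’s post.

```python
# The table above as data: resource -> HTTP method -> action name.
ACTIONS = {
    "Items": {"GET": "list", "POST": "create"},
    "each Item": {"GET": "show", "PUT": "update", "DELETE": "destroy"},
    "creator": {"GET": "new"},
    "editor": {"GET": "edit"},
}

def action_for(resource, method):
    """Return the action name, or None when the table has no entry."""
    return ACTIONS.get(resource, {}).get(method.upper())
```

The empty cells of the table become lookups that return `None`, which is where a real application would answer with an error instead of calling a handler.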
In general, there seems to be a high correlation between those who advocate for REST and those who develop in RoR, and so, it seems there is a wealth of REST information to be found on the blogs of RoR developers. Although Django is my framework of choice, both because I already know Python and because I really like what I see in Django, I think this just shows once again how much we can learn from people from other backgrounds.
In his post, Charlie also shows us how to implement the four REST actions in the controller layer in a way that is, IMHO, quite elegant. I encourage you to read it before moving on. Now, how would we do in Python what he did in Ruby? How would we integrate it into Django?
let’s start the insanity
First, let’s add some phony data to our database, so we will have something to look at. Nothing exciting, just fire up your MySQL command line client, use tagging_exploration, our database, and add this:
    insert into tagging_items (id, item_name) values (1, 'First Item');
    insert into tagging_items (id, item_name) values (2, 'Second Item');
    insert into tagging_items (id, item_name) values (3, 'Third Item');
    insert into tagging_items (id, item_name) values (4, 'Fourth Item');
    insert into tagging_items (id, item_name) values (5, 'Fifth Item');
For future reference, we can store this inside our “tagging” application directory, within an “sql” directory, in a file called items.sql. This is a special location and naming scheme based on the model class name that tells Django to run this file and add this initial data whenever the database is regenerated.
Now let’s see if we can display this data in the browser. Keep in mind that in Django’s terminology the “views” correspond to controllers in MVC terminology and “templates” correspond to views in MVC. By default, views are stored in a single views.py module, but to follow Charlie Savage’s approach, let’s begin by organizing the code differently: create a “views” directory, with an empty __init__.py file inside so Python knows it’s a package. Within this views directory, we will have a separate module for each resource that will define our REST methods for that resource. So, for example, within the “views” directory, we might have an Items.py module that contains the appropriate implementations for a collection of Items: GET/list and POST/create. Also within this module, we can define an Item class that has appropriate method implementations for member Items, as well as a class for the creator resource and an inner class for the editor resource. So, the general structure of views/Items.py would look like this:

    def GET(request):
        pass

    def POST(request):
        pass

    class creator:
        def GET(self, request):
            pass

    class Item:
        def GET(self, request):
            pass

        def PUT(self, request):
            pass

        def DELETE(self, request):
            pass

        class editor:
            def GET(self, request):
                pass
These methods and classes organize the code logically, but keep in mind that this code represents wild speculation at this point; time will tell if this is actually a good idea or not.
Let’s begin with something easy that should work. Django is great at providing shortcuts for the tedious, repetitive work that goes into typical Web applications, and a striking example of this is generic views which fall into the following categories:
- “simple” views that redirect requests or display templates
- date based views like you might use to display blog posts or store sales
- list/detail views with automatic pagination
- create/update/delete views for domain objects
But for now, let’s just take a look at our list of Items. To use generic views at their simplest, you actually don’t need to create code in your views; instead, you can direct URL patterns to generic views completely through entries in urls.py. Here, I am going to reference the generic views with view code anyway, for several reasons: I think it makes the example more explicit; in real applications we may want to further extend and customize generic views anyway, which would be done through view code; and it makes the way our views are invoked more consistent. For more information about extending generic views, look at this blog post by James Bennett.
To begin, at the top of Items.py, import our Items model and the object_list generic view:
    from django.views.generic.list_detail import object_list
    from tagging_exploration.tagging.models import Items

In the module level GET function, fetch all the Items objects and pass them to the object_list generic view:

    def GET(request):
        return object_list(request, Items.objects.all())
In urls.py, add a RESTful looking URL that invokes this view:
    urlpatterns = patterns('',
        (r'^Items/$', 'tagging_exploration.tagging.views.Items.GET'),
        (r'^admin/', include('django.contrib.admin.urls')),
    )
Now we need to supply the object_list template itself that will format the response to this URL. Within your Django application directory, create a “templates” directory to hold the application templates, and within that, create a “tagging” directory where Django will expect the generic view templates to live. To let Django know that the “templates” directory should be searched for templates within this application, we need to add a TEMPLATE_DIRS list to the Django application’s settings.py file with the absolute path to the templates directory. Rather than hard-coding this absolute path, I prefer to let Python build it dynamically, so I don’t have to change it when I move the application around:
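    import os

    # Directory containing settings.py, made absolute so TEMPLATE_DIRS
    # does not depend on the current working directory.
    dirname = os.path.dirname(os.path.abspath(__file__))

    TEMPLATE_DIRS = (
        os.path.join(dirname, 'tagging/templates'),
    )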
Now, within the “tagging_exploration/tagging/templates/tagging” directory, we can create a very simple template stored in a file called “items_list.html”. By default this file name is a convention based on the model being displayed by the generic view, although you can override this in the view code. Also note that although our model is called “Items” with an upper case “I,” the template name derived from that model is in lower case. Here is the template:
    <html>
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    </head>
    <body>
      <h1>Items</h1>
      {% if object_list %}
      <ul>
        {% for item in object_list %}
        <li>{{ item.item_name }}</li>
        {% endfor %}
      </ul>
      {% else %}
      <p>No items available.</p>
      {% endif %}
    </body>
    </html>
The results of querying the database are automatically passed along to the template by the generic view in the “object_list” variable.
Start up your local Django development server by running this inside your Django project directory:
python manage.py runserver
Then in your browser hit: http://localhost:8000/Items/. You should see a simple list of item names that we previously added to the database.
Now, let’s get trickier. The problem we still have to overcome is that matching an incoming request to a URL pattern in urls.py does not indicate whether the request is a GET or a POST. This isn’t a problem for the creator or editor resources, because as forms, they will have separate URL patterns defined for them and they will always only respond to a GET request, but this problem still needs to be dealt with for our other resources, in which a common matching URL pattern will be used for multiple method invocations.
The answer I came up with is a dispatcher method in the Items.py view that I eventually refactored out into its own module so it can be reused more easily in other view modules. So, in the views directory, I ended up with this dispatcher.py file:
    from django.http import Http404

    class dispatcher:
        def __init__(self, GET, POST, member_class):
            # Module level handlers for the collection resource
            self.GET = GET
            self.POST = POST
            # Class whose instances handle individual member resources
            self.member_class = member_class

        def dispatch(self, request, id=None):
            if id:
                # A member resource, e.g. /Items/3/
                member_item = self.member_class()
                if request.method == 'GET':
                    return member_item.GET(request, id)
                else:
                    # PUT and DELETE arrive as a POST carrying a hidden
                    # "_action" field; read it from the POST data only.
                    action = request.POST.get('_action', '').lower()
                    if action == 'put':
                        return member_item.PUT(request, id)
                    elif action == 'delete':
                        return member_item.DELETE(request, id)
            else:
                # The collection resource, e.g. /Items/
                if request.method == 'GET':
                    return self.GET(request)
                else:
                    return self.POST(request)
            # Nothing matched, e.g. a member POST without a known _action
            raise Http404

So, basically, I am just letting the dispatcher know about the module level GET and POST functions with the first two arguments, and then the member class is referenced with the third argument, and it is expected to implement GET, PUT and DELETE methods for individual Item objects. As a result, all RESTful actions in Items.py are accessible to the dispatcher. Beyond that, it’s just passing along the request objects to the appropriate functions and handing back the response objects returned from those functions.
Here is how the dispatcher is used by my slightly more filled out Items.py view code:
    from django.views.generic.list_detail import object_list
    from django.views.generic.create_update import create_object, delete_object
    from tagging_exploration.tagging.models import Items
    from django.http import HttpResponseNotFound

    def GET(request):
        return object_list(request, queryset=Items.objects.all())

    def POST(request):
        # to help with initial debugging...
        return HttpResponseNotFound('POST: Items')

    class Creator:
        def GET(self, request):
            # to help with initial debugging...
            return HttpResponseNotFound('GET: Item creator')

    class Item:
        def GET(self, request, id):
            # to help with initial debugging...
            return HttpResponseNotFound('GET: item')

        def PUT(self, request, id):
            # to help with initial debugging...
            return HttpResponseNotFound('PUT: item')

        def DELETE(self, request, id):
            return delete_object(request, model=Items, object_id=id,
                                 post_delete_redirect='/Items/')

        class Editor:
            def GET(self, request, id):
                # to help with initial debugging...
                return HttpResponseNotFound('GET: Item editor')

    # these functions work, but is there a way to write urls.py so I don't have to do this?
    def creator(request):
        return Creator().GET(request)

    def editor(request, id):
        return Item().Editor().GET(request, id)

    from dispatcher import dispatcher
    dispatcher = dispatcher(GET, POST, Item)

    def dispatch(request, id=None):
        return dispatcher.dispatch(request, id)
To test that not only the listing of items works, but that functions like DELETE() will work properly, I changed the items_list.html template to include an appropriate delete capability for each item:
    <html>
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    </head>
    <body>
      <h1>Items</h1>
      {% if object_list %}
      <ul>
        {% for item in object_list %}
        <li>
          {{ item.item_name }}
          <form method="POST" action="/Items/{{ item.id }}/">
            <input type="submit" name="_action" value="delete" />
          </form>
        </li>
        {% endfor %}
      </ul>
      {% else %}
      <p>No items available.</p>
      {% endif %}
    </body>
    </html>
And I updated urls.py with the following patterns:
    urlpatterns = patterns('',
        (r'^Items/$', 'tagging_exploration.tagging.views.Items.dispatch'),
        (r'^Items/creator/$', 'tagging_exploration.tagging.views.Items.creator'),
        (r'^Items/(?P<id>\d+)/$', 'tagging_exploration.tagging.views.Items.dispatch'),
        (r'^Items/(?P<id>\d+)/editor/$', 'tagging_exploration.tagging.views.Items.editor'),
    )
Now, if you go to http://localhost:8000/Items/ you will be able to easily delete items in turn. I also added some confirmation text to the other methods so those URLs can also be accessed to ensure they are working.
There are some pretty obvious shortcomings in this code that not only show my limited understanding of Django, but also how rusty I have gotten in Python. I really should “look before I leap” with the method calls in dispatch and return a 404 or other appropriate error if the methods aren’t found, so there will be no risk of users seeing Python AttributeError exceptions. Also, the way I defined the creator, editor and dispatch functions in the module to point to their respective implementations is very kludgy, but I couldn’t seem to figure out how to get the corresponding patterns in urls.py to point to them correctly otherwise.
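For what it’s worth, here is a minimal sketch of what “looking before leaping” might look like, written without Django so it can be run on its own. The `MethodNotAllowed` exception and the `call_handler` function are hypothetical names of mine; in a real Django view the exception would presumably be translated into a 404 or a “method not allowed” response.

```python
ALLOWED_METHODS = ("GET", "POST", "PUT", "DELETE")

class MethodNotAllowed(Exception):
    """Stand-in for an error response; hypothetical, not a Django class."""

def call_handler(target, method, *args):
    """Call target.<METHOD>(*args) only if such a handler really exists."""
    name = method.upper()
    # Restrict lookups to the known REST methods so a request can never
    # reach arbitrary attributes of the handler object.
    if name not in ALLOWED_METHODS:
        raise MethodNotAllowed(method)
    handler = getattr(target, name, None)
    if not callable(handler):
        raise MethodNotAllowed(method)
    return handler(*args)

class Item:
    # Example handler object that only implements GET.
    def GET(self, request, id):
        return "GET: item %s" % id
```

Here `call_handler(Item(), "get", None, 7)` returns the GET response, while asking the same object for PUT raises `MethodNotAllowed` instead of leaking an attribute error to the user.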
I did make one final customization to urls.py to generalize it. Assuming that our domain objects are the most obvious candidates for resources to interact with in our applications, I decided to build the URL patterns dynamically so that the patterns only need to be specified once for all models. The final urls.py looks like this:
UPDATE: Thanks to David Larlet from biologeek.com for his comment below suggesting the use of a view prefix to simplify the code.

    from django.conf.urls.defaults import *

    urlpatterns = patterns('',
        (r'^admin/', include('django.contrib.admin.urls')),
    )

    from django.core.management import _get_table_list, _get_installed_models

    models = _get_installed_models(_get_table_list())
    for model in models:
        urlpatterns += patterns('tagging_exploration.tagging.views.' + model.__name__,
            (r'^' + model.__name__ + '/$', 'dispatch'),
            (r'^' + model.__name__ + '/creator/$', 'creator'),
            (r'^' + model.__name__ + '/(?P<id>\d+)/$', 'dispatch'),
            (r'^' + model.__name__ + '/(?P<id>\d+)/editor/$', 'editor'),
        )
Assuming that you follow the convention established above for Items.py for each of your model classes, this should work.
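To check that the generated patterns behave the way I expect, the same four regular expressions can be built and matched with nothing but Python’s `re` module. This is only a sketch of the idea, not Django’s resolver: `resource_patterns` and `resolve` are hypothetical helpers, and “Tags” is just an example model name.

```python
import re

def resource_patterns(model_name):
    """Build the four (regex, view name) pairs used for one model."""
    base = r"^" + re.escape(model_name)
    return [
        (base + r"/$", "dispatch"),
        (base + r"/creator/$", "creator"),
        (base + r"/(?P<id>\d+)/$", "dispatch"),
        (base + r"/(?P<id>\d+)/editor/$", "editor"),
    ]

def resolve(path, model_names):
    """Return (model, view, captured groups) for the first match, or None."""
    for name in model_names:
        for pattern, view in resource_patterns(name):
            match = re.match(pattern, path)
            if match:
                return name, view, match.groupdict()
    return None
```

As in urls.py, paths are matched without their leading slash, so `resolve("Items/3/editor/", ["Items"])` finds the editor view with `id` captured as the string `"3"`.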
final thoughts
So ends another long winded and largely theoretical post. I hope there has been enough content to provoke some thought and possibly even provide some guidance. As someone new to both REST and Django, this post is essentially me thinking out loud about how I might successfully marry the two in my tagging_exploration application. I’m sure I’ve gotten plenty wrong and welcome any suggestions readers might care to provide.
Although I feel I’m making progress in my understanding of the REST philosophy and how to apply it, there are things which I still struggle with. Although this approach to designing a RESTful controller layer looks right when dealing with simple model interactions, this is only a start, as there are many situations in a real Web application that just aren’t this simple.
In the action oriented Web development world, we’ve long realized that typically there is what you might call an “impedance mismatch” between domain objects and forms: a single page form may represent attributes from more than one domain object, and likewise, multi-page forms may correspond to one (or more than one) domain object (think “wizard” style forms). For example, if I am “tagging” an item, I will probably be presented with a page of largely Item information and a form with a field in which I can enter multiple comma delimited tags (like del.icio.us) as well as a hidden form field that identifies what User I am. Now, when I submit this form, am I updating an Item, adding new Tags, or updating the relationship between an existing Item, an existing User (me) and N number of existing Tags, that is, adding new Tag_Assignments? To the end user, it might look like I am updating an Item, when in fact, I am likely simultaneously adding multiple new tags and adding multiple new Tag_Assignments between an existing Item, an existing User and multiple existing Tags. In a RESTful approach, how does this scenario map to the four REST actions and the available resources? Clearly, there’s a lot I still don’t get.
Or maybe the scenario itself is wrong. Maybe we need to rethink how we structure user interfaces in a RESTful world, in which case, will users be able to make sense of whatever we come up with in the end? Maybe this is a great place for AJAX to help us out, and later on in this series, I will try to explore that possibility. Maybe the form I am talking about above shouldn’t be doing one large form submission, but many small, AJAX interactions as each update is made that implements distinct RESTful actions appropriate to the Web while providing a desktop GUI oriented user experience. This would seem to directly address at least some of the issues raised in this thought provoking blog post about the appropriateness of applying the MVC pattern to Web application development.
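To think this through a bit, here is a speculative sketch of how one tagging form submission might be broken into several small requests, each a single action on a single resource. Everything here is an assumption on my part: the function name `plan_tagging`, the `/Tags/` and `/Tag_Assignments/` URLs, and the field names are made up for illustration, and this is only one possible reading of the scenario.

```python
def plan_tagging(item_id, username, raw_tags, known_tags):
    """Split one form submission into small (method, url, data) requests."""
    requests = []
    seen = set()
    for part in raw_tags.split(","):
        name = part.strip()
        if not name or name in seen:
            continue  # skip blanks and duplicates in the comma delimited field
        seen.add(name)
        if name not in known_tags:
            # A tag nobody has used yet: create the Tag resource first.
            requests.append(("POST", "/Tags/", {"tag_name": name}))
        # Then record who applied which tag to which item.
        requests.append(("POST", "/Tag_Assignments/",
                         {"item": item_id, "user": username, "tag": name}))
    return requests
```

Seen this way, the form never updates the Item at all; it creates Tags where needed and creates Tag_Assignments. Whether that decomposition is actually the RESTful answer is exactly the part I am unsure about.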
Or, consider this discussion from Peter Williams’ blog about how to handle a POST that fails validation in the section “Handling Bad Data”:
The fundamental problem here is that the separate editor resource will PUT the modified resource when you click save/submit. But what if you messed it up and, say, violated the business rule that blue products must have a price that is divisible by three? In a normal Rails app that proposed change would fail validation and the update action would just re-render the edit page with bad fields highlighted. But in a RESTful world the editor and the validation code are separate and it is wrong from REST stand point to just render the editor resource from in response to a product resource request. However, if you don’t do that you need to get the form data, and which fields are bad, from the previous attempt so that you can re-render the editor with the information the user previously entered and what was wrong with it.
One way you could solve this problem is to allow the creation of “invalid” resources. For example, you require a product to have a description. However, you receive a POST to ‘http://mystore.example/products’ without a description. You could issue the product an ID and store it in it’s invalid state (without a description) and then redirect the browser to the editor resource for that newly created, but invalid, product. That feels really clean from a design stand point but I am not sure how difficult it would be to implement. And you would certainly end up to permanently invalid resources, which might be hard to manage in the future.
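The “invalid resource” idea quoted above could be sketched roughly as follows. This is my own guess at the mechanics, not Peter’s code: the in-memory `products` dictionary, the `validate` and `create_product` functions, the `/products/<id>/editor` path and the choice of status codes are all assumptions made for the example.

```python
import itertools

# Hypothetical in-memory product store keyed by issued ID.
products = {}
_ids = itertools.count(1)

def validate(data):
    """Return a list of problems; an empty list means the product is valid."""
    errors = []
    if not data.get("description"):
        errors.append("description is required")
    return errors

def create_product(data):
    """Store the product even when invalid; return (status, location)."""
    product_id = next(_ids)
    errors = validate(data)
    products[product_id] = {"data": dict(data), "errors": errors}
    if errors:
        # Send the browser to the editor for the stored, invalid product.
        return 303, "/products/%d/editor" % product_id
    return 201, "/products/%d" % product_id
```

Because the submitted data and its errors are stored with the product, the editor resource can re-render the form from its own state, which is the appeal of the idea; the cost is that abandoned invalid products accumulate until something cleans them up.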
Or maybe the answer is much simpler, and I just can’t see it yet because I haven’t fully made the paradigm shift to REST yet. Or perhaps the problem is not in our approach, but in our browsers, and it wouldn’t be fair to judge REST based on its limited implementation within existing browsers.
But if in the end, I’m just confused, then I hope REST advocates will recognize that there are probably a lot of similarly confused people out there. I suspect many of us are open to REST, but we need more guidance from those who have already made this shift.
Ultimately, I have progressed enough and have seen enough promise from REST that I am willing to try to stick to a RESTful approach to the tagging_exploration application as best I can to see what I can learn from it. But otherwise, I am dissatisfied with the approach I developed and would hesitate to use it in a real application. Again, I probably approached this rather naively, and so, the decision to apply REST to Django applications shouldn’t be made based on what is most likely a clumsy implementation, but if I am on the right track, then it seems I have introduced significant complexity to the application, making it harder to understand and maintain, with little gain. I have not added anything to the application that would be readily apparent to my boss/client/end users. I have not added any significant new functionality. This approach might encourage better application design practices; for example, it seems to encourage better separation between MVC layers. Otherwise, it looks like all I have done is comfort myself with the notion that I have at least attempted to make the application architecturally pure.
Instead, I would probably follow a semi-RESTful approach, like that used by such sites as del.icio.us and Flickr, and put actions in URLs, like I did in the first example URLs above. Again, this approach is very easy to understand, and it takes full advantage of Django’s very powerful and flexible ability to map URLs directly to views, which the approach I have outlined mangled. Again, we have to make compromises somewhere in our applications because of the current Web environment, and I feel that the semi-RESTful approach will ultimately strike the best balance between the limitations of our browsers, the strengths of our tools and the spirit of REST.
resources
source code used in this post.
rest and related:
- Resource-oriented vs. activity-oriented Web services
- RESTful and User-Friendly URLS by Bill Venners
- REST Controller for RAILS on Charlie Savage’s Blog
- (Another) Rest Controller for Rails by Peter Williams
- RESTed and Confused on Don Park’s Daily Habit
django generic views:
- Django Documentation: Generic Views
- Django tips: get the most out of generic views from The B-List.
other:
- Django Documentation: URL Dispatcher
- Quick Tip: Django views ordered by file and directory from CoderBattery
- MVC and web apps: oil and water by Harry Fuecks
- How Far Should Models Go? by Guy Naor