Using MongoDB as a simple message board backend

Posted in MongoDB, python on December 25th, 2009 by jb55 – 1 Comment

I’ve been implementing a wakaba-like message board using MongoDB as a storage backend, and it has been a real breeze so far. Most interactions with the database have been one-liners in Python, and there is little work involved in inserting new threads or replies. I’d have to say the worst part of the whole ordeal so far is the lack of integration with existing Python web frameworks. For that reason I decided not to use Django, which seems tightly coupled to SQL solutions. I opted for a Werkzeug + Jinja2 + WTForms package called Glashammer, which has worked quite well for me so far. Anyway, on to MongoDB…

Database structure

With my SQL habits, my initial plan was to normalize all posts, storing every thread and reply as a separate MongoDB object within a collection. It turns out this isn’t the best way to go about it. If a thread has a large number of replies, the referenced lookup for each reply object is inefficient, and it isn’t the MongoDBish way of doing things. What I found works best is a tree structure or a simple embedded array of replies:

{
  _id: ObjectId,
  name: "jb55",
  topic: "What is this I don't even",
  msg: "I have important things to say",
  replies: [
    { name: "troll", msg: "tl;dr" },
    { name: "commentator", msg: "cool story bro" }
  ]
}

As you can see, replies are embedded in the thread object. This makes pulling the thread from the database very quick, since a single lookup returns the thread together with all of its replies and the data sits together on disk. You’ll probably want the thread and replies to have a reference to a user object, or a simple email, depending on how sophisticated you want to get. For the purposes of my 4chan/wakaba-like forum it made sense not to store a user reference.
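As a rough sketch in plain Python (no database involved), the same document shape can be built from ordinary dictionaries. The helper names new_thread and new_reply, the optional email field and the example address are hypothetical, chosen only for illustration; they are not part of the board's actual code.

```python
def new_thread(name, topic, msg):
    """Build a thread document with an empty embedded reply list."""
    return {"name": name, "topic": topic, "msg": msg, "replies": []}


def new_reply(name, msg, email=None):
    """Build a reply; the email is only stored when one is given."""
    reply = {"name": name, "msg": msg}
    if email is not None:
        reply["email"] = email  # hypothetical lightweight user reference
    return reply


thread = new_thread("jb55", "What is this I don't even",
                    "I have important things to say")
thread["replies"].append(new_reply("troll", "tl;dr"))
thread["replies"].append(
    new_reply("commentator", "cool story bro", email="commentator@example.com"))
```

Because the replies live inside the thread dictionary, rendering the thread only ever needs this one object.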

Some common tasks with PyMongo

All interactions between Python and MongoDB are pretty smooth and it doesn’t take much work to get up and running with insertions and updates.

Creating a new thread

db.threads.insert(
{'name': 'jb55', 'topic': 'My important topic', 'msg': 'blah blah'})

Adding new replies to a thread

reply = {'name': 'jb55', 'msg': 'This is my reply'}
db.threads.update({'_id': thread_id}, {'$push': {'replies': reply}})

Getting and displaying a thread

# find_one returns the matching document (or None); find() would return a cursor
thread = db.threads.find_one({'_id': thread_id})
data["thread"] = thread
return render_to_response('template.htm', **data)
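The snippets above use the insert() and update() methods that PyMongo offered when this was written. Current PyMongo releases use insert_one, update_one and find_one instead, so here is a hedged sketch of the same three tasks with those calls. The function names create_thread, add_reply and get_thread are my own, and each function takes the collection as a parameter so the sketch does not assume any particular connection setup.

```python
def create_thread(threads, name, topic, msg):
    """Insert a new thread document and return its _id."""
    doc = {"name": name, "topic": topic, "msg": msg, "replies": []}
    return threads.insert_one(doc).inserted_id


def add_reply(threads, thread_id, name, msg):
    """Push a reply onto the embedded array; True if the thread was found."""
    result = threads.update_one(
        {"_id": thread_id},
        {"$push": {"replies": {"name": name, "msg": msg}}},
    )
    return result.matched_count == 1


def get_thread(threads, thread_id):
    """Return the thread document, or None if it does not exist."""
    return threads.find_one({"_id": thread_id})


# Usage against a real server (the database name "board" is a placeholder):
#   from pymongo import MongoClient
#   threads = MongoClient()["board"]["threads"]
#   thread_id = create_thread(threads, "jb55", "My important topic", "blah blah")
#   add_reply(threads, thread_id, "jb55", "This is my reply")
```

Passing the collection in also makes the functions easy to exercise with a stand-in object in tests.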

Easy, isn’t it? No schemas, no hassle. In the interest of keeping this article short and sweet I think I’ll leave it at that for now. Try it out if you haven’t yet, it’s fun!

Asynchronous Go API idioms

Posted in go on November 14th, 2009 by jb55 – 1 Comment

After hacking on my Go Twitter API for the past couple of days I’ve started noticing some very cool ways of taking advantage of Go’s concurrency using goroutines and channels, and I thought I’d share them here.

Originally all of my API calls were synchronous. When you called GetStatus(1234) it made an HTTP request to the server and returned when the request was finished. This is OK for most applications, but it made the API inflexible. I thought about leaving it like this, arguing that if the client wanted asynchronous calls it could wrap them in its own goroutines. After thinking about it, I decided that having the API create the goroutine isn’t too big of a deal and makes less work for the client.

Return channels, not objects!

Making every potentially blocking API call asynchronous has an interesting effect on how you interact with the API:

api := twitter.NewApi();

statusChannel := api.GetStatus(12345); // asynchronous
pubTimelineChannel := api.GetPublicTimeline(); //asynchronous
status := <-api.GetStatus(123456); // synchronous

fmt.Printf("synchronous status: %s, async status: %s\n",
       status.GetText(), (<-statusChannel).GetText());

Neat! When the client wants a regular synchronous call, all they have to do is prefix the call with the channel receive operator! Instead of returning the Status object itself, the API returns a buffered channel on which the Status object will be delivered. Making the channel buffered has the added benefit of not blocking the goroutine when it sends on the channel, so it can exit as soon as it has fetched and sent the data, even if nobody has read from the channel yet. When a potentially blocking API call is made, the function creates a buffered channel of the return type with a capacity of 1, launches a goroutine that takes the channel as a parameter, and then returns the channel immediately:

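func (self *Api) GetStatus(id int64) chan Status {
  // Capacity 1 lets the worker send its result without waiting for a reader.
  response := make(chan Status, 1);
  // Fetch in the background; goGetStatus sends the Status on response.
  go self.goGetStatus(id, response);
  return response;
}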

Loosening the grip on the receive channel

Awesome! We now have a robust way of getting single Twitter status messages both asynchronously and synchronously, but this isn't flexible enough. Here's why: say we wanted a clean way of getting N status messages asynchronously. With our current setup we would need to call the function N times and manage N separate channels. Now say you wanted to receive and process the status messages the moment they arrived. To do this you would have to construct a select statement with N cases, which becomes unmanageable for large values of N.

The ideal solution is to let the client create and manage an N-sized buffered channel and pass it into our API. Then they could do something like this:

const nIds = 10;
receiveChannel := make(chan twitter.Status, nIds);
api.SetReceiveChannel(receiveChannel);

var startId int64 = 5641609144;

for i := 0; i < nIds; i++ {
  api.GetStatus(startId);
  startId++;
}

for i := 0; i < nIds; i++ {
  // reads in status messages as they come in
  fmt.Printf("Status #%d: %v\n", i, <-receiveChannel);
}

Error handling?

A popular idiom used in the standard Go packages is _, ok :=. The basic idea is that a function returns a bool or an os.Error as its second return value to signal success or failure. This does not play well with my API, since returning multiple values from a function breaks method chaining and synchronous calls of the form <-api.GetStatus(id). I currently handle errors with the API function GetLastError, which returns an os.Error, and the GetErrorChannel function, which returns a channel that receives errors as they occur. I have found the latter useful for tests and for logging errors.

That's all for now, let me know what you think.

Building and installing your first Go package

Posted in go on November 11th, 2009 by jb55 – Be the first to comment

A Go package is just a collection of .go files that share the same package directive at the top of each file. You can think of it as a namespace if you’re from the C++ crowd.

Go provides a couple of makefiles to make it easy to build and install your own Go packages.

Assuming you’ve followed the instructions on their website and have your environment set up correctly, put this makefile in your package directory:

include $(GOROOT)/src/Make.$(GOARCH)

TARG=mypackagename
GOFILES=\
        packagemodule.go\
        anothermodule.go\

include $(GOROOT)/src/Make.pkg

Type
make && make install

and you’re done! You should now have a fully working package that can be imported into other Go projects!

Please note: Go was released yesterday at the time of this posting, so it’s likely this process will change sometime in the future.

Learn you a Linux for great good!

Posted in Uncategorized on October 23rd, 2009 by jb55 – 3 Comments

One thing I’ve noticed in the past couple of years during my time at school and on my software engineering co-op is this: a large percentage of software engineering students simply do not know Linux! This may seem odd to programmers who couldn’t have it any other way, but if you’ve never used Linux before you may be thinking: “What’s the big deal?”

It’s fun!

No, really. Humor me for a second if you’re skeptical. If you’re a person who enjoys problem solving and learning new things, Linux is a great environment to experiment and play in. It also serves as a practical solution to many problems that you may not have been aware of, which I will talk about in the following sections.

The setup

One thing I’ve always recommended to fellow students and friends is to set up a dedicated Linux server in your house. Chances are you’ve accumulated a few old PCs (if you haven’t thrown them out yet) over the years. Any will do; you don’t need a beefy system to run Linux. For example, jgblue.com runs on 256MB of RAM, thanks in large part to how lightweight many Linux server distros tend to be. If you don’t have a spare computer lying around you can always pick up a cheap one from some bargain PC shop. As an alternative you can try installing VMware and running Linux in that. I’m going to assume you have a dedicated Linux box for the following sections though.

So what flavor of Linux should you get? Ubuntu. If this is your first time running a server, it is your best choice.

I do not plan on making a comprehensive guide to installing and finding your way around Linux. There are plenty of guides on the internet which cover pretty much anything you might want to know. They are so comprehensive, in fact, that I recommend following a simple rule of thumb: if you find yourself stuck on a problem and you’re not sure where to go from there, Google it. Immediately. The minute you find yourself in a situation where you cannot intuitively determine the answer within 10-20 seconds, just Google it. Google knows. That’s generally a good rule of thumb for anything you find yourself hacking on.

So what can I do now?

Here are a few things which I recommend setting up to get the most out of your new Linux server:

Register a domain name to access your server from anywhere
You may have heard of dynamic DNS services such as dyndns.org, which provide you with a free subdomain. I prefer a service called FreeDNS. FreeDNS allows you to do the same thing, except with your own domain instead. For example, I use higgr.com as the domain name for my Linux server at home.

Set up a web server for low traffic personal web pages
By installing a webserver such as Apache, you can easily host web pages for sharing files with your friends or hosting a personal blog. I wouldn’t recommend hosting high traffic websites though, since it will be running off your internet connection.

Using it as an SSH proxy server for getting around firewalls and encrypting your browsing in public places
Using PuTTY on Windows or SSH on Linux, you can set up a SOCKS5 proxy so that your home internet connection fetches websites and tunnels them to your PuTTY or SSH client. This effectively beats most workplace or school firewalls, while encrypting all traffic between you and your server to prevent local snooping.

Using it as a firewall and backdoor into your network
Your Linux box can sit between the internet and your network. By only allowing a few ports such as SSH, you can hide your entire network while giving you the ability to access your network publicly. For example, you can set up an SSH Tunnel to Remote Desktop into your desktop computer at home. I do this to add torrents to my desktop computer at home from work. Having remote desktop access to your computer at home from anywhere in the world by tunneling through higgr.com is very cool :). There’s even a remote desktop client with SSH capabilities for your iPhone, so you can access your computer from anywhere you have cell reception.

These are just a few of the practical applications I’ve employed since setting up my Linux server. The rest is up to you to discover. Enjoy!

Static O3D Dependencies

Posted in javascript, o3d on October 8th, 2009 by jb55 – 1 Comment

One thing you might notice when writing your first O3D JavaScript application is the list of dependencies included at the top of your page:

o3djs.require('o3djs.util');
o3djs.require('o3djs.math');
o3djs.require('o3djs.rendergraph');
o3djs.require('o3djs.primitives');
o3djs.require('o3djs.effect');
o3djs.require('o3djs.io');
o3djs.require('o3djs.arcball');
o3djs.require('o3djs.material');
o3djs.require('o3djs.quaternion');

O3D uses these calls to fetch the corresponding scripts with XHR requests, then writes them to the DOM if they haven’t been written yet. This dynamic resolution of dependencies is nice for development, since you can add and remove ‘headers’ on the fly without having to worry about dependencies or modifying your HTML. The main problem with this method is that if you use O3D a lot, the extra requests to your server may seem a bit unnecessary.

My solution was as follows:

  1. Determine which scripts are written to the DOM and in what order
  2. Using a makefile, concatenate the files
  3. Remove all require() calls using sed
  4. Compress the javascript using Yahoo’s YUI compressor

Determining dependencies

This step is pretty straightforward. Load your o3d application as usual and then use a DOM inspector such as Firebug, or the built in Safari/Chrome Web Inspector, and take a look in your <head> tag. You should see the injected o3d javascript (under /o3djs/):

Injected o3d dependencies

Keep note of these, as they will be used in the next step.

The makefile

Using a makefile made the process of concatenating and compressing the javascript extremely simple and streamlined. Here’s the makefile I used to pull this off:

# This is just the root of my static content directory
# you can remove this if you don't need it
STATIC=../../static/

# The folder where the final compressed
# javascript is installed to after 'make install'
PUBDIR=$(STATIC)js/

# A temporary folder used to store
# generated javascript (the concatenated/compressed file)
OUTDIR=./build/

# Path to the YUI compressor
COMPRESSOR=../yui/build/yuicompressor-2.4.2.jar

# YUI compressor arguments
COMPRESSOR_ARGS=
VPATH=$(OUTDIR)

# Files to copy to the PUBDIR directory on 'make install'
INST_TARGETS=$(OUTDIR)jgblue3d.js

# The location of your o3djs folder
O3D=$(STATIC)j3d/o3djs/

O3D_TARGETS=$(O3D)base.js $(O3D)util.js $(O3D)event.js \
    $(O3D)error.js $(O3D)math.js $(O3D)rendergraph.js $(O3D)primitives.js \
    $(O3D)effect.js $(O3D)io.js $(O3D)arcball.js $(O3D)quaternions.js \
    $(O3D)material.js $(O3D)jgblue-3d.js

jgblue3d.js: combined3d.js
    java -jar $(COMPRESSOR) $(COMPRESSOR_ARGS) $(OUTDIR)$< -o $(OUTDIR)$@

combined3d.js: $(O3D_TARGETS)
    cat $^ > $(OUTDIR)$@
    sed -e 's/^o3djs\.require\(.*\);$$//g' $(OUTDIR)$@ > $(OUTDIR)$@.clean
    mv $(OUTDIR)$@.clean $(OUTDIR)$@

.PHONY: install clean all

all: jgblue3d.js

install:
    cp $(INST_TARGETS) $(PUBDIR)

clean:
    rm $(OUTDIR)*.js

To get this working, you’ll need to change some of the variables to fit your project. The most important one is the O3D_TARGETS variable. It holds base.js, the paths to all of the o3d dependencies (gathered in step 1, in the order they were injected), and finally your own o3d code.
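If make and sed are not available, the concatenate-and-strip part of the process can be sketched in a few lines of Python. This is only an illustration of the same idea, not what the makefile runs: the function name combine is made up for this sketch, and it does not perform the YUI compression step.

```python
import re
from pathlib import Path

# Matches a line that consists of a single o3djs.require(...); call.
REQUIRE_LINE = re.compile(r"^o3djs\.require\(.*\);$", re.MULTILINE)


def combine(sources, out_path):
    """Concatenate the sources in order, then blank out require() lines."""
    combined = "".join(Path(src).read_text() for src in sources)
    cleaned = REQUIRE_LINE.sub("", combined)
    Path(out_path).write_text(cleaned)
    return cleaned
```

As with the sed expression, each matched line is replaced by an empty line, so line numbers in the combined file stay the same before compression.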

Now all you have to do is type:
make && make install

You’re done!

You can now remove the base.js from your html and add the newly generated javascript file! It should work exactly the same as the original**.

** No promises ;)

JGBlue – A Jumpgate Evolution Database

Posted in jgblue on October 7th, 2009 by jb55 – 1 Comment
JGBlue - A Jumpgate Evolution Database

Recently I’ve been working on a new project called JGBlue. JGBlue is a database website written in the Python web framework Django, for the upcoming space combat MMO Jumpgate Evolution. I initially started the project to teach myself Django and Javascript, and to apply my knowledge of data formats and reverse engineering I gained from tinkering with World of Warcraft.

JGBlue, pronounced “Jumpgate Blue”, closely follows current popular MMO databases such as wowhead.com and aionarmory.com. I wanted it to have the same look and feel as these sites to make it familiar and easy to use.

A look at a typical item page

One thing you’ll notice in the above image is the prototype model viewer (I call it… The Jb Cube)! To me JGBlue has always been about playing with technology that is new to me, which is why I had a lot of fun implementing a model viewer in O3D, Google’s new 3D plugin and JavaScript API. I also plan on making some future blog posts about my experience with the API so far and some techniques to get more performance out of it.

Django has been amazing, from the initial prototypes to what’s running on the production server right now. It has provided me with everything I’ve needed so far, and its documentation is not far from perfection. I have Django running as FastCGI behind nginx. I’ve done a few benchmarks, and I feel confident it’ll be able to handle whatever the Jumpgate community has to throw at it.

The release is still far away, with Jumpgate Evolution looking at a release sometime in 2010. That should give a perfectionist like me plenty of time :)

Looking forward to JGE? Leave a message and let us know what you’re most looking forward to!