Mechanizing TWiki: Scripted Wiki Editing With Python

Here at the Samsung Open Source Group, we use TWiki to file our weekly status reports, among other things. These reports include summaries of our upstream activity, measured by the number of patches we review for others and the number we land ourselves. There are various ways to collect these statistics programmatically; the question we’ll look at in this article is how to upload the data into TWiki programmatically.

TWiki is a type of wiki software similar to what powers Wikipedia, WikiTravel, TV Tropes, Muppet.wikia, and on and on. TWiki has more of a corporate-oriented focus, and includes a wealth of functionality for structuring and formatting various types of data that a corporate team might need to create dynamic reports.

Like all wikis, TWiki is set up to make it easy for people to edit pages directly. It doesn’t require any background in HTML, CSS, or JavaScript; you just click a button to edit, and the web browser displays a text box with the page content, along with buttons for formatting. That is great for human editors. The issue for us, however, is how to inject our mechanically gathered content into TWiki without a human getting involved.

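If you happen to have administrative access to the server, you could modify TWiki’s stored page data directly (by default TWiki keeps its pages as version-controlled text files on disk rather than in an SQL database). That is fairly complicated, though, and it may not be an option at all: maybe you’re not an administrator, or maybe you don’t wish to muck around under the hood.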

Mechanize Your Wiki

Fortunately, there is a Python module to the rescue. Python’s Mechanize allows you to create a programmatic web browser that loads a web page and lets you interact with it. Importantly, if the web page has a form on it (as our wiki editor does), it lets you select the form and interact with the buttons, text fields, and so on in that form. For example:

  import mechanize
  ...
  br = mechanize.Browser()
  br.add_password(domain, username, password)
  br.open("http://www.mysite.com/bin/edit/Category/MyPage")
  br.select_form(nr=0)
  print(br.form['text'])

In just these few lines we’ve implemented a tool to pull the wiki source for a given web page.

Updating the page with our own new text is as simple as:

  ...
  br.form['text'] = 'my text here'
  br.submit()
  print(br.response().read())

So far we’ve assumed that the TWiki site uses only Basic Auth. In a proper intranet setup, TWiki would be configured to display a more attractive-looking login page, and handling it requires a little more sophistication in our script. The login page doesn’t appear every time, only when we haven’t edited in a while, so our script needs to cope with both cases.

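  def login(br):
      # Fill in the login form that get_page() has already selected
      br.form['username'] = ''
      br.form['password'] = ''
      br.submit()

  def get_page(br, url):
      br.open(url)
      br.select_form(nr=0)
      try:
          return br.form['text']
      except Exception:
          # No 'text' field, so we were shown the login form instead.
          # Try logging in (adv. auth this time)
          login(br)
          br.response().read()
          return get_page(br, url)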
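
One caveat with the recursive retry above: if the login itself fails (a wrong password, say), the function keeps calling itself. As a minimal sketch of a bounded variant, the version below gives up after a fixed number of login attempts. It assumes the login form uses fields named username and password as above; the max_logins parameter and the choice of RuntimeError are illustrative, not part of mechanize.

```python
def login(br, username, password):
    """Fill in and submit the login form that is currently selected."""
    br.form['username'] = username
    br.form['password'] = password
    br.submit()


def get_page(br, url, username, password, max_logins=1):
    """Return the wiki source behind url, logging in at most max_logins times."""
    for attempt in range(max_logins + 1):
        br.open(url)
        br.select_form(nr=0)
        try:
            return br.form['text']
        except Exception:
            # No 'text' field: assume we were shown the login form instead.
            if attempt == max_logins:
                raise RuntimeError('could not reach the edit form at %s' % url)
            login(br, username, password)
```

Passing the credentials in as arguments also keeps them out of the function bodies, which makes it easier to move them into a config file later.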

With a bit more massaging we can turn all the above into a proper Python class, TwikiBrowser. Refer to the code in twiki_browser.py.
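
The real class lives in twiki_browser.py. Purely to illustrate the shape such a wrapper might take, here is a hypothetical sketch. The constructor arguments, the optional browser parameter (handy for testing with a stand-in object), and the private _open_editor helper are assumptions of this sketch and not necessarily the interface of the actual class.

```python
class TwikiBrowser(object):
    """Thin wrapper around a mechanize-style browser for reading and writing pages."""

    def __init__(self, username='', password='', browser=None):
        if browser is None:
            import mechanize  # imported lazily so a stand-in browser can be injected
            browser = mechanize.Browser()
        self.br = browser
        self.username = username
        self.password = password

    def _open_editor(self, url):
        """Open the edit URL and select its form, logging in once if asked to."""
        self.br.open(url)
        self.br.select_form(nr=0)
        try:
            self.br.form['text']
        except Exception:
            # Assumption: a form without a 'text' field is the login form.
            self.br.form['username'] = self.username
            self.br.form['password'] = self.password
            self.br.submit()
            self.br.open(url)
            self.br.select_form(nr=0)

    def get_page(self, url):
        """Return the wiki source of the page behind the edit URL."""
        self._open_editor(url)
        return self.br.form['text']

    def set_page(self, url, text):
        """Replace the page's wiki source with text and save it."""
        self._open_editor(url)
        self.br.form['text'] = text
        self.br.submit()
```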

Using this handy class, here’s a simple example to fill out a template report with our own supplied data:

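    import string

    # Assumes the TwikiBrowser class is importable from twiki_browser.py
    from twiki_browser import TwikiBrowser

    # Start up twiki
    twiki = TwikiBrowser()

    # Generate text
    params = {
        'title': 'My Title',
        'color': 'red',
        'desc':  'The quick brown fox'
    }

    # Read the template and fill in the $placeholders from params
    with open('my-template', 'r') as f:
        text = f.read()
    text = string.Template(text).substitute(params)

    url = ''
    twiki.set_page(url, text)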

Our template would look something like this:

    <h1>$title</h1>
    <p style="color:$color">$desc</p>
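
To see the substitution step on its own, here is a small self-contained sketch that keeps the template inline instead of reading it from a file. It also shows safe_substitute(), which may be useful when a report only fills in some of the placeholders. The variable names are illustrative.

```python
import string

TEMPLATE = '<h1>$title</h1>\n<p style="color:$color">$desc</p>\n'

params = {
    'title': 'My Title',
    'color': 'red',
    'desc': 'The quick brown fox',
}

# substitute() raises KeyError if the template names a key that params lacks.
text = string.Template(TEMPLATE).substitute(params)
print(text)

# safe_substitute() leaves unknown placeholders in place instead of raising.
partial = string.Template(TEMPLATE).safe_substitute({'title': 'Draft'})
print(partial)
```

Note that string.Template treats every $ as the start of a placeholder, so a literal dollar sign in the template has to be written as $$.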

Taking it Further

The next step would be to make the script detect whether the page already exists: retrieve the page with get_page() and check the text that’s returned. If the text is just the standard empty-page text (or is shorter than a couple of lines), you can assume the page does not exist yet and safely overwrite it. Otherwise, parse the old page text as needed and decide how to update or replace it.
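
The check described above could be sketched roughly as follows. The function name page_is_blank is hypothetical, the placeholder argument stands for whatever text your own wiki shows for a new page (it varies from site to site, so it has to be supplied), and the two-line threshold is only a heuristic.

```python
def page_is_blank(text, placeholder='', min_lines=2):
    """Guess whether text from the edit form belongs to a not-yet-created page."""
    stripped = text.strip()
    if placeholder and stripped == placeholder.strip():
        return True
    # Count the lines that contain something other than whitespace.
    lines = [line for line in stripped.splitlines() if line.strip()]
    return len(lines) < min_lines
```

A script might call page_is_blank(get_page(br, url)) and write the new report only when it returns True, falling back to a merge or an abort otherwise.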

As you can see, with a minimal amount of Python code we’re able to automate the viewing, editing, and creation of pages in TWiki. To turn this into a full-fledged script, you’d want to break out the username, password, and other personal bits into a config file, add command line options, and so forth. You can see a complete example of a TWiki tool for reporting git statistics at autotwiki.
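
As a rough sketch of the config file and command line handling, the snippet below uses the standard argparse and configparser modules (Python 3 names). The file name twiki.ini, the [twiki] section, its keys, and the option names are all assumptions made for illustration; this is not how autotwiki itself is organized.

```python
import argparse
import configparser


def load_settings(argv, config_text=None):
    """Merge settings from an INI-style config with command line options."""
    parser = argparse.ArgumentParser(description='Upload a report to a wiki page')
    parser.add_argument('--config', default='twiki.ini', help='path to the config file')
    parser.add_argument('--url', help='edit URL of the page to update')
    parser.add_argument('--template', default='my-template', help='template file')
    args = parser.parse_args(argv)

    # interpolation=None so that a '%' in a password is taken literally.
    config = configparser.ConfigParser(interpolation=None)
    if config_text is not None:
        config.read_string(config_text)  # lets callers supply config without a file
    else:
        config.read(args.config)         # a missing file is silently skipped

    section = config['twiki'] if config.has_section('twiki') else {}
    return {
        'username': section.get('username', ''),
        'password': section.get('password', ''),
        # A URL given on the command line wins over the one in the config file.
        'url': args.url or section.get('url', ''),
        'template': args.template,
    }
```

In a real script you would call load_settings(sys.argv[1:]) and keep the config file readable only by its owner, since it holds a password.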

Author: Bryce Harrington

Bryce is a founder and developer of the Inkscape project, but began his career in the aerospace industry as a spacecraft propulsion engineer.