Sunday, March 29, 2015

Building a single/simple-minded website in Python

A lot of the scripts I end up writing begin as solutions to a problem, or as a way for me to be lazy and maybe learn something in the process; today's topic is no different. As some of you know, when you are deploying new Linux boxes from scratch, you can be lazy and use stuff like kickstart (CentOS/Red Hat) and preseed (Ubuntu/Debian) to do the initial install before handing over to something like Puppet, Ansible, or Salt (to name a few). Thing is, you need to feed the preseed/kickstart file to the installer somehow.

One way to do it is to build an .iso or set up a network install (you know the drill: PXE boot, DHCP, and so on). A very nice example (which I myself have used before) is shown in the CentOS docs. Ubuntu/Debian have something very similar. Now, the step that is relevant to this blog entry is the one in which the preseed/kickstart file is passed to the new host you are building. We can make it available on a web server, and tell the new machine where it is. Just to let you know, if you are using Docker, the concept is the same. I know I am going really quickly through this because all I am doing right now is explaining the need that caused me to write this.

So, we established we need a web server to feed the preseed/kickstart file. But there are times we do not need a full-fledged website; all we want it to do is offer one single file. And once the file is provided and the host is created, the website can go away. I imagine you are smelling some kind of automated host building script that automagically creates the web server it needs in the process. And you are right, which is why I wanted something with as small a footprint as I can get away with. In other words, I would love to have a web server that runs completely off a single script.

To do the deed, I chose to use Python. Besides the fact I suck at Ruby, I bumped into an example of a simple Python-based webserver using a module called BaseHTTPServer. I modified it a bit and came up with the following script to serve a preseed.cfg file:

#! /usr/bin/env python
'''
Simple dumb webserver to serve a file.
It will try to serve a file called preseed.cfg, located in the directory
the program was called from, on port 8000 of every interface.

The idea is you can ask for any file you want, and you will get what we
give to you.

Shamelessly based on https://wiki.python.org/moin/BaseHttpServer
'''
import time
import BaseHTTPServer

HOST_NAME = '' # Accept requests on all interfaces
PORT_NUMBER = 8000
FILE = 'preseed.cfg'

def read_file(filename):
    '''
    Read in (text) file and return it as a string
    '''
    # The with block closes the file even if read() fails
    with open(filename, "r") as handle:
        return handle.read()

class MyHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        file = read_file(FILE)
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        self.end_headers()
        self.wfile.write(file)

if __name__ == '__main__':
    server_class = BaseHTTPServer.HTTPServer
    httpd = server_class((HOST_NAME, PORT_NUMBER), MyHandler)
    print time.asctime(), "Server Starts - %s:%s" % (HOST_NAME, PORT_NUMBER)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
    print time.asctime(), "Server Stops - %s:%s" % (HOST_NAME, PORT_NUMBER)

Here's a quick rundown on what it does:

  1. Import the two libraries we need. Note they are standard libraries every Python 2 install should have (in Python 3, BaseHTTPServer was merged into http.server). This is not supposed to be a fancy or remotely clever script.
  2. Define some constants.
    HOST_NAME = '' # Accept requests on all interfaces
    PORT_NUMBER = 8000
    FILE = 'preseed.cfg'
    In the script I showed how to take the lazy way and make the service listen on all interfaces. In fact, when you run the above script, it should show in netstat as
    raub@desktop:~$ netstat -apn|grep 8000
    (Not all processes could be identified, non-owned process info
     will not be shown, you would have to be root to see it all.)
    tcp   0   0 0.0.0.0:8000    0.0.0.0:*     LISTEN     11539/python
    raub@desktop:~$
    As you know, leaving HOST_NAME empty binds to 0.0.0.0, which means everyone + the cat, which is why you can see it is listening on every interface in this machine, localhost included, on port 8000.
  3. The MyHandler class, which handles all the HTTP requests, only cares about processing GET requests. And, when it sees one, all it does is spit out the file preseed.cfg with Content-type: text/plain.
  4. When you run the script, it should show something like
    Mon Mar 23 09:17:50 2015 Server Starts - :8000
    when it starts. And then when someone actually does hit the server, it would show a message like
    192.168.5.10 - - [23/Mar/2015 09:28:13] "GET / HTTP/1.0" 200 -
    which would indicate that 192.168.5.10 connected to our little webserver and, as a result, got the preseed file. If you use wget,
    wget the-server:8000
    it will create an index.html file with the contents of preseed.cfg.
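
For anyone on Python 3, where BaseHTTPServer was folded into http.server, the rundown above can be sketched roughly as follows. This is an approximation, not the script above: the class name OneFileHandler, the fetch and demo helpers, and the placeholder file contents are made up for illustration. The demo starts the server on a throwaway port, asks for two different paths, and compares what comes back.

```python
import http.client
import http.server
import os
import tempfile
import threading


class OneFileHandler(http.server.BaseHTTPRequestHandler):
    '''Answer every GET with the same file, whatever path was asked for.'''
    filename = 'preseed.cfg'  # same default name as the script above

    def do_GET(self):
        with open(self.filename, 'rb') as handle:
            body = handle.read()
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def fetch(port, path):
    '''Hypothetical helper: GET a path from 127.0.0.1 and return the body.'''
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    try:
        conn.request('GET', path)
        return conn.getresponse().read()
    finally:
        conn.close()


def demo():
    '''Serve a throwaway file and request it under two different paths.'''
    with tempfile.NamedTemporaryFile('w', suffix='.cfg', delete=False) as tmp:
        tmp.write('# placeholder preseed contents\n')
    OneFileHandler.filename = tmp.name
    # Port 0 asks the OS for any free port
    httpd = http.server.HTTPServer(('127.0.0.1', 0), OneFileHandler)
    port = httpd.server_address[1]
    worker = threading.Thread(target=httpd.serve_forever)
    worker.start()
    try:
        return [fetch(port, path) for path in ('/', '/anything/else.html')]
    finally:
        httpd.shutdown()
        worker.join()
        httpd.server_close()
        os.remove(tmp.name)
```

Calling demo() returns a list with two replies, one per path; since the handler never looks at the path, both should hold the same bytes.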

I will be the first to say this Python script is very small and dumbed down from the script shown in the wiki. I did that for a reason: since it ignores any real request from the user (no matter what you ask, it only sends the config file), it is very simple-minded in a good way. Asking it to show the list of files somewhere or upload something might be a bit challenging. Now, instead of offering this config file, you might want to serve some kind of simple webpage that is created on the fly, like a status page. You could use a script like the above to do the deed.
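
To make the status page idea a bit more concrete, here is a minimal sketch for Python 3, assuming all you want is the hostname and the current time as plain text. The names build_status_page and StatusHandler are invented for this example, and what a real status page would report is entirely up to you.

```python
import http.server
import socket
import time


def build_status_page():
    '''Build a tiny plain-text report each time it is called.'''
    lines = [
        'host: %s' % socket.gethostname(),
        'time: %s' % time.asctime(),
    ]
    return ('\n'.join(lines) + '\n').encode('utf-8')


class StatusHandler(http.server.BaseHTTPRequestHandler):
    '''Ignore the requested path and always reply with a fresh report.'''

    def do_GET(self):
        body = build_status_page()  # generated per request, nothing on disk
        self.send_response(200)
        self.send_header('Content-type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, hand StatusHandler to http.server.HTTPServer in place of MyHandler and call serve_forever() as in the main script.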

I guess where I am really getting to is that if you need, say, a webserver to only serve one simple stupid page, chances are you do not need a full Apache install. In my own case, why would I even want to have a full-fledged webserver running 24/7 just to serve a page (or many pages) that only needs to be available for a few minutes? I know this concept is not hip anymore, but there is something to be said about having a simple tool that does one single thing well and can cooperate with other tools to carry out a complex task.
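
If the page really only has to live for a few minutes, one possible approach (a Python 3 sketch, not something the script above does) is to answer a single request and then shut down. It leans on handle_request(), which processes one request and gives up once the server's timeout attribute has elapsed. The names HelloHandler and serve_once, the placeholder body, and the 300 second default are assumptions made for illustration.

```python
import http.server


class HelloHandler(http.server.BaseHTTPRequestHandler):
    '''Stand-in handler that replies with fixed placeholder text.'''

    def do_GET(self):
        body = b'placeholder config contents\n'
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve_once(httpd, timeout=300):
    '''Handle at most one request, then close the listening socket.'''
    httpd.timeout = timeout  # stop waiting if nobody connects in time
    try:
        httpd.handle_request()
    finally:
        httpd.server_close()
```

A host building script could create the server with http.server.HTTPServer(('', 8000), HelloHandler), call serve_once() on it, and carry on once the installer has fetched its file.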

As I mentioned above, the script is pretty hopelessly dumb. And I bet you can improve on it. I mean, even I decided to improve on it a bit. Specifically, I wanted to be able to provide the filename, IP address (so it is only running on the network interface using that IP), and port from the command line. That would make it easier to use the script without having to modify its code.

And I found that BaseHTTPServer really did not want to do that. In fact, what it really wants is to read the request from a client and do something based on that. Since that is not what I wanted, I had to learn how to, well, hack my way around that by overriding the __init__ constructor. I am not going to waste time here posting the modified code; I placed the script on GitHub, where I hope one day to prettify/improve it.
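
One other way to get the filename, IP address, and port from the command line, sketched here for Python 3, is to build the handler class inside a function so it remembers the filename. To be clear, this is not the __init__ override described above and not the script on GitHub; the option names --file, --ip, and --port and the helpers make_handler, parse_args, build_server, and main are all invented for this sketch.

```python
import argparse
import http.server


def make_handler(filename):
    '''Return a handler class that always serves the given file.'''
    class OneFileHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # filename is captured from the enclosing function call
            with open(filename, 'rb') as handle:
                body = handle.read()
            self.send_response(200)
            self.send_header('Content-type', 'text/plain')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return OneFileHandler


def parse_args(argv=None):
    '''Parse the (hypothetical) command line options.'''
    parser = argparse.ArgumentParser(description='Serve one file over HTTP.')
    parser.add_argument('--file', default='preseed.cfg',
                        help='file to hand out on every GET')
    parser.add_argument('--ip', default='',
                        help='address to bind; empty means all interfaces')
    parser.add_argument('--port', type=int, default=8000,
                        help='TCP port to listen on')
    return parser.parse_args(argv)


def build_server(args):
    '''Create a server bound to the requested address and port.'''
    return http.server.HTTPServer((args.ip, args.port),
                                  make_handler(args.file))


def main(argv=None):
    httpd = build_server(parse_args(argv))
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
    httpd.server_close()
```

Adding a call to main() at the bottom would let it run as, say, python3 serve_one.py --file preseed.cfg --ip 192.168.5.1 --port 8000, where the script name and address are only examples.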

Notes

  • If you do not want to use Python, you can run a webserver in one line of bash or PowerShell.
