Last modified on 04/05/10 10:26:07

Heckle Primer

What is Heckle?

Heckle IS a management service for large collections of computers that boot from the network.

Heckle IS an allocation manager for computing resources. It can assign computers to users on request, for arbitrary durations and for whatever tasks the users want.

What is Heckle Not?

Heckle is NOT a boot image creation system. Heckle tells nodes which image to boot, and that's it. Of course, there is no reason why one couldn't write an image creation system on top of Heckle.

Resources

Heckle manages three Resources:

  • Nodes
  • Images
  • Hardware

Nodes

Nodes are the heart and soul of Heckle's job. Everything about Heckle revolves around the management of Nodes.

Node is a fairly generic term that refers to any computer on the network that looks to Heckle to determine what it should boot. It could be a file server, a login machine, or a general-purpose cluster compute node.

Nodes have the following major properties:

  • Name - This is the host name of the computer
  • IP Address - This is the IP address of the computer
  • MAC - This is the MAC address of the computer
  • Image - This is the Image Resource that the node is currently assigned to boot
  • Hardware - This is the Hardware Resource that the node is classified as
  • Power Interfaces - These are any power interfaces the node is equipped with
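
The properties above can be pictured as a simple record. The following is only a minimal sketch: the class name, field names, and sample values are hypothetical and are not taken from Heckle's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical in-memory view of a Node; not Heckle's real data model."""
    name: str        # host name of the computer
    ip_address: str  # IP address of the computer
    mac: str         # MAC address of the computer
    image: str       # name of the Image Resource the node is assigned to boot
    hardware: str    # name of the Hardware Resource the node is classified as
    power_interfaces: List[str] = field(default_factory=list)


# Illustrative values only.
node = Node(
    name="node001",
    ip_address="10.0.0.11",
    mac="00:16:3e:00:00:0b",
    image="compute-default",
    hardware="generic-x86_64",
)
```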

The basic workflow of Heckle's Node management is contained in the Config Daemon service. It follows this series of steps:

  • Node A is powered on and requests an IP address
  • Node A is informed by DHCP to load the gPXE network bootloader
  • Node A loads gPXE over TFTP and requests another IP address
  • The DHCP server notices the gPXE option is set and informs Node A to boot from Heckle's web server
  • Upon receiving the request from Node A, Heckle looks up the node in its database
  • Heckle finds a gPXE script for Node A's Image and current image stage and processes it
  • The config is sent to Node A which then follows the script and boots
  • If Node A boots successfully, it sends a success message
  • If Node A has an error, it sends an error message, and Heckle is informed that it should not advance the current stage
  • Upon the next boot, Node A gets the next stage if no error occurred, or the same stage if an error did occur
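
The last three steps amount to a small piece of state kept per node. The sketch below illustrates that rule only; the names `NodeState` and `report_boot` are hypothetical, and the choice to stay on the final stage once it is reached is an assumption, since this page does not say what happens after the last stage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    """Hypothetical per-node boot state; not Heckle's real data model."""
    stages: List[str]     # ordered stage names of the node's Image
    stage_index: int = 0  # index of the stage served on the next boot


def report_boot(state: NodeState, success: bool) -> str:
    """Record a boot result and return the stage to serve on the next boot.

    A success message advances the stage; an error message leaves it as is.
    Assumption: a node already on its last stage stays there.
    """
    if success and state.stage_index < len(state.stages) - 1:
        state.stage_index += 1
    return state.stages[state.stage_index]


state = NodeState(stages=["install", "configure", "run"])
after_error = report_boot(state, success=False)   # stays on "install"
after_success = report_boot(state, success=True)  # moves to "configure"
```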

Hardware

Hardware refers to the configurations of real physical hardware the nodes are running on.

Hardware is considered to be generic. In programming-language terms, a Hardware is a class and Nodes are instances of a particular Hardware class. Hardware defines what features a particular node has.

Hardware can be defined as loosely or in as much detail as you want. The simplest Hardware is just a name. On the other hand, Hardware can be defined with many properties that describe the capabilities of the platform. These can be anything you want, but most commonly Hardware properties refer to concepts like network adapters, memory sizes, CPU architecture, and so on.

Hardware has the following major properties:

  • Name - The name of the Hardware class
  • Properties - A mapping of property keys to values. One property key can refer to multiple values. This allows for complex hardware that, for example, may have multiple network interface types.
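
One way to picture a property mapping in which a key can carry several values is a dictionary of lists. This is a minimal sketch under that assumption: the class, the `has` helper, and the property keys and values are hypothetical, not Heckle's real storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Hardware:
    """Hypothetical Hardware class record; not Heckle's real data model."""
    name: str
    # Each property key maps to a list, so one key can hold several values.
    properties: Dict[str, List[str]] = field(default_factory=dict)

    def has(self, key: str, value: str) -> bool:
        """Return True if `value` is one of the values stored under `key`."""
        return value in self.properties.get(key, [])


# The simplest Hardware is just a name.
minimal = Hardware(name="generic")

# A more detailed Hardware with two network interface types.
detailed = Hardware(
    name="compute-ib",
    properties={
        "network": ["ethernet", "infiniband"],
        "cpu_arch": ["x86_64"],
    },
)
```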

Images

When a node that Heckle controls boots across the network, Heckle must decide which Image should be applied to the node.

An Image in Heckle is really a series of connected images called Stages. Each Stage consists of a name and a boot script associated with the stage. For the majority of Heckle's users this script will be a gPXE script.

However, the scripts are far more flexible than just gPXE scripts. In fact, the scripts are really Cheetah templates, which gives the person managing the scripts great control over what the final script looks like to the node. Also, due to the template nature of the scripts, it would be possible to use another network boot ROM instead of gPXE, though it would need to support loading over HTTP in order to talk to Heckle. To learn more about Stage scripts, see the Working with Stage Scripts page.
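
To illustrate the idea of a templated stage script, the sketch below fills placeholders in a gPXE script. Python's standard `string.Template` is used here only as a stand-in for Cheetah, so that the example runs without extra packages; real Stage scripts use Cheetah syntax. The placeholder names, the server host name, and the URL layout are all hypothetical.

```python
from string import Template

# Hypothetical stage script. string.Template stands in for Cheetah here,
# and the URL layout is invented for illustration.
STAGE_SCRIPT = Template(
    "#!gpxe\n"
    "kernel http://$server/images/$image/$stage/vmlinuz\n"
    "initrd http://$server/images/$image/$stage/initrd.img\n"
    "boot\n"
)


def render_stage(server: str, image: str, stage: str) -> str:
    """Fill in the placeholders and return the script text sent to the node."""
    return STAGE_SCRIPT.substitute(server=server, image=image, stage=stage)


script = render_stage("heckle.example.org", "compute-default", "install")
```

Because the node only ever sees the rendered text, the same mechanism could emit a script for a different boot loader, as long as that loader can fetch it over HTTP.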

Allocations