Multifunction multimedia machine, Part 3: Scripting and scaling for fun and profit

Displaying what you want, where you want it

Level: Introductory

Lewin Edwards (sysadm@zws.com), Design Engineer, Freelance

10 Jan 2006

Lewin Edwards looks at the history and design of X and why it matters for an embedded graphics system and introduces a basic scripting language for controlling a multimedia display device.

The previous article showed you how to get your PowerPC®-based multimedia device up and running, loading and displaying JPEG images. This episode shows you how to add image scaling and -- more importantly -- scripting functionality, and I'm going to describe for you some of the design theory and engineering difficulties to be encountered in building a multimedia appliance. I'm also going to go into a bit more detail about some material I glossed over last time.

In the last episode, I started off by describing the Linux® framebuffer device and showing you how to attach your application to it. I then donned a stunningly lifelike weasel costume and ran our application inside X, while promising to explain this decision later.

I introduced X into the equation in the last episode for two reasons:

  • There's apparently a bug in the kernel framebuffer driver for the ATI Mach64, which is the graphics chip on my "backup" development system. You can work around this bug by starting the X server, which correctly initializes the graphics chip.
  • Some of the later applications I'm going to discuss are easiest to implement by pulling in external programs that require an X environment. You can get a much faster, more seamless end result by having the X server and window manager (see below) running at all times, although most of the time you'll bypass it.

So, what is X? Variously known as XWindows, the X Window System, X11 or just X, it is a highly platform-independent graphical environment. The principal interesting design goal of X11 is to separate the display subsystem (referred to as the "server") and the application needing display services (referred to as the "client").

Mistaking a client for a server

In general, client/server applications use a small, lightweight, client front end on the user's machine, and a big heavy server application on a back end somewhere far away. X reverses this, with the server being the program running on the desktop. This is because the X server is an interface server. This nonetheless confuses people sometimes, because the "client" is most often the more processor-intensive application.

If you're used to X, you probably already knew this. I mention it only because it seems to stay counterintuitive.

The X protocol as such only describes the raw mechanism for getting blocks of pixels displayed on a screen. To organize those pixels into a coherent GUI with functional radio buttons, checkboxes, windows with title bars and resize gadgets, and so forth, you need another piece of software called a window manager. The window manager handles presentation of windows and user interaction with those windows, and provides a consistent look and feel to different applications.

X toolkits

Because it's excruciatingly difficult to program X directly, most programs interact with toolkits that handle the low-level details in a consistent manner. (Note that in the Bad Old Days, this used to result in utterly incomprehensible user interface differences; no two X-based boxes looked or felt the same because of the disparity in window managers and toolkits. This problem has been ameliorated in recent years, mainly due to the rising popularity of specific desktop environments such as GNOME and KDE.)

All this makes X rather schizophrenic. The underlying design assumption is that your program and the GUI that's rendering its output are on opposite ends of a network connection. Such a design is neatly compartmentalized but horribly inefficient on a modern system where everything -- GUI code and application code -- is running inside a single physical address space on one machine. As a result, mouse holes of various sizes provide fast transport across the territory between the application and the graphics hardware. These range from simple shared memory areas to complicated multilayer hardware-access APIs like the XFree86 Direct Rendering Interface (see Resources).

Note that the X protocol does not describe any interface for non-graphical multimedia functionality, such as audio input and output. A separate piece of software, referred to as a sound server, handles this. For this example, I will not be using a sound server, since I know that my application is going to be running on the same machine where I want the audio to be heard.

Once you start getting into the specifics of X implementations on a given platform (like, for example, the version included with Yellow Dog), things become even more complicated, because there are all sorts of different ways you can be talking to X and all sorts of ways X can be talking to your hardware. For example, you can run X on top of the kernel framebuffer driver in such a way that X never actually touches the video control registers; it just uses framebuffer ioctls to find video memory and then writes to it.

At the other extreme, you can run a "native" X server that does all its own initialization of the graphics hardware. Each method has advantages and disadvantages, and I'll dig quite deep into those issues in the next article when I talk about movie support.

For now, all this information is by way of background. The example application is not going to use any X APIs; I need to run alongside X for the external dependency reasons I discuss above, but I'm not actually going to use X for anything directly. So, enough -- for the time being -- about X.



Defining the scripting language

When someone says "I want a programming language in which I need only say what I wish done," give him a lollipop. -- Alan Perlis

Now, what of the scripting language? Since we're talking here about something that is at least potentially a consumer appliance, the user interface and scripting system need careful thought. There's rather a lot of prior art in the field of digital picture frames, and most of it is not very good. (I can say this with such authority because I'm the engineer behind several fielded commercial digital picture frame products and a few unreleased prototypes; I was intimately involved with the product category for some five years and had the opportunity to test a lot of competitors' products.) Let me begin by giving you a thumbnail sketch of the design thinking behind user interfaces on these sorts of products.

Multimedia appliances can broadly be divided into three categories:

  • "Dumb" devices -- of diverse capabilities -- that exclusively play commercially produced media containing implicit execution instructions, or which use a paradigm that is intuitively understood by the average consumer. A DVD player is a good example of this: it plays preprogrammed media and uses a very familiar VCR metaphor for most of its operational functions. Although it is possible to create your own DVDs, of course, the design intent of the player itself doesn't encompass this functionality and its user interface is concomitantly simple.
  • "Smart, small" appliances that hold a library of user-supplied content and don't offer much flexibility in how that content is played back. These devices usually have a limited capability to display information to the user. They always have a very narrow range of program content options that can be programmed on the local user interface; an MP3 player or pocket video player is a good example of this category. These appliances might connect to a computer and use the computer as an enhanced user interface.
  • "Smart, big" appliances, usually with lots of storage for user-supplied content. These devices generally have the same order of magnitude of processing capability as a desktop computer, and might have fairly sophisticated controls allowing the user to set up complex content playback options.

My categories here sound terribly dogmatic but if you think about specific examples familiar to you, you'll realize that (with a few exceptions) the devices that work the best in consumer hands tend to fall into one of these pigeonholes. If you've been a gadget-hound for any length of time, you'll certainly have bitter memories of devices that try to cross categories and make a very poor job of it.

The most common design error -- not just in multimedia appliances, either -- is to take a category 2 appliance and squeeze category 3 functionality into it. You can always devise some arcane sequence of button presses and other events to achieve some sophisticated configuration feature, but it's also always irritating for the user to have to deal with these shoehorned user interfaces. Apple's iPod device is an excellent example of how choosing not to implement complex functionality on a physically small device leads to an elegant, simple, and commercially successful design. The iPod hardware is quite capable of running a PDA-like OS with enormously complicated applications on it, but it would be much harder to use than the simple hierarchical tree menu system that Apple chose to develop.

The converse case is to develop a category 3 appliance with lots of functionality and ample system resources, but make the configuration of most of this functionality inaccessible to the user. The exasperation arises here because the appliance is more than capable of running a user interface giving full access to its capabilities, but the manufacturers choose to force you to use some kind of external object -- usually custom content authoring software on a personal computer -- to access the full range of device features. Often, this step is added solely in order to provide cross-marketing opportunities. For instance, a family of products in this digital picture frame category required a live Internet connection, and could only be administered by tinkering with a back-end Web site. The reason for this was so that the frame always had guaranteed access to a fresh stream of advertising images in addition to the user-supplied images. There was no value added to the consumer, and it was a very irksome sort of product.

Design requirements

With this in mind, consider the goals of a scripting language that will fulfil the needs of a flexible, category 3 appliance of the type in this series:

  • The file format should be easily read, decoded, re-encoded from memory structure, and written back to storage media.
  • It should be extensible and general-purpose; you might need to provide special, currently unknown parameters when playing back some kinds of media.
  • It should support easy and rapid seeking from one entry to the next, and random-access seeking.
  • Ideally, it should be editable using a commonly available tool, without exotic programming knowledge.

Unfortunately, the best way of satisfying the need for simple random seeking is by using a binary file with a fixed record length. (I used this method in the first generation of consumer digital picture frames I developed.) The problem with this is that although binary files are easily read into memory, and it's easy to seek randomly within them (you just seek to a byte offset calculated as record number * record size), it's impossible to edit those files in a user-friendly way without writing custom software to do it, and extending the file format to handle new options involves tedious translations. In my previous life, I solved this issue by making the user write the script in a text editor and compiling the text into a temporary (volatile) binary file for easy random access at runtime. The reason for jumping through this hoop was that these scripts were permitted to be arbitrarily large.
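
The byte-offset calculation for a fixed-length record file can be sketched roughly as follows. The record layout (struct slide_record) and the function name read_record are hypothetical, invented purely for illustration; they are not taken from the products mentioned above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed-length record: every entry occupies the same number
   of bytes on disk, so entry N starts at N * sizeof(struct slide_record). */
struct slide_record {
    char filename[64];      /* NUL-padded image filename */
    unsigned int delay_ms;  /* how long to show the slide */
};

/* Read record number `index` from `fp`.
   Returns 0 on success, -1 on failure (bad index, seek or read error). */
int read_record(FILE *fp, long index, struct slide_record *out)
{
    assert(fp != NULL && out != NULL);
    if (index < 0)
        return -1;

    /* Byte offset = record number * record size. */
    long offset = index * (long)sizeof(struct slide_record);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;

    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

Jumping to any slide costs one seek and one read, which is what makes this format attractive at runtime. The price is that adding a field changes the record size and invalidates every existing file.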

For this appliance I'm going to assume arbitrarily that the script file will always be small enough to fit entirely into RAM. If you look at html.c, you'll see that the load-script function simply determines the size of the script file, allocates memory to hold it, then pulls the entire file in at once. In order to make on-the-fly parsing easier, the load function also strips out control characters.
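
As a rough sketch of that load step (not the actual code from html.c), the following function measures the file, allocates a buffer, reads everything in one call, and then drops control characters. The name load_script and its signature are assumptions made for this example. I also assume that "strips out" means removing the characters outright; the downloadable source may treat them differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Load an entire script from `fp` into a freshly allocated, NUL-terminated
   buffer, dropping control characters along the way. The caller frees the
   result. Returns NULL on any failure. (Hypothetical helper.) */
char *load_script(FILE *fp, size_t *out_len)
{
    assert(fp != NULL);

    /* Determine the file size by seeking to the end. */
    if (fseek(fp, 0L, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0 || fseek(fp, 0L, SEEK_SET) != 0)
        return NULL;

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;

    /* Pull the whole file in with a single read. */
    size_t got = fread(buf, 1, (size_t)size, fp);

    /* Compact the buffer in place, keeping only non-control characters. */
    size_t kept = 0;
    for (size_t i = 0; i < got; i++) {
        if (!iscntrl((unsigned char)buf[i]))
            buf[kept++] = buf[i];
    }
    buf[kept] = '\0';

    if (out_len != NULL)
        *out_len = kept;
    return buf;
}
```

With tabs, carriage returns, and line feeds gone, a parser can walk the buffer as one flat string without special-casing line endings.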

Note, by the way, that the limit of 16,384 slides imposed by the previous version of the slide show application still exists in this version; I've merely changed the way that 16,384-element structure gets populated.

For this appliance, I've decided to use a subset of HTML and a simple HTML parser. The reason for this is that you can use any HTML editor -- OpenOffice.org, for example -- to create the script file in a WYSIWYG manner.

The application -- which you should download and build at this point -- simply loads the file /web/script.html and attempts to play it. The parser I've implemented scans the script file for <IMG> tags using a very simplistic algorithm:

  1. Scan for the opening angle bracket.
  2. Wait for the characters "img" (case-insensitive, whitespace ignored).
  3. Wait for the characters "src" (case-insensitive).
  4. Wait for a quotation mark.
  5. Assume the characters from this point up to the next quotation mark are a filename to be loaded.
  6. Paths to image files are assumed to be relative to /web.
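
The steps above can be sketched in C along these lines. This is a minimal illustration, not the parser from html.c: the names next_image and starts_with_nocase are hypothetical, and the sketch makes no attempt to handle malformed tags or SRC values without quotation marks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive test: does `s` begin with the lowercase word `word`? */
static int starts_with_nocase(const char *s, const char *word)
{
    while (*word != '\0') {
        if (tolower((unsigned char)*s) != *word)
            return 0;
        s++;
        word++;
    }
    return 1;
}

/* Find the next <IMG ... SRC="..."> at or after `p`, copy the filename into
   `out` (truncating to fit `out_size`), and return a pointer just past the
   closing quotation mark. Returns NULL when no further image is found. */
const char *next_image(const char *p, char *out, size_t out_size)
{
    assert(p != NULL && out != NULL && out_size > 0);

    while ((p = strchr(p, '<')) != NULL) {
        p++;                                   /* step 1: opening bracket */
        while (isspace((unsigned char)*p))
            p++;                               /* whitespace is ignored */
        if (!starts_with_nocase(p, "img"))     /* step 2: "img" */
            continue;
        p += 3;

        while (*p != '\0' && !starts_with_nocase(p, "src"))
            p++;                               /* step 3: "src" */
        const char *open = strchr(p, '"');     /* step 4: quotation mark */
        if (open == NULL)
            return NULL;
        const char *close = strchr(open + 1, '"');
        if (close == NULL)
            return NULL;

        size_t len = (size_t)(close - open - 1);   /* step 5: filename */
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, open + 1, len);
        out[len] = '\0';
        return close + 1;
    }
    return NULL;
}
```

A caller would invoke next_image repeatedly, feeding the returned pointer back in, and prepend /web/ to each filename (step 6) before opening it.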

Observe that the syntax of the IMG tag supports any number of parameters for each image. HTML defines some of these; for instance you can specify WIDTH= and HEIGHT= to tell a browser to render the image with a specific geometry. You can also add practically any meta-information your slide show might need by means of custom tags.

Assuming you have two JPEGs called PIC001.JPG and PIC002.JPG, an example slide show file would be the following:


Listing 1. A sample slide show script
                
<HTML>
<BODY>
<IMG SRC="PIC001.JPG"><BR>
<IMG SRC="PIC002.JPG"><BR>
</BODY>
</HTML>

Note that the HTML, BODY, and BR tags are just syntactic candy to make the file palatable to a regular Web browser; the slide show program will run fine without them.

The other major feature I've added to this version of the slide show program is image scaling, an important feature for any digital photo album. The scaling algorithm implemented here is a simple decimation system. The code is a little tortuous to read because it uses integer arithmetic for speed reasons; I always find it easiest to visualize this type of algorithm in semi-algebraic terms.

Consider that you have a single line of an image which is m pixels wide, and which you want to scale to n pixels onscreen. Further consider that you have a source pointer s, pointed at the left-hand pixel of the source line, and a destination pointer d, pointed at the left-hand pixel of the target rendering line. Clearly, you only want to render each destination pixel once, so you are going to loop n times and increment d on each iteration of the loop. The source pointer has m/n added to it on each loop iteration. (That's the tricky part to do with integer arithmetic, by the way -- it's easy with floating-point, but not as fast.)
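
One common way to add m/n to a pointer using only integers is to split the step into its whole part (m / n) and its remainder (m % n), and carry the accumulated remainder into an extra pixel whenever it reaches n. The sketch below shows that technique for one line of 32-bit pixels. The function name scale_line and the pixel type are assumptions for this example, and the downloadable code may well do the arithmetic differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale one line of `m` source pixels to `n` destination pixels by
   nearest-neighbour sampling, using integer arithmetic only.
   Destination pixel i receives source pixel floor(i * m / n). */
void scale_line(const uint32_t *src, size_t m, uint32_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL && m > 0 && n > 0);

    size_t whole = m / n;   /* integer part of the step m/n */
    size_t frac  = m % n;   /* fractional part, in units of 1/n */
    size_t err   = 0;       /* accumulated fraction, always < n */
    const uint32_t *s = src;

    for (size_t i = 0; i < n; i++) {
        *dst++ = *s;        /* render each destination pixel exactly once */
        s += whole;
        err += frac;
        if (err >= n) {     /* fraction has carried into a whole pixel */
            err -= n;
            s++;
        }
    }
}
```

The same loop works in both directions. When shrinking, whole is at least 1 and source pixels are skipped. When enlarging, whole is 0 and source pixels are repeated. Applying the same stepping to rows gives two-dimensional scaling.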

So, you now have a rather more controllable slide show program. The next article shows you how to add support for movies. As part of this process, I'll delve into a little more detail of the love-hate relationship we currently have with X, and I'll enhance the script file format even more.



Downloads

The downloads for this article are being updated. Please try to download later.



Resources

Get products and technologies
  • Download the source code referenced in this article from the table above. The usual warning applies to Internet Explorer users - make sure to save the file as something.tar.gz!


About the author

Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development.



