My space in the world

Code is poetry

Posted by Sayak
November - 10 - 2008

An open source office suite needs some decent open and royalty-free cliparts. This tutorial presents valuable clipart repositories and an extension that lets you download graphics directly into OpenOffice.org documents.

Direct downloading of cliparts is possible thanks to the Clker.com OpenOffice.org addon. You will find it here. The addon integrates the clker.com online clipart collection with OpenOffice.org. The cliparts you choose and save to your clipart basket will appear in the extension window inside OpenOffice.org. Almost all cliparts are available as ODG, SVG and PNG files.

The Clker.com OpenOffice.org addon is available to download here. It has to be installed via the OpenOffice.org extension manager.

You may also visit the Clker.com website. It contains a massive collection of royalty-free cliparts, available to download in ODG (OpenOffice.org Draw), SVG and PNG formats.

If you prefer browsing cliparts in a web browser, downloading them to disk, and then loading the graphics into the OpenOffice.org applications, then you should definitely visit the homepages of these clipart collections:

Open Clip Art Library
www.openclipart.org

This project aims to create an archive of user contributed clip art that can be freely used. All graphics submitted to the project should be placed into the Public Domain according to the statement by the Creative Commons.

The royalty-free clipart library.

The Open Clip Art Library contains vast collections of cliparts (7,000 images), divided into categories, and submitted / uploaded / remixed by users. You may download all cliparts packed in a single ZIP file or gzipped tarball.

WP Clipart
www.wpclipart.com

WPClipart is a collection of high-quality public domain images specifically tailored for use in word processors and optimized for printing on home/small office inkjet printers. There are thousands of color graphic clips as well as illustrations, photographs and black and white line art.

Nearly all graphics are in the lossless PNG format. Currently the WP Clipart archive offers about 25,000 images.

OpenOffice.org User/Gallery and Clipart
documentation.openoffice.org/Samples_Templates/User/gallery/

A small but valuable clipart repository. Some files are available in ODG (OpenOffice.org Draw) format, others as SVG and GIF files.

The gallery also contains various original artwork submitted to the OOo Competition. It includes 3d Chinese Tea Set, 3d Vase with Flowers, and others.

And, if you wish to find more royalty-free cliparts, visit Clip-Art.com. The site contains more links to clipart libraries.

Posted by Sayak
November - 10 - 2008

First of all, I’ll start with some basic background on how CPython works: an overview of how Python programs run and how the Python virtual machine operates. Then I’ll touch on bytecode and the disassembler, and give an overview of the difficult design decisions that were made for CPython. Afterwards, I’ll look at how other implementations differ from CPython. I’ll start with Jython and IronPython, then JIT (Just In Time) compilation and the Psyco module. I’ll briefly review Shed Skin, which is a Python-to-C++ compiler, and also touch on the Parrot virtual machine. Finally, I’ll talk about Stackless Python, and after all that comes PyPy, which incorporates the best ideas from the other Python VM implementations.

As you probably already know, there is a growing number of Python distributions to choose from. The major implementations include not only the original implementation, called CPython, which is widespread and mature, but also younger implementations like Jython and IronPython, and perhaps the newest implementation, PyPy. PyPy is especially interesting because it incorporates many great ideas that have come up over the years in other Python implementations. PyPy version 1.1 just came out in September 2008 (1.0 in March 2007) and, given this milestone, it seems like a good time to take a look back at the history of the major Python implementations, to appreciate how they evolve and build on each other’s ideas, and also how they will continue.

CPython basics

So let’s go over some basics of how a Python program runs. Don’t panic, I like to be clear. If you already know how Python runs code, you can skip the next few lines. Let’s start with CPython. As I said earlier, CPython is the reference implementation of the Python language; you can get it from www.python.org. Its first release was in 1991 and the current version is 2.6. It is named CPython because the interpreter itself is written in pure C by Guido van Rossum. So when you are running a Python program, you are actually running a C program which interprets your Python program.

Your Python source code is first compiled into an intermediate form called bytecode, and that bytecode is then interpreted by what’s called the Python virtual machine. If you are familiar with Java, you can see the similarity with Java bytecode and the Java virtual machine. It is not exactly the same bytecode, but it is very similar. Why is bytecode used? Well, using bytecode speeds up execution, since bytecode is more compact and easier to interpret and manipulate than the original Python source code. But bytecode is not to be confused with machine code, such as machine code for x86 processors. Bytecode is a higher-level code that is specific to the Python VM.

So now we have bytecode feeding the virtual machine. Basically, the VM is a big loop. It fetches the next bytecode instruction and examines it to determine which C function has to be executed to implement that instruction. Each bytecode instruction represents an operation on the Python virtual machine’s internal data structures at the C level. Pretty abstract, isn’t it? You can watch what the VM is doing with Python’s disassembler module, dis.

Here is a short example. First I define a simple function:

>>> def square(x):
...     return x*x

And this is the result:

>>> import dis
>>> dis.dis(square)
  2           0 LOAD_FAST                0 (x)
              3 LOAD_FAST                0 (x)
              6 BINARY_MULTIPLY
              7 RETURN_VALUE
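
To make the “big loop” idea concrete, here is a minimal sketch of a toy stack machine that executes the same four instructions as the listing above. This is not CPython’s real evaluation loop (that one is written in C and handles many more opcodes); the run function and the tuple-based instruction format are made up for illustration.

```python
# Toy stack machine: an illustration only, NOT CPython's actual loop.
def run(bytecode, local_vars):
    stack = []
    for opname, arg in bytecode:
        if opname == "LOAD_FAST":
            # Push the value of a local variable onto the stack.
            stack.append(local_vars[arg])
        elif opname == "BINARY_MULTIPLY":
            # Pop two operands, push their product.
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opname == "RETURN_VALUE":
            # Pop the result and leave the loop.
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %s" % opname)


# The same instruction sequence that dis printed for "return x*x".
program = [
    ("LOAD_FAST", "x"),
    ("LOAD_FAST", "x"),
    ("BINARY_MULTIPLY", None),
    ("RETURN_VALUE", None),
]

print(run(program, {"x": 7}))  # prints 49
```

Every instruction either pushes something onto the stack or pops operands off it, which is all “stack based” really means.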

So we can say that the CPython VM is a stack-based machine. The bytecode looks very similar to assembly language for a CPU. Enough with the details. Let’s summarize bytecode:

  1. Python bytecode instructions manipulate objects, not raw machine values. There are no register-level loads, stores or jumps as in CPU machine code.
  2. Python is a dynamic language, so there is no C equivalent for some bytecode instructions, such as build class (class) or make function (def). These two instructions tell the interpreter to make these objects on the fly while the program is running.
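
The second point is easy to see from plain Python. The sketch below (written for Python 3; make_multiplier and Point are names invented for the example) shows that def and class are ordinary statements that build new objects each time they execute.

```python
def make_multiplier(factor):
    # The inner "def" is executed every time make_multiplier runs,
    # so each call builds a brand-new function object on the fly.
    def multiply(x):
        return x * factor
    return multiply


double = make_multiplier(2)
triple = make_multiplier(3)
print(double(5), triple(5))  # prints 10 15
print(double is triple)      # prints False: two distinct function objects

# Classes are built at run time too; type() does by hand what a
# "class" statement does behind the scenes.
Point = type("Point", (object,), {"dims": 2})
print(Point.dims)            # prints 2
```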

Next, although CPython supports all the flexibility of the Python language, its internal design is not as flexible as it could be. In the design phase, a few decisions had to be made that are now baked into every version of the CPython code. Examples are garbage collection (it is not easy to plug in new memory management algorithms) and the threading model (CPython uses what is called the Global Interpreter Lock, or GIL, which keeps internal data structures coherent when multiple threads are running simultaneously).

Note: the GIL is the reason why CPython can’t use the full potential of today’s multi-core processors for multi-threaded applications.

Rewriting this C code base is, well, practically impossible because the code is old, huge and tricky to maintain. Let’s ask ourselves a question: “Can we ever create a new distribution to address the weaknesses of CPython?”
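
If you want to see this for yourself, here is a rough sketch (Python 3) that runs the same CPU-bound work once sequentially and once split across two threads. The function names and the loop size are arbitrary choices for the example. On a CPython build with the GIL you would typically expect the threaded run to be no faster, but timings vary by machine and interpreter, so no numbers are shown.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound busy loop.
    while n > 0:
        n -= 1


def run_sequential(total, workers):
    # Do all the chunks one after another in the main thread.
    for _ in range(workers):
        count_down(total // workers)


def run_threaded(total, workers):
    # Do the same chunks, one thread per chunk.
    threads = [
        threading.Thread(target=count_down, args=(total // workers,))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def timed(fn, *args):
    # Return the wall-clock time fn(*args) took, in seconds.
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    total = 2000000
    print("sequential: %.3fs" % timed(run_sequential, total, 2))
    print("threaded:   %.3fs" % timed(run_threaded, total, 2))
```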

The others

Well, the short answer is: “Yes”. Many have tried and many have succeeded. We’ll talk about those distributions next.

Jython

Jython is a Python implementation that allows you to run Python programs within a Java environment. It was originally created by Jim Hugunin in late 1997. He found that Java could be as fast as C for simple numerical benchmarks, and he also discovered that it was easy to translate Python to Java by hand.

What exactly is Jython? Jython is a set of Java classes that compiles Python source code to Java bytecode, which then runs on a Java Virtual Machine. Using the Java Virtual Machine for Python has many advantages.

  1. Since Python and Java are using the same virtual machine, it is very easy to import and use Java classes in Python.
  2. Using the JVM allows Jython to benefit from all the work that has gone into improving and tuning the Java VM. For example, Jython can use Java’s garbage collector and the JVM’s existing threading library, and there is no GIL, so no restriction on multi-core processors.
  3. It is not necessary to reimplement things like exception handling, libraries and other services that the JVM provides.
  4. You can also use HotSpot optimizations.

So, Jython is a more natural fit for Python than CPython, because Java is a fully object-oriented language, whereas C is not. But there are some disadvantages to using Jython. Jython runs slightly slower than CPython. At this time, it is recommended to use version 2.2.1 (even though 2.5a3 is available); 2.2.1 is approximately equivalent to CPython 2.2. Unfortunately, Jython is not compatible with the C-based extension modules written for CPython.

Most people argue that because the JVM was designed for a non-dynamic language, a dynamic language like Python does not work well on it. This is only slightly true. Obviously Jython works, but Sun Microsystems also says that they are working to extend the JVM to provide stronger support for dynamic languages. In fact, over roughly the last two years they have taken up the new JSR 292 (which adds a new bytecode, invokedynamic), which deals with dynamic language support in the JVM. For more information see this. One great example of Java’s dynamic language support is Groovy.

Let’s take a short look at the Java HotSpot VM, which speeds up Java execution. It is Java’s combination of JIT (Just In Time) compilation and adaptive optimization. These two techniques are very useful for dynamic languages like Python.

How it works: while interpreting the bytecode, the Java VM watches for “hotspots”, that is, frequently executed sections of bytecode. These hot segments are compiled “just in time” by the compiler into machine code while the program is running. This code is cached, so next time it doesn’t need to be recompiled. Could we use this technique for Python as well? Yes, we can, with the module called Psyco.
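
The counting-and-caching idea described above can be sketched in a few lines of Python. This is only a toy: HotspotRunner, HOT_THRESHOLD and the “compile” step are invented for the example, and the “compiled” version is just a hand-written faster function (valid for n >= 0), not real machine code.

```python
HOT_THRESHOLD = 3  # hypothetical: calls before a function counts as "hot"


class HotspotRunner:
    def __init__(self, slow_impl, compile_fast):
        self.slow_impl = slow_impl        # the "interpreted" version
        self.compile_fast = compile_fast  # builds the "compiled" version
        self.calls = 0
        self.fast_impl = None             # cache for the compiled version

    def __call__(self, *args):
        if self.fast_impl is not None:
            # Already compiled: use the cached fast path.
            return self.fast_impl(*args)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # Hot spot detected: "compile" once and cache the result.
            self.fast_impl = self.compile_fast()
        return self.slow_impl(*args)


def interpreted_sum(n):
    # Slow path: add up 0 .. n-1 one step at a time.
    total = 0
    for i in range(n):
        total += i
    return total


def compile_sum():
    # Stand-in for a JIT compiler: returns an equivalent closed form.
    return lambda n: n * (n - 1) // 2


summer = HotspotRunner(interpreted_sum, compile_sum)
results = [summer(10) for _ in range(5)]
print(results)                       # prints [45, 45, 45, 45, 45]
print(summer.fast_impl is not None)  # prints True
```

The first three calls go through the slow path; from the fourth call on, the cached fast version is used and the answers stay the same.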

Psyco

In 2001, two years after Java HotSpot technology came about, a team led by Armin Rigo started the project called Psyco. It is an open-source project with the goal of adding a JIT to CPython. What it does is emit machine code on the fly instead of interpreting the Python program step by step. Once the machine code is generated, it is cached and run directly rather than as interpreted bytecode. The benefit here is that the program runs faster, between 2x and 100x depending on what you are doing. The typical speedup is 4x. The 100x increase is seen more in algorithmic code, like tight loops. The disadvantages are large memory usage and the fact that Psyco runs only on i386-compatible processors.

If you are interested, look at the Psyco homepage.

If you are interested in Jython, please visit the homepage.
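
Using Psyco takes very little code. The sketch below shows the usual pattern of enabling it when it is installed and falling back silently when it is not. psyco.full() is Psyco’s own call; the HAVE_PSYCO flag and sum_of_squares are just names for this example. On an interpreter without Psyco the function simply runs as ordinary bytecode.

```python
try:
    import psyco        # only available for 32-bit x86 builds of CPython 2.x
    psyco.full()        # ask Psyco to compile every function it can
    HAVE_PSYCO = True
except ImportError:
    HAVE_PSYCO = False  # no Psyco here: plain interpretation


def sum_of_squares(n):
    # A tight numeric loop, the kind of code where a JIT helps most.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(1000))  # prints 332833500
```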

IronPython

Jim Hugunin, yes, the same person who created Jython, then moved on to Microsoft. There, he is using his experience with Jython to create another Python distribution called IronPython. IronPython allows Python code to run on Microsoft’s VM, the CLR (Common Language Runtime). It is similar to the JVM, but not exactly the same. It provides common services for all the languages that it hosts, for example memory management, exception handling, threading support and security.

Also, Microsoft decided to add a special feature for dynamic languages, called the dynamic method class.

If you are interested in IronPython, you can check the homepage.

Shed Skin

As I mentioned above, Shed Skin is a Python-to-C++ compiler. But it’s hard to deal with dynamic runtime information after the program is compiled. For example, Python doesn’t declare variable types, but C++ does. So Shed Skin uses a type inference algorithm to work out the types of variables. Other disadvantages are that variables can’t change type after compilation, and not all Python features are supported.

Although it is an experimental project, it shows that it is possible to run Python programs 2x to 40x faster than with Psyco and 2x to 220x faster than with CPython. It’s also interesting to know how much “just in time” optimization is possible.

If you are interested, go to the homepage.
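
To illustrate the difference, here are two small functions, both valid in CPython. They are my own examples, not taken from the Shed Skin documentation. In the first, every variable keeps a single type, so a type inferencer has an easy job; the second retypes a variable, which is the kind of code the restriction above rules out.

```python
def mean(values):
    # Every variable has one type throughout:
    # values is a list of floats, total and the result are floats.
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def retyped():
    # Fine for an interpreter, but x is first an int and then a str,
    # so there is no single static type to assign to it.
    x = 1
    x = "one"
    return x


print(mean([1.0, 2.0, 3.0]))  # prints 2.0
print(retyped())              # prints one
```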

Parrot

This project started out in the Perl community as a joke: Larry Wall and Guido van Rossum would merge Perl and Python together, and the merged language would be called Parrot. Of course, since it was a joke, the merge never happened. This virtual machine is currently a Perl 6 virtual machine written in C. It can host more than Perl 6, with support for Tcl, Python, Ruby, JavaScript and Scheme, among others. What this means is that many languages can be interpreted by the same VM and, as with Jython, the languages can cooperate.

Homepage is located here.

Stackless Python

This Python distribution is designed to handle massive concurrency; it can run thousands of lightweight threads simultaneously. This is very useful for simulations and games, for example. Stackless is used extensively to provide concurrency in the EVE Online multiplayer game, and it is also used in Civilization IV and in IronPort’s mail platform. Second Life is also beginning to use it.

If you are interested, check the homepage.
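
Stackless has its own API for these lightweight threads, which I won’t reproduce here. As a rough stand-in, the sketch below uses ordinary Python generators and a round-robin scheduler to show the general idea of many cheap, cooperatively scheduled tasks. The names worker and run_round_robin are invented for the example.

```python
from collections import deque


def worker(name, steps):
    # Each yield hands control back to the scheduler.
    for i in range(steps):
        yield "%s step %d" % (name, i)


def run_round_robin(tasks):
    # Give every task one step in turn until all of them are finished.
    queue = deque(tasks)
    log = []
    while queue:
        task = queue.popleft()
        try:
            log.append(next(task))
        except StopIteration:
            continue            # task finished: drop it
        queue.append(task)      # not finished: back of the line
    return log


log = run_round_robin([worker("a", 2), worker("b", 3)])
print(log)
# prints ['a step 0', 'b step 0', 'a step 1', 'b step 1', 'b step 2']

# Thousands of such tasks are cheap, since none of them is an OS thread.
many = run_round_robin([worker("t%d" % i, 1) for i in range(10000)])
print(len(many))  # prints 10000
```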

PyPy

Wouldn’t it be great if there was a single distribution that could provide everything the distributions mentioned above offer, and more? Well, we need an interpreter which can run on every VM mentioned and is easier to maintain than interpreters written in C. You can probably guess that I’m hinting at what PyPy does today.

PyPy is an open-source project which was started back in 2003 by Armin Rigo (creator of Psyco) and Christian Tismer (creator of Stackless Python). PyPy is constructed from various components:

  • A Python interpreter, written in Python.
  • A set of tools, also written in Python. This set allows PyPy to target different VMs.

One of the goals of PyPy is very fast execution, so a JIT is also included. It is also fairly compatible with CPython, up to version 2.4.1. It isn’t mature enough yet for daily use, even though it passes around 98% of CPython’s core language regression tests.

In my opinion, it is a very interesting fact that the interpreter is written in the same language that it interprets. This sounds pretty weird, doesn’t it? You can run a Python interpreter on a Python interpreter 🙂 But it is a proven technique for large projects. PyPy can be run on all the Python interpreters that I mentioned, but it’s very slooow.

I have to mention a few other interesting features that PyPy has. Besides the JIT and stackless features (coroutines, see the Stackless homepage for more info), PyPy provides sandboxing. Sandboxing features are very useful for increasing the security of applications. The application can run fully virtualized. Or you can define proxy objects and create something like an internal application firewall. You can also use the PyPy translators, which can translate bytecode to C, Java or Prolog, for example :)

If you are interested in PyPy, head over to the homepage for more information. They have great documentation, I must say.

Posted by Sayak
November - 10 - 2008

Undoubtedly you’ve heard the old cliché that Windows is easier to maintain because it has GUI tools for everything while Linux requires command lines and a terminal. Any experienced Windows administrator knows the point-and-click GUI tools don’t cover everything. Likewise, any experienced Linux administrator knows there are many GUI tools for Linux configuration, but terminal shells are available on ANY system regardless of how big or small, and the ability to script any action in a platform-neutral way is too useful to give up. I just again encountered a situation on XP that required a command-line fix, and it highlights the ignorance of many fanboys about the reality of Windows system administration.

I recently installed Windows XP Pro from scratch on a dual-boot system. I normally install Windows first as it doesn’t play well with other OSes when it comes to the boot loader. I was using an original XP OEM CD with SP1 integrated. After installation I copied over SP3 and installed that as well. Doing it this way reduces the number of update/reboot/update cycles I have to go through with Windows Update (WU) and reduces the risk of an exploit before the process is complete. After rebooting, I ran Windows Update and went through the usual Windows Update update, Installer update, activation, and WGA check. I then installed all of the critical updates. There were a surprisingly large number of them considering I already had SP3. I rebooted, ran WU again, and installed the .NET Framework runtimes, IE7, Media Player 11, and more updates for the updates. Then I rebooted and went back to WU once more to install more updates for the updates and everything else I had just added. Or at least I tried to, as they refused to install, reporting “failed” for all of them. I went through the typical diagnostics Windows admins have learned over the years: deleting temp files, clearing the browser settings, and attempting to install each update individually, all to no avail. Some Google searching turned up a blog posting about Wups2.dll not being registered properly if the system is updated through WU and not rebooted before SP3 is installed (KB943144). Of course this doesn’t explain my situation, as WU hadn’t been used before the service pack was installed. The workaround requires stopping the WU service, manually registering the DLL from a command window, and restarting the service. This fixed my problem.
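
For the record, the three steps of that workaround can be scripted. The sketch below builds the command sequence in Python and only prints it by default. It assumes the Windows Update service is named wuauserv and that the DLL lives under C:\Windows\system32; neither detail is spelled out above, so check them against KB943144 before running anything for real.

```python
import ntpath
import subprocess

SERVICE = "wuauserv"  # assumed name of the Windows Update service


def wups2_fix_commands(windir=r"C:\Windows"):
    # Build the stop / register / start sequence described above.
    # The default windir is an assumption; adjust it for your system.
    dll = ntpath.join(windir, "system32", "wups2.dll")
    return [
        ["net", "stop", SERVICE],
        ["regsvr32", dll],
        ["net", "start", SERVICE],
    ]


def apply_fix(dry_run=True):
    # With dry_run=True (the default) the commands are only printed.
    for cmd in wups2_fix_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)  # needs an administrator prompt


apply_fix()
```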

This isn’t an unusual repair process for a Windows system. Even for Vista there are plenty of examples of command window (cmd.exe) and regedit repair instructions in Microsoft’s support pages. You can ignore all the myths floating around the Internet about never having to use Unixy command lines when administering Windows systems because of the wonderful graphical tools. On Windows there are many tasks that are impossible to perform with graphical tools or are just a lot easier from a command window. The only way to avoid command line tools or regedit entirely is to write a custom graphical tool that handles those specific situations (similar to “compiling a kernel” comments from Microsoft fanboys). The fanboys will point out that regedit is a graphical tool but the reality is that it isn’t much more of a “tool” than Notepad (which was used in the pre-registry days with win.ini and system.ini). An IT manager that hires an admin that doesn’t know how to use regedit or command-line tools should themselves be replaced. When screening job applicants I’ve encountered many “certified” admins that didn’t know anything about maintenance outside of the graphical tools (or even basic hardware troubleshooting for that matter). Surprisingly, I’ve also worked with software engineers that had a paranoid fear of even regedit. It’s like they’ve been brainwashed into thinking that the only “proper” way to work with the registry is to use an API and approved function calls. Apparently they haven’t experienced the “fun” of trying to remove auto-starting malware entries from it.

Because of the emphasis on graphical tools, the skill of working at a low level with the Windows OS is a dying art. While graphical tools lower the barrier to entry into system administration, they also invite fools (with only superficial skill) to enter (and get certified) without the low-level skills valuable for troubleshooting. Graphical tools provide them a flower-strewn path to anywhere they want to go, but when a situation calls for them to go off the path they are lost, much to the pleasure of seasoned consultants who will guide them back to safety for a hefty fee. System administrators are not the types of users that recovery disks were intended for, but unfortunately a lot of amateur admins rely on them.

The fundamental limitation of graphical tools is that trying to design an interface for every conceivable configuration option, troubleshooting situation, and maintenance function ends up making the tool more complex and time-consuming to use than the task itself. There are occasions when GUIs are easier than command lines, but it’s usually a situation involving an over-complicated design of the underlying system rather than a practical improvement in efficiency. The hierarchical structure and relationship of keys and values in the Windows registry is relatively simple, but the file format makes regedit a necessity. Typing a registry key path into a command-line application like reg.exe, especially one that includes a GUID, is painful. On Linux you can experience a similar difficulty when trying to work with an XML configuration file like the one pam_mount now uses.

Graphical configuration tools like regedit are not unique to Windows. Gconf-editor provides a similar interface to the Gnome GConf settings database. But the terminal isn’t going away anytime soon as it’s too powerful, and even on Windows the DOS-derived command window is still present. Windows admins have learned to live with its limitations, switched to higher-level programming languages, or extended it with third-party utilities like KiXtart (which I’ve used). The Windows PowerShell is Microsoft’s attempt to replace this last remnant of the DOS era and its legacy syntax. This may be their admission of the limitations of a GUI, or it may just be a response to the popularity of headless systems in data centers and the need for a replacement for a 20-year-old shell. I haven’t tried PowerShell myself as I moved away from the Windows platform before it was released. With the availability of virtualization I now just use Windows as a bloated runtime for legacy applications and I don’t need to do scripting anymore (although I’ll admit to playing with batch files in FreeDOS once in a while).

Posted by Sayak
November - 10 - 2008

“Google’s actions in this case appear to fly in the face of what Android is all about. Android is open source. It is available to anyone who chooses to download it. Doesn’t that mean tinkering with the code is welcome, nay, expected? It turns out, it isn’t.”

Complete Story

Posted by Sayak
November - 10 - 2008

Wibrain UMPC – The B1L runs on Ubuntu. It has a 1.2GHz VIA C7M ULV processor, 60GB HDD, 1GB DDR2 RAM and a 4.8-inch LCD display with a 1024 x 600 pixel resolution, not to mention the QWERTY keyboard and direction keys for gaming.

About Me

A son, brother and friend. Enjoys scripting and making small bits of apps here and there. Wants to conquer the world (well, who doesn’t). A geek who has an obsession for ponies. Loves acoustic and wants to play guitar sitting on the Hollywood hill one day!
