Python

Introduction to Python

Python is a multi-platform, open source, general-purpose, object-oriented programming language in the same family as Java, C++, Tcl, and Perl, with a particular affinity for text processing. It maps well to the thought process and abilities of the modern programmer. Python has minimal syntax and is interpreted, readable, and highly scalable. It is easy to use and to maintain, and it significantly increases productivity.

Python is also useful for writing scripts that are embedded in server-side technologies such as Microsoft ASP to generate dynamic Web pages. Python is highly portable: its statements can be interpreted on many operating systems, including Unix and its variants, Mac, and Windows. Python is freely available in both source and compiled form for nearly every hardware platform. A major difference from other languages is that Python uses the indentation of lines of code to work out how the lines should be grouped together for processing. Python is highly interactive, and its string slicing feature makes it particularly good for XML processing.

The standard implementation of Python is written in C and has incorporated many robust features of C. Execution of Python programs is similar to Java: programs are compiled into an intermediate representation known as Python bytecode, which is then executed by the Python virtual machine. The standard implementation uses reference counting as its main memory management scheme, supplemented by a garbage collector for reference cycles.
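The bytecode and reference counting described above can be observed from within a program. The following is a minimal sketch that assumes the standard C implementation of Python; the function add and the variable names are illustrative only, and other implementations may behave differently.

```python
import dis
import sys


def add(a, b):
    return a + b


# The compiled bytecode lives on the function's code object.
raw_bytecode = add.__code__.co_code
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

# Reference counting: binding a second name to the same object raises its count.
data = []
count_before = sys.getrefcount(data)
alias = data
count_after = sys.getrefcount(data)
```

Calling dis.dis(add) prints a readable listing of the same instructions; the exact instruction names vary between Python versions.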

Python is a dynamically typed language, that is, its variables are not restricted to referencing objects of any particular type by the language.

Scripting languages are traditionally used for small tasks, gluing other programs together, and automating processes. Python is unusual in its scalability, from small scripts to very large systems. Python incorporates the best features of a variety of other languages, and introduces a few new ideas of its own, into a remarkably elegant, cohesive, and useful whole. Python can help get the job done more quickly and simply.

Python Features

Python's features are more inclined toward programmer efficiency and efficacy, sometimes at the expense of program efficiency. That is, Python programs are faster to write, but sometimes slower to run.

Python is a very high level language (VHLL). Python programs are a high level representation of a programmer's ideas. Low-level housekeeping details like memory allocation and reclamation are handled by Python itself, so developers need not worry about them. High level means seeing the big picture and delegating the details. Python's clean syntax reduces clutter. The interpreter understands program structure by examining the physical structure of the code; block delimiters are not required. Modules and classes provide accessible abstraction, encapsulation and modularity.

Python programs are interpreted, so the development cycle has only two steps: edit and run. Compiled languages like C normally go through an edit, compile, link and run cycle. With dynamic loading and some planning, the cycle can become run, edit and reload the edited part into the running process without stopping. The Python interpreter reads source files called modules, which usually have the extension .py, converts them to portable bytecode, and executes them by interpreting the bytecode, much like Java does. Python is sometimes referred to as a scripting language.
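The run, edit and reload cycle can be sketched with the standard importlib module. This is only an illustration: the module name greeting_demo and its contents are hypothetical, and the edit is simulated by rewriting the file from the same script. Bytecode caching is switched off so that a stale cache cannot mask the edit.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid a cached .pyc file hiding an edit made within the same second.
sys.dont_write_bytecode = True

# Create a tiny module on disk (hypothetical name and contents).
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "greeting_demo.py"
module_path.write_text("MESSAGE = 'hello'\n")

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import greeting_demo

first = greeting_demo.MESSAGE

# "Edit" the source while the process keeps running, then reload it.
module_path.write_text("MESSAGE = 'hello, edited version'\n")
importlib.reload(greeting_demo)
second = greeting_demo.MESSAGE
```

Objects created before the reload keep referring to the old definitions, which is why the text above says this approach needs some planning.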

Python code is clean, simple and powerful. It is remarkably easy to write, read and maintain. Python programs can be read and understood by non-Python programmers and even non-programmers. Python also makes an excellent teaching language, as it is relatively free of the details that make other languages difficult for neophytes to learn. Python's syntax is minimalist. It lacks a lot of the noise characters that other languages suffer from, like semicolons at the end of each statement, special dereferencing symbols and braces or begin/end for nested code blocks. Instead of start/end indicators such as { and }, begin/end, if/fi, case/esac and for/done, as used in other programming languages, the hierarchy of blocks of code in Python is indicated by each block's indentation level. Despite its simplicity, Python's syntax is also very powerful. An equivalent task will often require far less Python code than C code. Fewer lines of code also mean fewer bugs, greater programmer productivity and far less frustration. Python began as a minimalist language, and change to the core language syntax has been planned and deliberately slow. Future versions of Python may even reduce the complexity of the language.
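As a small illustration of indentation-based blocks, the sketch below nests an if/else inside a for loop inside a function, with no braces or begin/end markers. The function name classify is purely illustrative.

```python
def classify(numbers):
    """Split numbers into evens and odds; nesting is shown by indentation alone."""
    evens, odds = [], []
    for n in numbers:
        if n % 2 == 0:
            evens.append(n)
        else:
            odds.append(n)
    return evens, odds


evens, odds = classify([1, 2, 3, 4])
```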

Python is object-oriented. In Python, everything is an object. The implementation of classes and objects is much simpler in Python. The simplicity makes OOP accessible, laying bare the essential nature of object orientation without complicated trappings that get in the way of the concept.
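A minimal sketch of how little ceremony a class needs, and of the claim that everything is an object. The class Point and its method are illustrative names, not part of any library.

```python
class Point:
    """A minimal class: data plus the functions that act on it."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


p = Point(3, 4)

# Numbers, strings, functions, classes and instances are all objects.
all_objects = all(isinstance(thing, object) for thing in (42, "text", len, Point, p))
```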

In Python, variables are dynamically typed. That is, variables are created when they are assigned a value, and their type depends on the data they contain. An existing variable can be assigned a new value of a different type. With care and understanding, this feature leads to impressive programming power. Used carelessly, dynamic typing can be a major source of program bugs.
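The following sketch shows one name bound to values of three different types in turn, and then a typical way dynamic typing can hide a bug. The names value and double are illustrative.

```python
value = 10
kinds = [type(value).__name__]

value = "ten"            # the same name now refers to a string
kinds.append(type(value).__name__)

value = [1, 0]           # and now to a list
kinds.append(type(value).__name__)


def double(x):
    # Works for any type that supports "*", which is both power and risk.
    return x * 2


numeric_result = double(21)      # 42
careless_result = double("21")   # "2121": repetition, probably not what was meant
```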

Python is blessed with a large standard library. Python's core language is deliberately small and provides relatively little functionality by itself. However, the standard Python distribution includes a large number and variety of modules, ready and easy to use in our own programs. Library modules cover string operations, regular expressions, file and operating system access, threads, sockets, database access, Internet protocols and even access to Python's parser internals.
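As a brief taste of the standard library, the sketch below uses the re module for regular expressions and os.path for file name handling. The sample strings are made up for illustration.

```python
import os.path
import re

log_line = "port 8080, retries 3"

# Regular expressions: pull every run of digits out of a line of text.
numbers = re.findall(r"\d+", log_line)

# Operating system access: split a file name into stem and extension.
stem, extension = os.path.splitext("report.xml")
```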

Python is multi-platform. It has been ported to almost every hardware platform and operating system available, including Linux, Unix, MacOS, and Windows. General-purpose modules, if properly coded, will run on multiple platforms without any modification.

Python has multiple implementations. The standard implementation is written in C. There is also JPython (since renamed Jython), which is written entirely in Java. It runs on Java platforms and provides seamless integration with Java classes. In addition, Stackless Python is available. It removes the dependence on the C stack, enabling Python to run in tight quarters, such as on handheld devices.

Python is scalable. That is, simple 10-line scripts are easy to write and read, and large systems with thousands of lines of code are feasible and maintainable in Python. Python implements encapsulation at multiple levels. Functions encapsulate reusable program code, classes combine data with their associated functions, modules contain related classes and functions, and packages encapsulate systems. If native Python code is too slow or cannot access low-level functionality, extension modules can be written in C. If we have an application that needs to be scriptable, Python can be embedded.

Python is open source. It is completely free for use, modification, redistribution, and commercial use. Python's source code is freely downloadable.

Python for XML

Python offers ready-made classes and interfaces to many system calls and libraries. Python also provides a rich assortment of data structures. The three that occur most frequently, especially in XML processing, are the string, the nested list, and the dictionary. There are a number of exciting utility programs developed using Python. Below, I have discussed some select utilities for XML processing. A set of special notations called PYX is available to make processing easier. Because an XML file is a text file, it lends itself to this kind of line-oriented processing.
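To show how the three data structures fit XML, here is a minimal sketch that holds an element as a nested list of tag, attribute dictionary and children, and uses string slicing to pull a tag name out of markup. The layout and the function name to_xml are assumptions made for illustration; the sketch does not escape special characters.

```python
def to_xml(node):
    """Serialize a [tag, attributes, children] nested list to an XML string."""
    if isinstance(node, str):
        return node  # plain text child; no escaping in this sketch
    tag, attrs, children = node
    attr_text = "".join(f' {key}="{val}"' for key, val in attrs.items())
    body = "".join(to_xml(child) for child in children)
    return f"<{tag}{attr_text}>{body}</{tag}>"


# Nested list for structure, dictionary for attributes, strings for text.
book = ["book", {"id": "42"}, [["title", {}, ["Python"]]]]
xml_text = to_xml(book)

# String slicing: strip the angle brackets from a start tag.
tag_name = "<title>"[1:-1]
```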

An XML file can be converted into PYX notation using the non-validating XML parser (XMLN) or the validating XML parser (XMLV). The PYX output can be passed to gawk, the GNU version of awk, for further processing. The pyx2xml utility program converts PYX notation back to XML.
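The idea behind the XML to PYX conversion can be sketched with the standard xml.sax module. This is not the XMLN or XMLV tool itself. It assumes the commonly described PYX conventions, where each line starts with "(" for a start tag, "A" for an attribute, "-" for character data, ")" for an end tag and "?" for a processing instruction; the class and function names are hypothetical, and escaping is simplified to newlines only.

```python
import xml.sax


class PyxHandler(xml.sax.ContentHandler):
    """Collect one PYX-style line per SAX event (simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        self.lines.append("(" + name)
        for key in attrs.getNames():
            self.lines.append(f"A{key} {attrs.getValue(key)}")

    def endElement(self, name):
        self.lines.append(")" + name)

    def characters(self, content):
        # Keep each event on one line by escaping embedded newlines.
        self.lines.append("-" + content.replace("\n", "\\n"))

    def processingInstruction(self, target, data):
        self.lines.append(f"?{target} {data}")


def xml_to_pyx(xml_text):
    handler = PyxHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


pyx_lines = xml_to_pyx('<doc id="1"><p>Hello</p></doc>')
```

Because every event sits on its own line with a one-character prefix, line-oriented tools such as awk or grep can filter the result without understanding XML.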

Python has built-in regular expression support. Python has a standard library module called xmllib for simple XML processing using regular expressions. There is a useful utility program, xgrep, that can be used to find patterns in text files by using regular expressions. xgrep is also XML-aware. That is, xgrep is capable of working with PYX notation generated from XML files by the XML parsers. xgrep has recently been enhanced to answer certain queries. Apart from this, there are a number of valuable modules, such as getopt, which add power to XML processing. Python is good for both SAX- and DOM-style XML processing. A utility called saxshow is also available in Python for describing the order in which SAX events are called while processing an XML document. xmlproc is a validating XML parser written in Python.
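The difference between the two processing styles can be sketched with the standard xml.sax and xml.dom.minidom modules. The event log is in the spirit of what the text says saxshow reports, but it is not that utility; the document and class name are made up for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = b"<library><book><title>A</title></book><book><title>B</title></book></library>"


class EventLog(xml.sax.ContentHandler):
    """SAX style: react to events as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("start " + name)

    def endElement(self, name):
        self.events.append("end " + name)


log = EventLog()
xml.sax.parseString(DOC, log)

# DOM style: load the whole document into a tree, then query it.
dom = minidom.parseString(DOC)
titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]
```

SAX keeps memory use low because nothing is retained unless the handler stores it, while DOM trades memory for the convenience of random access to the tree.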

There is a template Python SAX driver for SQL, which provides a powerful and flexible way to map MySQL data to XML. This driver program makes MySQL look like just another XML parser from an application developer's perspective.

Another interesting Python utility is xMail, which converts email messages to XML files. Once an email is in XML format, it can be processed in a variety of ways using any XML-aware database, editor, or search engine. It can also be subjected to SAX- or DOM-style processing. One useful form of processing is to send email from this XML notation. This is accomplished by a utility named sendxMail.
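The idea of turning an email message into XML can be sketched with the standard email and xml.etree.ElementTree modules. This is not xMail and does not reproduce its output format: the element names message, headers, header and body are assumptions, and only simple single-part messages are handled.

```python
from email import message_from_string
import xml.etree.ElementTree as ET

RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "See you at noon.\n"
)


def email_to_xml(raw):
    """Map a single-part message to a hypothetical XML layout."""
    msg = message_from_string(raw)
    root = ET.Element("message")
    headers = ET.SubElement(root, "headers")
    for name, value in msg.items():
        field = ET.SubElement(headers, "header", name=name)
        field.text = value
    body = ET.SubElement(root, "body")
    body.text = msg.get_payload()
    return root


message_xml = email_to_xml(RAW)
xml_text = ET.tostring(message_xml, encoding="unicode")
```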

xTract, a memory-bound utility for retrieving fragments of XML files on the Web that match particular search criteria, is coded in Python. There is also a new version called xTract1, which can search very large XML files and is hence scalable.

There is also the C3 XML Editor/Viewer, which uses the wxPython GUI toolkit. The heart of C3 is the relationship between an xTree object used to manipulate XML and the wxTreeCtrl object used to display and edit XML. A Python utility, xFS, converts file system information into XML. The result can be displayed in an XML-aware Web browser, transformed into HTML using XSL and related technologies, or searched for file system information with the xgrep utility. A simple utility, GetURL, can retrieve a URL and print its contents to standard output.
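Converting file system information into XML can be sketched with the standard os and xml.etree.ElementTree modules. This is not xFS and does not follow its format: the element names directory and file, the name and size attributes, and the function name are assumptions made for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def directory_to_xml(path):
    """Return an Element describing the directory at path, recursively."""
    name = os.path.basename(os.path.normpath(path))
    node = ET.Element("directory", name=name)
    for entry in sorted(os.listdir(path)):
        full_path = os.path.join(path, entry)
        if os.path.isdir(full_path):
            node.append(directory_to_xml(full_path))
        else:
            size = str(os.path.getsize(full_path))
            ET.SubElement(node, "file", name=entry, size=size)
    return node


# Build a small throwaway directory tree to convert.
root_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(root_dir, "docs"))
with open(os.path.join(root_dir, "docs", "readme.txt"), "w") as handle:
    handle.write("read me")
with open(os.path.join(root_dir, "notes.txt"), "w") as handle:
    handle.write("hello")

tree = directory_to_xml(root_dir)
xml_text = ET.tostring(tree, encoding="unicode")
```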

Finally, Pyxie, an open source XML-processing library for Python, is being developed by professionals in the spirit of open source software development. This core library can be used to build a powerful suite of XML processing tools in Python. Thus Python is blessed with some exciting special features, tools and libraries for facilitating text processing. Python can play a very critical role in the near future due to the exponential growth predicted for XML's role on the Internet.

These exciting qualities of Python have made it one of the hottest and most usable languages in the software arena these days.
