CMS Pipelines … for NetRexx on the JVM

This year I want to tell you about a new and exciting addition to NetRexx (which, incidentally, turned 19 years old the day before yesterday). NetRexx, as some of you know, is the first alternative language for the JVM; it stems from IBM and has been free and open source since 2011 (http://www.netrexx.org). It is a happy marriage of the Rexx language (Michael Cowlishaw, IBM, 1979) and the JVM. NetRexx programs can be compiled ahead of time to .class files for maximum performance, or interpreted, for a quick development cycle or for very dynamic production of code. After the addition of scripting in version 3.03 last year, the new release (3.04, due around the end of 2014) includes Pipes.

We know what pipes are, I hear you say, but what are Pipes? A Pipeline, also called a Hartmann Pipeline, is a concept that extends and improves pipes as they are known from Unix and other operating systems. The name pipe indicates an inter-process communication mechanism as well as the programming paradigm it introduced. Compared to Unix pipes, Hartmann Pipelines offer multiple input and output streams, more complex pipe topologies, and a lot more: too much for this short article, but worthy of your study.

Pipelines were first implemented on VM/CMS, one of IBM’s mainframe operating systems. This version was later ported to TSO to run under MVS and has been part of several product configurations. Pipelines are widely used by VM users, in a symbiotic relationship with REXX, the interpreted language that also has its origins on this platform. Pipes in the NetRexx version are compiled by a special Pipes Compiler that has been integrated with NetRexx. The resulting code can run on every platform that has a JVM (Java Virtual Machine), including z/VM and z/OS for that matter. This portable version of Pipelines was started by Ed Tomlinson in 1997 under the name of njpipes, when NetRexx was still very new, and was open sourced in 2011, soon after the NetRexx translator itself. It was integrated into the NetRexx translator in 2014 and will ship as part of the NetRexx distribution for the first time with version 3.04. It answers the eternal question posed to the development team by every z/VM programmer we ever met: “But … Does It Have Pipes?” It also marks the first time that a no-charge Pipelines product runs on z/OS. But of course most of you will be running Linux, Windows or OS X, where NetRexx and Pipes also run splendidly.

NetRexx users are very conscious of code size and performance – for example because applications also run on limited JVM specifications such as Java ME, in Lego robots and on Androids and Raspberries – and generally are proud and protective of the NetRexx runtime, which weighs in at 37K (yes, 37 kilobytes; it even shrunk a few bytes over the years). For this reason, the Pipes Compiler and the Stages are packaged in NetRexxF.jar – F is for Full. This jar also includes the Eclipse Java compiler, which makes NetRexx a standalone package that only needs a JRE for development. There is a NetRexxC.jar for those who have a working Java SDK and only want to compile NetRexx. So we have NetRexxR.jar at 37K, NetRexxC.jar at 322K, and the full NetRexx caboodle in 2.8MB – still small compared to some other JVM languages.

The pipeline terminology is a metaphor derived from plumbing. Fitting two or more pipe segments together yields a pipeline. Water flows in one direction through the pipeline. There is a source, which could be a well or a water tower; water is pumped through the pipe into the first segment, then through the other segments until it reaches a tap, and most of it will end up in the sink. A pipeline can be increased in length with more segments of pipe, and this illustrates the modular concept of the pipeline. When we discuss pipelines in relation to computing we have the same basic structure, but instead of water, data is passed through a series of programs (stages) that act as filters. Data must come from some place and go to some place. Analogous to the well or the water tower, there are device drivers that act as a source of the data, while the tap or the sink represents the place the data is going to, for example an output device such as your terminal window, a file on disk, or a network destination. Just like water, data in a pipeline flows in one direction, by convention from left to right.

A program that runs in a pipeline is called a stage. A program can run in more than one place in a pipeline – these occurrences function independently of each other. The pipeline specification is processed by the pipeline compiler, and it must be contained in a character string; on the command line, it needs to be between quotes, while when contained in a file, it needs to be between the delimiters of a NetRexx string. An exclamation mark (!) is used as stage separator, while the solid vertical bar (|) can be used instead when it is specified as a local option for the pipe, after the pipe name. When looking at two adjacent segments in a pipeline, we call the left stage the producer and the stage on the right the consumer, with the stage separator as the connector.

A device driver reads from a device (for instance a file, the command prompt, a machine console or a network connection) or writes to a device; in some cases it can both read and write. Examples of device drivers are diskr for diskread and diskw for diskwrite; these read and write data from and to files. A pipeline can take data from one input device and write it to a different device. Within the pipeline, data can be modified in almost any way imaginable by the programmer. The simplest process for the pipeline is to read data from the input side and copy it unmodified to the output side. The pipeline compiler connects these programs; it uses one program for each device and connects them together. All pipeline segments run on their own thread and are scheduled by the pipeline scheduler. The inherent characteristic of the pipeline is that any program can be connected to any other program because each obtains data and sends data through a device-independent standard interface. The pipeline usually processes one record (or line) at a time. The pipeline reads a record from the input, processes it and sends it to the output. It continues until the input source is drained.
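To make the idea of stages on their own threads, connected through one standard interface, a little more tangible, here is a minimal sketch in plain Java. It is an analogy only and not how the Pipes Compiler generates code: the class name PipelineSketch, the END marker and the use of BlockingQueue as the connector are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration: none of these names come from NetRexx Pipelines.
final class PipelineSketch {
    // Unique marker object that tells the next stage the input is drained.
    private static final String END = new String("<end-of-stream>");

    // Runs source ! reverse ! collector, with one thread per stage.
    static List<String> run(List<String> records) {
        BlockingQueue<String> sourceToReverse = new LinkedBlockingQueue<>();
        BlockingQueue<String> reverseToSink = new LinkedBlockingQueue<>();
        List<String> output = new ArrayList<>();

        // Source stage: puts every record into the pipeline, then signals the end.
        Thread source = new Thread(() -> {
            for (String record : records) sourceToReverse.add(record);
            sourceToReverse.add(END);
        });
        // Filter stage: reverses one record at a time until the input is drained.
        Thread reverse = new Thread(() -> {
            try {
                // Identity comparison, so a record with the same text is not mistaken for END.
                for (String r = sourceToReverse.take(); r != END; r = sourceToReverse.take()) {
                    reverseToSink.add(new StringBuilder(r).reverse().toString());
                }
                reverseToSink.add(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Sink stage: collects the records that reach the end of the pipeline.
        Thread sink = new Thread(() -> {
            try {
                for (String r = reverseToSink.take(); r != END; r = reverseToSink.take()) {
                    output.add(r);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        source.start();
        reverse.start();
        sink.start();
        try {
            source.join();
            reverse.join();
            sink.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return output;
    }

    public static void main(String[] args) {
        for (String line : run(List.of("hello world"))) {
            System.out.println(line); // prints "dlrow olleh"
        }
    }
}
```

Because every stage only sees a queue on its left and a queue on its right, any stage can be connected to any other, which is the property the plumbing metaphor is about.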

Until now everything was just theory, but now we are going to show how to compile and run a pipeline. The executable script pipe is included in the NetRexx distribution to specify a pipeline and to compile NetRexx source that contains pipelines. Pipelines can be specified on the command line or in a file, but will always be compiled to a .class file for execution in the JVM.

pipe "(hello) literal "hello world" ! console"

This specifies a pipeline consisting of a source stage literal that puts a string (“hello world”) into the pipeline, and a console sink that puts the string on the screen. The pipe compiler will echo the source of the pipe to the screen – or issue messages when something was mistyped. The name of the class file is the name of the pipe, here specified between parentheses. Options also go there. We execute the pipe by typing:

java hello

Now that we have shown the obligatory example, we can make it more interesting by adding a reverse stage in between:

pipe "(hello) literal "hello world" ! reverse ! console"

When this is executed, it dutifully types “dlrow olleh”. If we replace the string after literal with arg(), we can start the hello pipeline with an argument to reverse. We run it with:

java hello a man a plan a canal panama

and it will respond:

amanap lanac a nalp a nam a

which goes to show that no palindrome is very convincing unless we ignore the spaces – which we can remedy with a change to the pipeline: use the change stage to take out the spaces:

pipe "(hello) literal arg() ! change /" "// ! console"
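For readers who want to check the expected results without compiling a pipe, the effect of the reverse stage used earlier and of the change stage can be mimicked in a few lines of plain Java. This is only a sketch of what the two stages do to a single record; the class and method names are made up for the illustration and have nothing to do with the Pipes implementation.

```java
// Hypothetical helper that mimics two stages on one record.
final class PalindromeSketch {
    // What the reverse stage does to a record.
    static String reverse(String record) {
        return new StringBuilder(record).reverse().toString();
    }

    // What the change stage in the text does: replace every space with nothing.
    static String dropSpaces(String record) {
        return record.replace(" ", "");
    }

    public static void main(String[] args) {
        // Use the command line words as the record, or fall back to the example from the text.
        String[] words = args.length > 0 ? args : new String[] {"a man a plan a canal panama"};
        String record = String.join(" ", words);
        System.out.println(reverse(record));             // amanap lanac a nalp a nam a
        System.out.println(dropSpaces(reverse(record))); // amanaplanacanalpanama
    }
}
```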

Now for the interesting parts. Whole pipeline topologies can be added, web servers can be built, and relational databases (anything with a JDBC driver) can be queried. For people who are familiar with the z/VM CMS Pipelines product, most of its reference manual is relevant for this implementation. We are working on the new documentation to go with NetRexx 3.04.

Pipes for NetRexx are the work of Ed Tomlinson and Jeff Hennick, with contributions by Chuck Moore, myself, and others. Pipes were the reason I first laid eyes on NetRexx, and I am happy they have now found their place in the NetRexx open source distribution. To have a look at it, download the NetRexx source from the Kenai site (https://kenai.com/projects/netrexx) and build a 3.04 version yourself. Alternatively, wait until the 3.04 package hits http://www.netrexx.org.
This post is part of the Java Advent Calendar and is licensed under the Creative Commons 3.0 Attribution license. If you like it, please spread the word by sharing, tweeting, FB, G+ and so on!

Quick prototype: Using git in NetRexx

NetRexx is a programming language for the JVM. Categorizing it is not as easy as one might think. It can be executed interpreted or compiled ahead of time, and it has a ‘scripting’ mode, in which the translator generates the class and method statements, next to the mode in which all class and method statements are entered into the program source by the user. Its interpreted mode does employ the Java compiler and can optionally leave a class file after interpretation. Consequently, it can be used in at least four ways. Each of these is entirely straightforward, but explaining all the possibilities introduces an air of complexity that is absent in reality. For this reason, I am just showing an example, and leave it up to the reader to check how it compares to other ways to get to the same result.
This year I want to show an example of what is regarded as NetRexx’s biggest advantage: easy integration in the JVM environment, making everything more straightforward, more readable, and more fun than in other languages. When encountering a new environment where work needs to be done, it suffices to look up the javadoc of the library and go for it.
Recently there was an opportunity to do some work with git – actually, there was an application to be made, and I realized its requirements overlapped for a large part with what this version management system can do. I looked around for a library and found one in JGit, from the Eclipse project.
For this small experiment with git I am using NetRexx in scripting mode. This means that I am not declaring classes and just script everything in one go, on sequentially executed lines, except for one method that I pasted from another program.

package com.rvjansen
import org.eclipse.jgit.
import com.eaio.uuid.UUID

-- show the result of every expression while prototyping
trace results
-- locate and open the existing repository
builder = FileRepositoryBuilder()
repository_ = builder.findGitDir(File("/Users/rvjansen/papiamento")).readEnvironment().build()
git_ = Git(repository_)

uuid_ = newUUID()

-- create a new file, named after the UUID, next to the .git directory
file_ = File(repository_.getDirectory().getParent(), uuid_)
file_.createNewFile()

-- write the current date into the file
out = PrintWriter(BufferedWriter(FileWriter(file_)))
out.println(Date())
out.close()
-- stage the new file and commit it
git_.add().addFilepattern(uuid_).call()

git_.commit().setMessage("Added file" uuid_).call()
repository_.close()

-- returns a new UUID as an upper-case string
method newUUID() returns String static
return UUID().toString.toUpperCase

I do declare a package, as this is good practice to avoid name clashes, and I am importing the packages org.eclipse.jgit and my favorite UUID package, the one that can do real time-based UUIDs following the standard – as opposed to the Java UUID class, which does something else. In this small program the test case was to version data in files kept by git, with UUIDs as filenames – so as not to have to use a range in a generated name. Ranges are generally bad in applications, because they come back and bite you with limitations that you did not imagine at design time.
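To see what “something else” means in practice, the following small Java sketch asks java.util.UUID which version its factory methods produce. It only uses the standard library; the class name UuidVersionSketch and the sample name "papiamento" are picked for the illustration. The factory methods shown yield random (version 4) and name-based (version 3) identifiers, not the time-based version 1 kind, and asking such a UUID for its timestamp fails.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical illustration of the UUID versions produced by java.util.UUID.
final class UuidVersionSketch {
    // Version of a UUID created with UUID.randomUUID().
    static int randomVersion() {
        return UUID.randomUUID().version();
    }

    // Version of a UUID created with UUID.nameUUIDFromBytes(...).
    static int nameBasedVersion() {
        byte[] name = "papiamento".getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(name).version();
    }

    // Only time-based (version 1) UUIDs carry a timestamp; others throw on timestamp().
    static boolean hasTimestamp(UUID uuid) {
        try {
            uuid.timestamp();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(randomVersion());                   // 4
        System.out.println(nameBasedVersion());                // 3
        System.out.println(hasTimestamp(UUID.randomUUID()));   // false
    }
}
```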
Next we see the ‘trace results’ statement. When prototyping it is a good thing to be able to see exactly what is going on, and this is the way to do it. I am initializing a git repository, and for clarity I am using an absolute directory path. For production apps I would not do that, but when prototyping, we don’t need any relativity to make mistakes with – with absolute paths we know where everything ends up. The reference to the git repository ends up in the variable git_.
Next we are creating a file, named after a UUID that we conjure up, and write the output of Date() into it. Note that Date() comes from the standard Java library, and yes, most of that class’s methods are deprecated, but it beats all forms of Calendar in shortness. Obviously, these files are going to end up containing other data anyway.
After we have closed the file, we add it (its file pattern) to the git repository and we commit it, with the obligatory message. Note that all the git APIs allow for the chaining of calls. Again, I would not do that in maintainable production code, but it is great to keep examples and prototypes short.
I wrote this in two passes (first the file stuff and then the git stuff), using interpreted execution for quick edit/execute turnaround. I looked at the trace output to see that everything worked, and also to see the type information that I can leave out in NetRexx but do like to know.
I am perfectly happy to be able to do this in a few minutes, and I think you would be too!

NetRexx

Which was the first dynamic language on the JVM? Think you know? Read on!

/* This is a fully documented program */
say 'Hello World!'

In 1995 the NetRexx language was conceived by Mike Cowlishaw of IBM. We have an exact birthday – it was the 10th of December, 1995. Consequently, NetRexx is 17 years old, and this fact makes it the oldest alternative language for the JVM. Mike was involved in porting the Java VM to IBM’s platforms – and the first platform to receive a JVM was OS/2. No blog has enough space to lament OS/2 and the fate it has met through IBM’s shabby treatment of its own intellectual property and Microsoft’s insincere stance – OS/2, the most important program for the future, does anyone remember?

Experience shows that it does not help to cry over spilled milk, but a level-headed analysis shows that NetRexx and its ancestor Rexx were heavily impacted by the demise of OS/2 by IBM’s own hand. In 1995, by all measures, Rexx was at the top of the world. It was a serious competitor to BASIC and, if only for this reason, Microsoft wanted it dead. It is a testament to the intrinsic strength of the Rexx family of languages that it survived the catastrophic series of events formed by the cancellation of OS/2 and the even lesser known Workplace OS. I remember these days vividly: I was scheduled to go to a RedBook writing session in Poughkeepsie when word came that it was all over for Workplace OS, a.k.a. Pink and Blue. Today, we are left with some parts of it, which have been packaged into the International Components for Unicode, and in a bizarre twist of fate, NetRexx has been open sourced using the ICU license.

Because Open Source it is. In the beginning of the nineties, a call for an Object Oriented Rexx became noticeable, and within IBM a project was started to provide one. It was called Oryx (for Object Oriented Rexx) in those days, and Simon Nash was charged with it. It produced a Java avant-la-lettre – a collection class library married to a Rexx with an OO-syntax that somehow managed to be compatible with (what, with retroactive continuity, became) Classic Rexx. The Java reference is because the earliest implementations had a bytecode interpreter that predated Java itself by a number of years.

In 1995, Mike Cowlishaw did some experimentation to see how a Rexx-like language would behave on the JVM. Some compromises were made to adhere to the JVM’s object model. For example, the stem notation with dots, an important part of how Classic Rexx defines multidimensional, content-addressed arrays, was dropped, because the dot notation clashed too much with Java’s method invocation syntax. In its place the indexed string, a notation with square brackets (these were a problem for a long time on EBCDIC-producing keyboards), was introduced. Also, all string comparison in NetRexx was to be case insensitive – a feature that Mike has always regretted omitting from Classic Rexx (on the well-meant advice of others).

For the rest, NetRexx is a better Rexx than Rexx itself. In its current form, it is a translator that can compile to .class files, as well as interpret the program in a single shot. Orthogonal to this, there is full-blown application mode, in which the programmer declares all the classes and methods, or scripting mode, in which there is just a number of commands and method invocations specified, and the translator adds all the syntactic ceremony that the Java language requires before compiling or interpreting.

At the essence of NetRexx is the fact that one can write Java classes without all the syntax that is annoying to the programmer who was reared in the non-C tradition. C style syntax has a very paradoxical property, in which terseness has led to ceremonial syntax elements. These are avoided in NetRexx. It has been established that the same program contains up to 40% fewer lexical elements in NetRexx as compared to Java syntax.

In other respects NetRexx keeps close to Java. As all compiled programs are translated to Java source first, performance has kept up with improvements in the Java HotSpot VM architecture. In contrast to Jython and JRuby, there is no performance penalty for unbounded dynamism in the language: one of the early slogans for NetRexx, “strong typing without more typing”, is still true today.

If you like clean syntax and great performance, you should try NetRexx. Integration with Java class libraries is excellent and transparent. NetRexx is the first alternative language for the JVM, and still the only alternative JVM language that was actually used to implement a part of the JVM runtime: the BigDecimal library was first written in NetRexx. For those who are aware of Mike Cowlishaw’s efforts towards better handling of decimals by computer (hardware|languages) this is no surprise. The IEEE decimal definition is in large parts equal to the Rexx Language ANSI Standard. It does not get more serious when you care for floating point precision.
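Why decimal arithmetic matters is easiest to see with a small sum. The sketch below uses the standard java.math.BigDecimal class next to binary double arithmetic; the class name DecimalSketch is made up for the illustration, and the example says nothing about how the library is implemented.

```java
import java.math.BigDecimal;

// Hypothetical illustration: decimal versus binary floating point on the JVM.
final class DecimalSketch {
    // 0.1 + 0.2 in decimal arithmetic, built from exact decimal strings.
    static BigDecimal decimalSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    // 0.1 + 0.2 in binary double arithmetic.
    static double binarySum() {
        return 0.1 + 0.2;
    }

    public static void main(String[] args) {
        System.out.println(decimalSum()); // 0.3
        System.out.println(binarySum());  // 0.30000000000000004
    }
}
```

The decimal result is exactly what a person doing the sum on paper would write down, which is the behaviour Rexx programmers have always taken for granted.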

With its ancestor Rexx, NetRexx shares the unbounded decimal arithmetic, the TRACE and PARSE statements (study them, and you will be sold), and a set of string functions that was and is still the best on the planet. If this has gotten you interested, know that NetRexx is free and open source, and downloadable from www.netrexx.org. Release 3.02 will be here any day, and it is accompanied by documentation befitting any former IBM product. Currently, there is IDE support for Eclipse, jEdit and Emacs, and since version 3.01 a JDK is no longer required – a JRE will do.
