Comparative Examples in MSH and KSH

 

Most shells (such as Windows CMD.EXE and the UNIX shells SH, KSH, CSH, and BASH) operate by executing a command or utility in a new process and presenting the results (or errors) to the user as text. System interaction with these shells is therefore text-based. Over the years, a large number of text processing utilities, such as sed, awk, and Perl, have evolved to support this interaction. This heritage is very rich.

MSH is very different from these traditional shells. First, this shell does not use text as the basis for interaction with the system, but uses an object model based on the .NET platform. Second, the list of built-in commands is much larger, so that interaction with the object model is carried out with the highest regard for the integrity of the system. Third, the shell provides consistency in how built-in commands are invoked through the use of a single parser, rather than relying on each command to create its own parser for parameters.

This document contrasts the Korn Shell (KSH) and MSH by providing a number of examples of each.

Example 1

To stop all processes that begin with the letter "p" on a UNIX system, an administrator would have to type the following shell command line:

$ ps -e | grep " p" | awk '{ print $1 }' | xargs kill

The ps command retrieves a list of processes and pipes (|) it to the grep command, which determines which processes begin with "p"; grep in turn pipes (|) its output to the awk command, which selects the first column (the process ID) and passes (|) those values to the xargs command, which then executes the kill command for each process.  The fragility of this command line may not be evident, but if the ps command behaves differently on another system, where the "-e" flag may not be present, or if the process ID is not in column 1 of the output, this command line will fail.
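One more source of fragility is that the pattern " p" is matched against the whole line, not just the command name. The following is a minimal sketch using canned text in place of real ps output (the sample rows are invented for illustration, and xargs echo stands in for xargs kill so nothing is stopped). It compares the pattern above with a variant that tests only the last field:

```shell
# Canned text standing in for "ps -e" output (illustrative values only)
sample='  PID TTY          TIME CMD
    1 ?        00:00:01 init
  212 ?        00:00:00 postfix
  340 pts/0    00:00:00 ksh'

# Pattern from the example: " p" may match any field, including the TTY column
loose=$(printf '%s\n' "$sample" | grep " p" | awk '{ print $1 }' | xargs echo)

# Stricter variant: skip the header and test only the last field (the command name)
strict=$(printf '%s\n' "$sample" | awk 'NR > 1 && $NF ~ /^p/ { print $1 }' | xargs echo)

echo "loose:  $loose"
echo "strict: $strict"
```

With this sample, the loose pattern also selects the ksh process because its terminal is "pts/0", while the stricter variant selects only postfix.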

MSH:

MSH> get-process p* | stop-process

Thus, the command line may now be expressed as "get the processes whose name starts with "p" and stop them".  The get-process Cmdlet takes an argument that matches the process name; the objects returned by get-process are passed directly to the stop-process Cmdlet that acts on those objects by stopping them.

A second, more convoluted task, stopping processes that use more than 10 MB of memory, also becomes quite simple in MSH.

Example 2

Another, even more complex task, such as "find the processes that use more than 10 MB of memory and kill them", can lead to an equally fragile command line:

$ ps -el | awk '{ if ( $6 > (1024*10)) { print $3 } }' | grep -v PID | xargs kill

The success of this command line relies on the user knowing that the ps -el command will return the size of the process in kilobytes (KB) in column 6 and that the PID of the process is in column 3.  It also requires that the header row be removed.
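Because column positions (and the units of the size column) in ps -el output differ between systems, one way to reduce the dependence on fixed column numbers is to look the columns up by their header labels. The sketch below is only an illustration: the helper name pids_over_limit is hypothetical, the sample rows are invented, and it assumes the header uses the labels PID and SZ. xargs echo stands in for xargs kill.

```shell
# Hypothetical helper: print the PID of each row whose SZ value exceeds a limit.
# Columns are found by header label, so their positions may vary.
pids_over_limit() {
    awk -v limit="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i   # map header label -> column number
            next
        }
        ("PID" in col) && ("SZ" in col) && ($(col["SZ"]) + 0 > limit + 0) {
            print $(col["PID"])
        }
    '
}

# Canned sample standing in for "ps -el" output (illustrative values only)
sample='F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 1 0 0 80 0 - 2500 - ? 00:00:01 init
0 S 1000 340 1 0 80 0 - 15000 - pts/0 00:00:00 editor
0 S 1000 377 340 0 80 0 - 11000 - pts/0 00:00:02 builder'

big=$(printf '%s\n' "$sample" | pids_over_limit 10240 | xargs echo)
echo "$big"
```

The header row is consumed by the lookup step, so no separate grep -v PID is needed. The text is still being parsed, however, so the approach remains sensitive to changes in the header labels.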

Comparing the standard shell command above with the MSH command below, we can see that the MSH commands act against objects rather than against text.

MSH:

MSH> get-process | where { $_.VS -gt 10M } | stop-process

There is no issue about determining the column that contains the size of the process, or which column contains the ProcessID.  The memory size may be referred to logically, by its name.  The where Cmdlet can inspect the incoming object directly and refer to its properties.  The comparison of the value for that property is direct and understandable.

Example 3

For example, if you wanted to calculate the number of bytes in the files in a directory, you would iterate over the files, getting the length and adding to a variable, and then print the variable:

$ tot=0; for file in $( ls )

> do

>     set -- $( ls -log $file  )

>     echo $3

>     (( tot = $tot + $3 ))

> done; echo $tot

This example uses the set shell command that creates numbered variables for each white space separated element in the line rather than the awk command as in the examples above.  If the awk command were used, it would be possible to reduce the steps to the following:

$ ls -l | awk '{ tot += $5; print tot; }' | tail -1

This reduces the complexity, but requires specific knowledge of a new language, the language that is associated with the awk command.
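For comparison, here is a sketch of a shell loop that avoids parsing ls output altogether by asking wc -c for each file's byte count. It is only an illustration under stated assumptions: it builds its own scratch directory with two small files (the names a.txt and b.txt are invented), and it relies on mktemp -d, which is widely available but not part of POSIX.

```shell
# Build a scratch directory with files of known size (illustrative data)
dir=$(mktemp -d)
printf 'hello' > "$dir/a.txt"      # 5 bytes
printf 'goodbye' > "$dir/b.txt"    # 7 bytes

tot=0
for file in "$dir"/*
do
    [ -f "$file" ] || continue                 # skip anything that is not a regular file
    tot=$(( tot + $(wc -c < "$file") ))        # wc -c reports the byte count
done
echo "$tot"

rm -rf "$dir"
```

No column number appears anywhere in this version, but it starts one wc process per file, which is slower than a single ls and awk pass on large directories.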

The MSH approach is similar in that each file in the directory is needed, but it is far simpler because the information about each file has already been retrieved:

MSH:

MSH> get-childitem | measure-object -Property length

The measure-object Cmdlet interacts with objects and if it is provided with a property from the object, it will sum the values of that property.  Because the property length represents the length of the file, the measure-object Cmdlet is able to act directly on the object by referring to the property name rather than “knowing” that the length of the file is in column 3 or column 5.

Example 4

Many objects provided by the system are not static, but dynamic.  This means that after an object is acquired, it is not necessary to acquire it again later.  The data in the object is updated as the conditions of the system change.  Also, changes to these objects are reflected immediately in the system.

As an example, suppose one wanted to collect the amount of processor time that a process used over time.  In the traditional UNIX model, the ps command would need to be run repeatedly, the appropriate column in the output would need to be found, and then the subtraction would need to be done.  With a shell that is able to access the process object of the system, it is possible to acquire the process object once; because this object is continually updated by the system, all that is necessary is to refer to the property.  The following examples illustrate the differences, where the memory size of an application is checked at ten-second intervals and the differences are output:

$ while [ true ]

do

   msize1=$( ps -el | grep application | grep -v grep | awk '{  print $6 }' )

   sleep 10

   msize2=$( ps -el | grep application | grep -v grep | awk '{  print $6 }' )

   expr $msize2 - $msize1

   msize1=$msize2

done

MSH:

MSH> $app = get-process application

MSH> while ( 1 ) {      

>> $msize1 = $app.VS

>> start-sleep 10

>> $app.VS - $msize1  

>> }
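In the text-based loop, the subtraction step can be separated from the sampling step. The sketch below assumes the samples arrive one per line on standard input; the helper name deltas and the sample values are hypothetical, and in practice the input would come from a ps pipeline like the one above.

```shell
# Hypothetical helper: read one sample per line and print the change from the previous sample
deltas() {
    awk 'NR > 1 { print $1 - prev } { prev = $1 }'
}

# Three canned samples standing in for successive memory sizes
changes=$(printf '%s\n' 1000 1200 1150 | deltas | xargs echo)
echo "$changes"
```

Keeping the previous value inside awk means each interval needs only one new sample, rather than two ps invocations per pass.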

Example 5

It is even more difficult to determine whether a specific process is no longer running.  In this case, the UNIX user must collect the list of processes and compare them to another list. 

$ processToWatch=$( ps -e | grep application | awk '{ print $1 }' )

$ while [ true ]

> do

>     sleep 10

>     processToCheck=$(ps -e | grep application | awk '{ print $1 }' )

>     if [ -z "$processToCheck" -o "$processToWatch" != "$processToCheck" ]

>     then

>        echo "Process application is not running"

>        break

>     fi

> done

MSH:

MSH> $processToWatch = get-process application

MSH> $processToWatch.WaitForExit()

As is seen in this example, the MSH user need only collect the object and then subsequently refer to that object. 
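For the narrower case where the process was started by the same shell, the POSIX wait built-in blocks until that process exits, so no polling is needed. The sketch below uses a short sleep as a stand-in for the application; for a process that the shell did not start, a polling loop like the one above is still required.

```shell
# Start a short-lived background job as a stand-in for the application
sleep 1 &
pid=$!

# wait blocks until the child exits and returns the child's exit status
wait "$pid"
status=$?
echo "process $pid exited with status $status"
```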

Example 6

For example, suppose the user wanted to determine which running processes were built as prerelease code, that is, from applications compiled in a way that marks them as "PreRelease".

$ ???

This information is not kept in the standard UNIX executable.  To determine this information, one would need to have a set of specialized utilities to add this information to the binary and then another set of utilities to collect this information.  These utilities do not exist; it is not possible to accomplish this task.

MSH:

MSH> get-process | where {$_.mainmodule.FileVersioninfo.isPreRelease}

Handles NPM(K)    PM(K)      WS(K) VS(M)   CPU(s)     Id ProcessName

-------  ------    -----      ----- -----   ------     -- -----------

    643      88     1024       1544    15    14.06   1700 AdtAgent

    453      15    25280       7268   199    91.70   3952 devenv

This example follows a chain of properties.  The MainModule property of the process object is inspected, its FileVersionInfo property is referenced, and the value of that object's IsPreRelease property is used to filter the results.  If IsPreRelease is true, the process object produced by the get-process Cmdlet is passed through to the output.

Example 7

Each object may or may not provide methods; MSH provides commands to aid the discovery of methods that are available for a specific object via the get-member Cmdlet.  For example, the string object has a large number of methods:

MSH:

MSH> get-member -input "String" -membertype method | format-table "$_"

"$_"

---------

System.Object Clone()

Int32 Compare(System.String, System.String, Boolean, System.Globalization.Cul..

Int32 CompareOrdinal(System.String, Int32, System.String, Int32, Int32)

Int32 CompareTo(System.String)

System.String Concat(System.String, System.String)

Boolean Contains(System.String)

System.String Copy(System.String)

Void CopyTo(Int32, Char[], Int32, Int32)

Boolean EndsWith(System.String, Boolean, System.Globalization.CultureInfo)

Boolean Equals(System.String, System.StringComparison)

System.String Format(System.String, System.Object, System.Object)

Char get_Chars(Int32)

Int32 get_Length()

System.CharEnumerator GetEnumerator()

Int32 GetHashCode()

System.Type GetType()

System.TypeCode GetTypeCode()

Int32 IndexOf(System.String, Int32)

Int32 IndexOfAny(Char[], Int32, Int32)

System.String Insert(Int32, System.String)

System.String Intern(System.String)

System.String IsInterned(System.String)

Boolean IsNormalized()

Boolean IsNullOrEmpty(System.String)

System.String Join(System.String, System.String[])

Int32 LastIndexOf(Char, Int32)

Int32 LastIndexOfAny(Char[], Int32)

System.String Normalize(System.Text.NormalizationForm)

Boolean op_Equality(System.String, System.String)

Boolean op_Inequality(System.String, System.String)

System.String PadLeft(Int32, Char)

System.String PadRight(Int32, Char)

System.String Remove(Int32)

System.String Replace(System.String, System.String)

System.String[] Split(System.String[], System.StringSplitOptions)

Boolean StartsWith(System.String)

System.String Substring(Int32)

Char[] ToCharArray(Int32, Int32)

System.String ToLower()

System.String ToLowerInvariant()

System.String ToString(System.IFormatProvider)

System.String ToUpper(System.Globalization.CultureInfo)

System.String ToUpperInvariant()

System.String Trim()

System.String TrimEnd(Char[])

System.String TrimStart(Char[])

As can be seen, 46 different methods are available on the string object, all of which are available to the MSH user.  Unfortunately, the semantics of these methods are not visible from the shell, but documentation for .NET objects is available online.

Example 8

The availability of these methods creates an explosion of possibilities.  For example, if I wanted to change the case of a string from lower to upper I would do the following: (first ksh and then MSH).

$ echo "this is a string" | tr '[:lower:]' '[:upper:]'

or

$ echo "this is a string" | tr '[a-z]' '[A-Z]'

MSH:

MSH> "this is a string".ToUpper()

The UNIX example relies on the tr utility.

Example 9

For example, suppose the string "ABC" is to be inserted after the first character in the word "string" to produce the result "sABCtring".  Here are two ways to do it, first with ksh, then with MSH:

$ echo "string" | sed "s|\(.\)\(.*\)|\1ABC\2|"

MSH:

MSH> "string".Insert(1,"ABC")

Both examples require specific knowledge; however, using the "insert" method is more intuitive than using the capture buffers available in sed.  Moreover, the domain specific knowledge of the "sed" language required may be somewhat more advanced than is required to use the Insert method.
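The same insertion can also be done with shell parameter expansion alone, without sed. This is offered as an alternative sketch, not as the approach the example above uses; the variable names are invented for illustration.

```shell
word="string"
rest=${word#?}            # remove the first character: "tring"
first=${word%"$rest"}     # remove that suffix, leaving the first character: "s"
result="${first}ABC${rest}"
echo "$result"
```

This avoids capture buffers, but it still requires knowing the # and % expansion operators, so the point about domain-specific knowledge applies here as well.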


Jim Truher and Jeffrey Snover

Comments

  • Anonymous
    April 21, 2006
    Now only if you could make creating and mounting images of floppies/cd/dvd either locally or securely from network as easy as it is with user mode file systems in certain other OS...

  • Anonymous
    April 21, 2006
    That looks great. But I wonder - it's almost easier to use cut or awk when you can see the output right there and think "oh I want the third column", wheras it takes a lot more poking to figure out all the properties/methods of objects in MSH. Of course, this could be solved by intellisense autocomplete in the shell...
  • Anonymous
    April 22, 2006
    Hey I know, lets make contrived examples even better. Lets make a perl script which kills processes based on their regex on their process names.

    regex_kill 'p*'

    OH WOW That is much simpler than KSH and MSH.

    A doesn't do something that B does because we explicitly added a shortcut to do that in B :O

    You guys should stop using the word monad. You don't deserve it.
  • Anonymous
    April 23, 2006
    Yup.  The correct statement is:
    get-childitem | Measure-Object -Property length

    I've updated the blog with the correct info.
    Thanks!
    Jeffrey Snover
  • Anonymous
    April 26, 2006
    Some of your examples could be improved (at least for debian)
    1) killall -r p* (does the same thing)
    3) du -S
    6) can be done using ps incombination with the file command

    you can find these options by reading the man pages
    Also, most people don't know/don't use awk+sed+bash anymore. it's usually perl|python|ruby nowadays.

    That being said, monad/ps looks very interesting and I look forward to being able to use it one day instead of cmd.exe
  • Anonymous
    April 26, 2006
    The article seems to confuse ease of learning with ease of use. I appreciate ease of leaning as much as the next person, but for day in, day out, ease of use is far more important.

    Dredging up scaring looking unix command lines isn't hard, but my guess is most of the target audience here didn't need translations for what they do. Lots of us can read sed, awk, and Perl (not PERL: perldoc perlfaq1) in our sleep. Syntactic sugar alone doesn't make things easy.

    There really are some neat ideas on display here, but I don't think unix is nearly as scary as you try to make it look. Add to that the fact that I, and many people, work in data centres that have linux, bsd, solaris, hp and even(gasp) windows machines and there will always be a need for a toolchain with tons of tiny, simple tools.
  • Anonymous
    April 26, 2006
    As a unix programmer, I really appreciated this rosetta stone between ksh and Power Shell(tm).

    Unfortunately, it appears that you are speaking in fluent Power Shell(tm), but pidgin unix.

    I know that working at a certain company, which shall not be named, can make your unix skills rusty. So, please consider the following a polite refresher course.

    In example 5, to wait until a process exits a native unix speaker would say:

     $  pid = pidof application
     $  wait $pid

    Really. It's that simple.

    --ben9
  • Anonymous
    May 02, 2006
    Very nice examples.
    It will be nice to see these scripting support improvements in the future versions of windows.
    But even being nice they provide no backward compatibility with posix environment.
    Do you plan anything to provide at least some non-deep support for posix (simple utils like grep, awk, sed, less, etc.)?
    Providing something new without backward compatibility with good shells is something noone likes, imho...