http://qs321.pair.com?node_id=129157

As part of the analysis of a set of experiments that I do, there are several simple mathematical adjustments, DOS-based programs, and other tools that we can use on the raw data in order to develop usable results. In the past, we'd only collect < 20 of these data sets (simple x-y data, nothing fancy) at a time, and so hand-processing the data with these tools was rather simple.

However, we've gained the capability to do much more data processing, possibly looking at as many as 300-400 sets from a run. Processing that many by hand doesn't seem like it's going to work well.

Fortunately, I have access to perl, so I've developed several small scripts that help me along. However, thinking about the big picture, I've been starting to 'wrap' the usage of the DOS programs in perl code, and also breaking the other parts of the perl code into smaller functions, so that I can write very simple code to process my data. For example, my end goal would be to write something like this (in perl):

foreach my $dataset (1..50) {
    my $data = myread( "$filename.$dataset" );
    $data = shiftfunction( $data, $shiftvalue );
    $data = normalize( $data, { normconstant => 1 } );
    mywrite( $data, "$newfilename.$dataset" );
}
(Not actual function names, but the idea is there.)

This will be great for me, since perl's quite easy to use, but I'm thinking down the road that others will possibly want access to this data processing. And while I'm sure that others could learn and use perl (Remember, I'm in a chemical environment, not comp.sci.), I'm thinking that using a lightweight language would be easier for them to learn and pick up. For example, I'm considering the above script to be written in this microlanguage as:

for $i in 1-50 {
    $data = read "datafile.$i";
    shift $data 0.05;
    normalize $data constant=1.0;
    write $data "outfile.$i";
}
Note that every 'function' would be of the form <function_name> <required args> [<optional args in name=value pairs>]. There would be some basic variable knowledge, and there would be cases where one could access internals of variables using dot notation, instead of ->{} as with perl. The semi-colons would be used only because the name=value pair sets could be numerous for a given command. The only programming syntax that would be used would be the 'for' construct, at least at this stage.
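The statement form described above can be split into its parts with a few lines of perl. This is only a minimal sketch under stated assumptions: parse_statement is a hypothetical helper name, tokens are assumed to be either double-quoted strings or runs of non-space characters, and values containing spaces inside a name=value pair are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: split one microlanguage statement into its parts.
# Returns a hash ref holding the command name, the positional (required)
# arguments, and the optional name=value pairs.
sub parse_statement {
    my ($stmt) = @_;
    $stmt =~ s/^\s+|\s*;?\s*$//g;    # trim whitespace and the trailing semicolon

    # Tokens are either double-quoted strings or runs of non-space characters.
    my @tokens = $stmt =~ /("[^"]*"|\S+)/g;
    return unless @tokens;

    my %parsed = ( command => shift(@tokens), args => [], options => {} );
    for my $token (@tokens) {
        if ( $token =~ /^(\w+)=(.*)$/ ) {    # optional argument: name=value
            $parsed{options}{$1} = $2;
        }
        else {                               # required (positional) argument
            $token =~ s/^"(.*)"$/$1/;        # strip surrounding quotes
            push @{ $parsed{args} }, $token;
        }
    }
    return \%parsed;
}

my $parsed = parse_statement('normalize $data constant=1.0;');
print "$parsed->{command}: args=@{ $parsed->{args} } ",
    "constant=$parsed->{options}{constant}\n";
```

A dispatcher could then look up $parsed->{command} in a table of perl subroutines. Variable substitution and the 'for' construct would need handling on top of this.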

Conceptually, this doesn't seem hard, but there are obviously several gotchas that I haven't thought through completely. I'm wondering if anyone has done a similar task in developing a microlanguage, and could provide any pointers for writing one?

-----------------------------------------------------
Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
"I can see my house from here!"
It's not what you know, but knowing how to find it if you don't know that's important

Replies are listed 'Best First'.
Re: Developing a microlanguage for non-perl programmers
by suaveant (Parson) on Dec 04, 2001 at 00:02 UTC
    My guess would be you would want to play with filtering perl like theDamian does to make things like Acme::Bleach and the Klingon and Latin perl translations... that combined with specialty functions... but that is by no means an authoritative answer :) I've never played with stuff like that... otherwise you'll have to get into doing your own parsing... which can be a pain in the donkey.

                    - Ant
                    - Some of my best work - (1 2 3)

Re: Developing a microlanguage for non-perl programmers
by toma (Vicar) on Dec 04, 2001 at 11:50 UTC
    The creation of a little language might seed a minefield in your organization. If it is successful, it will grow into an unmanageable monster. If it fails, it is a waste of time. It is unlikely that you will create a little language that will be 'just right' for your users.

    I have dealt with many proprietary languages, big and small. They are usually painful. Common problems include poor development environments, scant documentation, lack of support, and confusing error messages.

    Little languages are fine for throw-away code, but they don't fare well when you try to use them in a workgroup. Worst case, they are successful and grow. The organization finds that it is difficult to hire and retain employees to write this type of code. The author of the system has a continuous four-deep line of people at his desk waiting for debugging help! Hundreds of commands get added to the system for each new requirement. None of them are documented, but if you change one of them, someone's critical code will break. I could go on and on with stories of pain and suffering. Don't go there!

    When you are creating a system for novice programmers, ask yourself, "How am I going to get out of this?" In computer science terms the question is "What is the life cycle for the users' programs?"

    You need an exit strategy!

    I recently had the same problem of needing a little language. My tactic is to create an XML representation of the code. So a program in the 'little language' is actually an XML document. One advantage is that I don't have to write a parser. Also, the XML parser usually deals well with syntax errors. Typically it provides both the line number and the character position of an error.

    The computer science guys are happy with my system because it will be reasonable for them to port the code in the future. Working with XML is acceptable on the resume. Maybe it will provide them with an excuse to learn .net!

    The non-programmers that are using the system like it because it looks something like HTML, which they consider doable. Mostly they just imitate my examples.

    The managers are happy with the program because it solves the immediate problem without creating obvious new problems. They can put "learn XML and .net" in some hotshot software developer's development plan and everyone will be happy. XML is still a management-compliant buzzword this year. In my case, perl is just an implementation detail of a prototype.

    If my little language starts to become successful, I can make a one-way translator to perl, port the code, and then just support that. Maybe someone else will port the XML to some other language.

    It should work perfectly the first time! - toma
    msg me with suggestions for this node. I am interested in improving readability and adding details.

      Good point. I might temper that slightly: a little
      language would be fine if its scope and purpose are
      tightly contained (but I'm not sure if yours fits the
      bill). Here's my example.

      We were doing some heavy-duty configuration management,
      and had a custom tool to upgrade our data. We found
      that we could not simply migrate our database into the
      new database system because the new version also had
      seemingly arbitrary data schema changes. The changes
      were vast enough that schema updaters had to be written
      per table, and some were complex. It was time for an
      interpreted language that also severely restricted the
      writers to only manipulating the data at hand. So we
      wrote a LISP interpreter with functions specific to
      the data types we were manipulating.

      I can guarantee that this little language will never
      go beyond its parent application. It's not generally
      useful for anything else. Besides, few if any people
      here know or use LISP.

      On the other hand, it's a fun little language which
      performs rather complex data munging with no complaints.
      It's highly attuned to its tiny little ecological
      niche.

      Rob

Re: Developing a microlanguage for non-perl programmers
by perrin (Chancellor) on Dec 04, 2001 at 01:03 UTC
      Yes, actually you do. I just read through the 74-odd slides and agree. The talk would have been even better, but the slides will set you on the right course; imagine a GCC that emits Perl! More to the point, the link will show you just how quickly and easily you can develop a small language in Perl for your target audience.

      On the other hand, if you feel you really must develop your own language, I'd recommend starting with Constructing Language Processors for Little Languages, by Randy M. Kaplan. New York, NY: John Wiley & Sons, 1994.

      –hsm
Re: Developing a microlanguage for non-perl programmers
by Fletch (Bishop) on Dec 04, 2001 at 00:40 UTC
Re: Developing a microlanguage for non-perl programmers
by rje (Deacon) on Dec 04, 2001 at 02:16 UTC
    That sounds like fun!

    Suaveant's suggestion of writing a filter (in perl, of
    course) sounds like a good one.

    I'd suggest you stay as close to perl as you can,
    unless the syntax is just too weird. Non-programmers
    can handle some level of syntax... but then I'm assuming...

    On the other hand, your proposed syntax looks fine.

    Hey, have you thought of using an OO approach? Could
    you define a perl object to contain your data + the
    methods used to manipulate that data? For example,
    would the chemists understand this kind of syntax?

    loop $i (1 to 50)
        new data;
        data.read( datafile.$i );
        data.shift( 0.05 );
        data.normalize( constant=1.0 );
        data.write( outfile.$i );
        # note: could perl eval() the stuff in parentheses?
    endloop

    Anyhow, don't forget to post a reply to let us know
    what you're deciding and how it's going.

    -Rob

Re: Developing a microlanguage for non-perl programmers
by ralphie (Friar) on Dec 04, 2001 at 01:24 UTC
    this has been a recurrent theme in dr. dobbs for a long time ... probably far longer than i've been familiar with the publication. i just did a search there with the notion of offering suggestions, but the vein was large enough that you'll have better luck starting on your own. only some of their content is on-line, so if you find something of interest you may have to track it down.

    the website is, of course, http://www.ddj.com

      Yep, I've still got the September 1991 issue of DDJ, with articles of "Little Languages, Big Questions", "Your Own Tiny Object-Oriented Language", and "Adding An Extension Language To Your Software". Interesting.

      -rr

Re: Developing a microlanguage for non-perl programmers
by cfedde (Novice) on Dec 04, 2001 at 08:25 UTC
    I'm of the opinion that creating little languages is overrated as a solution to this class of problems. Better would be to create a class on the experiment data with the proper methods for each transformation. Then teach the users a suitable subset of perl to enable them to do the work they need to get done. That way you focus on the problem and empower the users without having to spend too much time working on the oddities of language design.
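
    A class along those lines might look like the sketch below. Everything in it is an assumption made for illustration: the ExperimentData package name, the method names, the "x y" per line file layout, and the meaning of normalize (here: scale y so that its largest absolute value equals the given constant), none of which are specified in the original post.

```perl
use strict;
use warnings;

# Hypothetical class wrapping one x-y data set; all names are illustrative.
package ExperimentData;

sub new {
    my ( $class, %args ) = @_;
    # points: array ref of [x, y] pairs
    return bless { points => $args{points} || [] }, $class;
}

# Read whitespace-separated "x y" lines from a file (assumed layout).
sub read_file {
    my ( $class, $path ) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my @points = map { [ split ' ' ] } grep { /\S/ } <$in>;
    close $in;
    return $class->new( points => \@points );
}

# Shift every x value by a constant offset.
sub shift_x {
    my ( $self, $offset ) = @_;
    $_->[0] += $offset for @{ $self->{points} };
    return $self;    # returning $self allows method chaining
}

# Scale y so that the largest |y| equals $args{constant} (default 1).
sub normalize {
    my ( $self, %args ) = @_;
    my $target = defined $args{constant} ? $args{constant} : 1;
    my ($max) = sort { $b <=> $a } map { abs $_->[1] } @{ $self->{points} };
    return $self unless $max;    # empty or all-zero data: nothing to scale
    $_->[1] *= $target / $max for @{ $self->{points} };
    return $self;
}

# Write the points back out, one "x y" pair per line.
sub write_file {
    my ( $self, $path ) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$_->[0] $_->[1]\n" for @{ $self->{points} };
    close $out or die "Cannot close $path: $!";
    return $self;
}

package main;

my $data = ExperimentData->new( points => [ [ 0, 2 ], [ 1, 4 ] ] );
$data->shift_x(0.05)->normalize( constant => 1.0 );
printf "%.2f %.2f\n", @$_ for @{ $data->{points} };
```

    With such a class, each pass of the loop in the original post reduces to one chain per file, roughly ExperimentData->read_file("datafile.$i")->shift_x(0.05)->normalize( constant => 1.0 )->write_file("outfile.$i"), which is the subset of perl the users would need to learn.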
Re: Developing a microlanguage for non-perl programmers
by stefan k (Curate) on Dec 04, 2001 at 15:53 UTC
    As has been said above, creating your own little language is probably not a very good idea. But isn't Matlab(tm) or Octave exactly what you need? It gives you loads of numerical functionality, has all the exits and error checking, provides easy data loading, and has a GUI (at least recent matlab does; all chemists I know prefer GUI over code, don't ask me why ;-). You could try GNU octave first for free to check whether it fits your needs and then get some (expensive!) matlab licenses.

    I hope I really got the point in your post...

    Regards... Stefan
    you begin bashing the string with a +42 regexp of confusion

Re: Developing a microlanguage for non-perl programmers
by clemburg (Curate) on Dec 04, 2001 at 18:14 UTC

    My recommendation would be to use a syntax similar to Excel formula syntax, since a lot of users already know this, and it is a very basic prefix operator notation, which makes it easy to convert this syntax to LISP syntax, for which many parsers (and interpreters) are already available, including examples in the Parse::RecDescent distribution.
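
    To illustrate how direct that conversion is, here is a minimal hand-written sketch (it does not use Parse::RecDescent). The names formula_to_sexp and _parse_expr are hypothetical, as are the NORMALIZE, SHIFT and READ functions in the example formula. It handles only function calls, numbers, names and double-quoted strings; infix operators are ignored, and characters the tokenizer does not recognize are skipped silently.

```perl
use strict;
use warnings;

# Convert an Excel-style formula such as  NORMALIZE(SHIFT(A1, 0.05), 1.0)
# into a LISP-style s-expression:         (NORMALIZE (SHIFT A1 0.05) 1.0)
sub formula_to_sexp {
    my ($formula) = @_;
    $formula =~ s/^\s*=//;    # drop Excel's leading "=" if present
    my @tokens = $formula
        =~ /("[^"]*"|[A-Za-z_][\w.]*|-?\d+(?:\.\d+)?|[(),])/g;
    my $sexp = _parse_expr( \@tokens );
    die "Unexpected trailing input: @tokens\n" if @tokens;
    return $sexp;
}

# Recursive descent: an expression is an atom or NAME "(" args ")".
sub _parse_expr {
    my ($tokens) = @_;
    my $token = shift @$tokens;
    die "Unexpected end of formula\n" unless defined $token;
    die "Unexpected '$token'\n" if $token =~ /^[(),]$/;

    # A token followed by "(" is a function call; anything else is an atom.
    return $token unless @$tokens && $tokens->[0] eq '(';
    shift @$tokens;    # consume "("

    my @args;
    until ( @$tokens && $tokens->[0] eq ')' ) {
        push @args, _parse_expr($tokens);
        last if @$tokens && $tokens->[0] eq ')';
        my $sep = shift @$tokens;
        die "Expected ',' or ')'\n" unless defined $sep && $sep eq ',';
    }
    shift @$tokens;    # consume ")"
    return '(' . join( ' ', $token, @args ) . ')';
}

print formula_to_sexp('=NORMALIZE(SHIFT(READ("datafile.1"), 0.05), 1.0)'), "\n";
# prints: (NORMALIZE (SHIFT (READ "datafile.1") 0.05) 1.0)
```

    The resulting s-expression could then be handed to one of the existing LISP-style evaluators mentioned above.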

    Christian Lemburg
    Brainbench MVP for Perl
    http://www.brainbench.com

Re: Developing a microlanguage for non-perl programmers
by petral (Curate) on Dec 04, 2001 at 02:01 UTC
    Actually, wouldn't what you've shown here work fine if read returns an object ($data) and that object includes the other 3 functions? (Not that you can't do infinitely more with Filter::Simple or Parse::RecDescent, but you may not need infinitely more.)

      p
Re: Developing a microlanguage for non-perl programmers
by ralphie (Friar) on Dec 04, 2001 at 17:38 UTC
    the later posts here brought to mind the possibility that you might find that r, the gnu version of s, will satisfy your needs. r is, of course, primarily a statistical language, but the incorporated primitives would probably satisfy most, if not all, of your needs. r also has the advantage of readily writing graphic output, as you can see at http://www.r-project.org

    the denizens of this particular hangout will also appreciate that the r repository is named cran, and that there is a perl interface to r. i'm not an expert, it's something in which i've been intending to get literate.