Haskell Hacking: a journal of Haskell programming

Haskell Hacking has moved => Go here.
2006-12-20

Programming Haskell: argument handling and a complete cat annotations

Just some tidy ups of the previous article, on implementing unix 'cat' in Haskell.

Firstly, several people wrote in to point out the bug in the -b option. Rather than squeezing only blank lines, it would do more of a 'uniq' kind of thing, squeezing all duplicates:

    render Squeeze   = map head. group

Rather than:

    render Squeeze   = map head. groupBy (\x y -> all (all isSpace) [x,y])

Secondly, the complete source is now available via darcs, with a cabal build system.

    $ darcs get http://www.cse.unsw.edu.au/~dons/code/caths
    Copying patch 2 of 2... done!
    Applying patch 2 of 2... done.
    Finished getting.

    $ cd caths
    $ runhaskell Setup.lhs configure --prefix=/home/dons
    $ runhaskell Setup.lhs build
    $ runhaskell Setup.lhs install

And you're done.

Next week

We'll look at writing an SMP-parallel, bytestring-based url checker in Haskell:

    $ urlcheck ph-3-redux.txt                                 
    Found 0 broken links. Checked 4 links (4 unique) in 1 file.
    Search time 0 secs

/home :: /haskell :: permalink :: rss

2006-12-18

Programming Haskell: argument handling and a complete cat

This is part three in a series of tutorials on programming Haskell. You can get up to speed by reading:

Today we'll look more into how Haskell interacts with its environment, robust command line argument parsing, and writing a complete program.

Administrivia

Some clarifications on issues raised by yesterday's article.

Chunks

One issue pointed out was that the 'chunk' function was missing. This was semi-intentional. Anyway, this function just splits a list into 'n' chunks:

    chunk n xs = chunk' i xs
      where
        chunk' _ [] = []
        chunk' n xs = a : chunk' n b where (a,b) = splitAt n xs

        i = ceiling (fromIntegral (length xs) / fromIntegral n)

You may be able to write a neater one..

Solipsistic philosophers

Also, an amusing thread appeared on reddit, regarding solipsistic philosopher programs:

Of course, a good optimizing compiler will replace your solipsistic philosopher with a no-op.

A good optimising compiler, or any Haskell compiler :-) Since results that are never required are not computed in Haskell, due to laziness, we can write high performance solipsism simulators all day long:

    main = do
        let largestNumber = last [1..]
        return ()

Running our simulation of the philosopher-mathematician pondering some thoughts on large numbers:

    $ ghc A.hs
    $ ./a.out
    ./a.out  0.00s user 0.01s system 100% cpu 0.003 total

Ok. Enough jokes. Show me the code!

Getting to work

Yesterday we implemented a few toy unix programs, including 'cat'. Today we'll look at writing a complete cat program, but with a focus on interacting properly with the environment and being careful about command line handling. For our running examples, we'll consider the 'cat' and 'tac' programs. The basic spec for 'cat' is:

The cat utility reads files sequentially, writing them to the standard output. The file operands are processed in command-line order. If file is a single dash (`-') or absent, cat reads from the standard input.

It's the 'id' function of the unix shell. BSD 'cat.c' is a 255 line C program. From the man page we can see it does more than just concatenate files. It can also:

  • Numbers the output lines, starting at 1.
  • Squeezes multiple adjacent empty lines
  • Displays non-printing characters so they are visible.

Let's start by looking at the command line argument processing code.

Getting in arguments

The basic way to get arguments in a Haskell program is provided by the System.Environment library. We can use the getArgs function:

    Prelude> :m + System.Environment

    Prelude System.Environment> :t getArgs
    getArgs :: IO [String]

    Prelude System.Environment> do a <- getArgs; print a
    []

Which is empty, since we didn't provide any arguments!

In a small program we can implement all argument handling using just 'getArgs', and some simple list functions. For example, here's a basic 'tac' program, to reverse its input:

    import System.Environment
    import System.Exit

    main = getArgs >>= parse >>= putStr . tac

    tac  = unlines . reverse . lines

    parse ["-h"] = usage   >> exit
    parse ["-v"] = version >> exit
    parse []     = getContents
    parse fs     = concat `fmap` mapM readFile fs

    usage   = putStrLn "Usage: tac [-vh] [file ..]"
    version = putStrLn "Haskell tac 0.1"
    exit    = exitWith ExitSuccess
    die     = exitWith (ExitFailure 1)

This program concatenates and prints the contents of files in reverse (or reads from stdin with no arguments), along with a couple of basic command line flags for version and help strings. It's also reasonably careful about setting exit status on finishing, using the functions from System.Exit. The actual core algorithm for 'tac' is a nice pure Haskell function, and really all the hard work is done processing the command line args.

Some example use:

    $ ./tac -h
    Usage: tac [-vh] [file ..]

    $ ./tac -v
    Haskell tac 0.1

    $ ./tac A.hs B.hs
    return n
    print 1
    n <- getLine
    g = do
    ...
    import System.Exit
    import System.Environment

    $ ./tac  < A.hs 
    die     = exitWith (ExitFailure 1)
    exit    = exitWith ExitSuccess
    version = putStrLn "Haskell tac 0.1"
    ...
    import System.Exit
    import System.Environment

As you can see, once compiled it behaves like a normal unix utility, properly dealing with stdin, with file arguments and the shell.

Note that getArgs doesn't return the program name. To get that we use:

    Prelude System.Environment> :t getProgName
    getProgName :: IO String

The environment

Many programs also make use of environment variables. We can get access to the full shell 'env' using

    Prelude System.Environment> :t getEnvironment
    getEnvironment :: IO [(String, String)]

Which returns an association list, mapping environment variables to their values. We can stick this list straight into an efficient Map structure, for later use. Here's an interactive example:

    Prelude System.Environment> env <- do e <- getEnvironment; return (Data.Map.fromList e)

which we could also write as:

    env <- Data.Map.fromList `fmap` getEnvironment

Once we've got the environment in a useful Map, we can inspect it using Map lookups:

    Prelude System.Environment> :t env
    env :: Data.Map.Map String String

    Prelude System.Environment> :t Data.Map.lookup
    Data.Map.lookup :: (Ord k, Monad m) => k -> Data.Map.Map k a -> m a

That is, the lookup function takes some key, 'k', and a Map from keys to elements of type 'a', and returns an element, if found, in some monad.

More on failure

You may recall from the first tutorial that the Map 'lookup' function will fail if the key is not found. The particular way you wish it to fail depends on which monad you use. You can tell this from the type of lookup. The

    lookup :: (Monad m) => ... m a

syntax indicates that lookup is polymorphic in its monad: it will work for any monad type, and its behaviour is determined by the particular instance of the monad interface you ask for. When a lookup fails, it calls the 'fail' function for the monad you're using. When a lookup is successful, it calls the 'return' function of the same monad. Being 'polymorphic in a monad' really just means that it will call which particular concrete monad 'subclass' you happen to be using.

Looking at the various useful monads for this, we can choose which failure behaviour we would prefer. Here's the implementation of the 'fail' interface for a variety of monads. It's up to you to pick which behaviour you'd like.

For Maybes, we get the null value, Nothing, on failure:

    instance  Monad Maybe  where
        return   = Just
        fail _   = Nothing

For Eithers, we get an error string:

    instance (Error e) => Monad (Either e) where
        return   = Right
        fail s   = Left (strMsg s)

For lists, we get the empty list on failure:

    instance  Monad []  where
        return x = [x]
        fail _   = []

And for IO we get an exception thrown:

    instance  Monad IO  where
        fail    = ioError . userError
        return  = returnIO 

So, depending on the type signature, the compiler will statically pick one of these 'fail's to use on a lookup failing at runtime. For example, to fail with a null value, we'd use the Maybe monad:

    Prelude System.Environment> Data.Map.lookup "USER" env :: Maybe String
    Just "dons"

    Prelude System.Environment> Data.Map.lookup "LUSER" env :: Maybe String
    Nothing

Using Nothing for fatal errors isn't the best practice for large programs, since you usually need to know what failed. For a string-annotated Nothing, we can use the Either monad:

    Prelude System.Environment> :m + Control.Monad.Error

    Prelude System.Environment Control.Monad.Error> Data.Map.lookup "LUSER" env :: Either String String
    Left "Data.Map.lookup: Key not found"

Which is more useful. To fail with a proper exception we'd use the IO monad:

    Prelude System.Environment> Data.Map.lookup "LUSER" env :: IO String
    *** Exception: user error (Data.Map.lookup: Key not found)

We'll now turn to a more flexible approach to argument parsing.

GetOpt

The base Haskell library comes with an implementation of getopt, a useful library for standardised argument handling. Let's implement the argument handling of the unix 'cat' program using this lib.

A type for flags

The first thing to do is define a data type representing the valid flags. First, let's import all the libraries I'll use:

    import Control.Monad
    import Data.Char
    import Data.List
    import System.Console.GetOpt
    import System.Environment
    import System.Exit
    import System.IO
    import Text.Printf

Now in a new file, Cat.hs, we'll write:

    data Flag
        = Blanks                -- -b
        | Dollar                -- -e 
        | Squeeze               -- -s
        | Tabs                  -- -t
        | Unbuffered            -- -u
        | Invisible             -- -v
        | Number                -- -n
        | Help                  -- --help
        deriving (Eq,Ord,Enum,Show,Bounded)

The 'data' keyword defines a new data type, 'Flag', which can have one of several values. Such a type is often called a sum (or union) type. So 'Flag' is a new user-defined type, just like other types, such as Bool or Int. The identifiers on the right hand side of the | are the types constructors. That is, values which have type 'Flag'. We ask the compiler to also derive some instances of various common classes for us (so we don't have to write the code ourselves).

With just this we can already start playing around with the flag data type in GHCi:

    > :reload
    > :m + Data.List
    > let s = [Number, Squeeze, Unbuffered, Squeeze]

    *Main Data.List> let s = [Number, Squeeze, Unbuffered, Squeeze]

    *Main Data.List> sort s
    [Squeeze,Squeeze,Unbuffered,Number]

    *Main Data.List> nub s
    [Number,Squeeze,Unbuffered]

    *Main Data.List> map fromEnum s
    [6,2,4,2]

    *Main Data.List> [Blanks .. ]
    [Blanks,Dollar,Squeeze,Tabs,Unbuffered,Invisible,Number,Help]

User defined data types are really first class citizens in Haskell, and behave just like the 'inbuilt' types.

Binding to command line flags

The next step is to associate some particular command line strings with each abstract flag. We do this by writing a list of 'Option's, which tie long and short argument flags to the particular abstract Flag value we need, and also associated a help string with each flag:

    flags =
       [Option ['b'] []       (NoArg Blanks)
            "Implies the -n option but doesn't count blank lines."
       ,Option ['e'] []       (NoArg Dollar)
            "Implies the -v option and also prints a dollar sign (`$') at the end of each line."
       ,Option ['n'] []       (NoArg Number)
            "Number the output lines, starting at 1."
       ,Option ['s'] []       (NoArg Squeeze)
            "Squeeze multiple adjacent empty lines, causing the output to be single spaced."
       ,Option ['t'] []       (NoArg Tabs)
            "Implies the -v option and also prints tab characters as `^I'."
       ,Option ['u'] []       (NoArg Unbuffered)
            "The output is guaranteed to be unbuffered (see setbuf(3))."
       ,Option ['v'] []       (NoArg Invisible)
            "Displays non-printing characters so they are visible."
       ,Option []    ["help"] (NoArg Help)
            "Print this help message"
       ]

Parsing the flags

To actually turn the list of command line flags getArgs gives us, into a useful list of abstract Flag values, we use the 'getOpt' function, which returns a triple consisting of flags that were set, a list of any non-flag arguments, and a list of error messages. First we need a couple of libraries:

And now to parse the 'cat' argument grammar, we would use:

    parse argv = case getOpt Permute flags argv of

        (args,fs,[]) -> do
            let files = if null fs then ["-"] else fs
            if Help `elem` args
                then do hPutStrLn stderr (usageInfo header flags)
                        exitWith ExitSuccess
                else return (nub (concatMap set args), files)

        (_,_,errs)      -> do
            hPutStrLn stderr (concat errs ++ usageInfo header flags)
            exitWith (ExitFailure 1)

        where header = "Usage: cat [-benstuv] [file ...]"

If the arguments don't make sense, we fail with a usage message, and set the exit status to 1. The final list of flags to use, and any files to open, can be returned to main for processing now:

    main = do
        (as, fs) <- getArgs >>= parse
        putStrLn $ "Flags: " ++ show as
        putStrLn $ "Files: " ++ show fs

We can now test out the argument parsing code:

    $ ghc Cat.hs

    $ ./a.out          
    Flags: []
    Files: []

    $ ./a.out A.hs Z.hs
    Flags: []
    Files: ["A.hs","Z.hs"]

Ok, files are good. How about the flags that imply other flags?

    $ ./a.out -b A.hs Z.hs
    Flags: [Number,Blanks]
    Files: ["A.hs","Z.hs"]

    $ ./a.out -btvu A.hs Z.hs
    Flags: [Number,Blanks,Tabs,Invisible,Unbuffered]
    Files: ["A.hs","Z.hs"]

Good. And invalid flags:

    $ ./a.out -i A.hs Z.hs
    unrecognized option `-i'
    Usage: cat [-benstuv] [file ...]
      -b          Implies the -n option but doesn't count blank lines.
      -e          Implies the -v option and also prints a dollar sign (`$') at the end of each line.
      -n          Number the output lines, starting at 1.
      -s          Squeeze multiple adjacent empty lines, causing the output to be single spaced.
      -t          Implies the -v option and also prints tab characters as `^I'.
      -u          The output is guaranteed to be unbuffered (see setbuf(3)).
      -v          Displays non-printing characters so they are visible.
          --help  Print this help message

Ok, that was pretty easy. Now let's try to implement these functions!

Implementing cat

So now we have to map those abstract flag values to real behaviour. I'll start with the easy ones first.

If -u is set, we turn off all buffering. After that, we map a 'cat' function over each file. So the program's 'main' is just:

    main = do
        (args, files) <- getArgs >>= parse
        when (Unbuffered `elem` args) $ hSetBuffering stdout NoBuffering
        mapM_ (cat args) files

Where 'cat' will process the files one at a time. 'cat' is where all the hard work is done.

Most of the operations the cat program does requires access to each line of the file. We also need to be able to handle the special file name, "-". What we'd like to do is separate out any IO, from operations on each file's content. To do this we'll write a higher order function, 'withFile', which takes a filename, opens it, splits it into lines and applies a function to the contents of the file, before writing the result to stdout:

    withFile s f = putStr . unlines . f . lines =<< open s
      where
        open f = if f == "-" then getContents else readFile f

Now we can implement the pure 'cat' function, implementing the cat program's functionality. Firstly, if there are no command line flags, the 'cat' function does nothing to the input:

    cat [] f = withFile f id

That is, it applies the 'id' function to the stream generated by withFile. That was easy.

Now, if there are some arguments, we'll need to process them. This can be a little tricky, since the effect of the command line flags are cumulative, and we better process them in the right order. What is that order? Well, from experimentation :-) it seems that (if all flags are enabled) 'cat' proceed to:

  • first squeeze any blank lines;
  • then any visibility flags are processed;
  • then line numbering occurs;
  • then, finally, any visible newlines are printed as '$'.
The visibility flags transform non-printing characters into a visible representation. The key to coding this up is recognising that its just a functional pipeline. So we can write it as:
    cat as f = withFile f (newline . number . visible as)

Where 'visible' renders any non-printing chars. Then we number the resulting lines (if the arguments are set), and then finally make any remaining newlines visible. Note that the core of the algorithm does no IO. It's a pure function from [String] -> [String]. Now the implementation of 'number':

      where
        number  s    = if Blanks `elem` as then numberSome s else ifset Number numberAll s
        newline s    = ifset Dollar (map (++"$")) s
        visible as s = foldl' (flip render) s as
        ifset a f    = if a `elem` as then f else id

Here we actually handle all the data traversal. And use a little helper function, 'ifset', to conditionally execute a function if the corresponding command line is set. Note that slight trickiness involving numbering: either we number all lines, or number the non blank lines, but not both.

'render', the function to print invisible characters, is just:

    render Squeeze   = map head . group
    render Tabs      = map $ concatMap (\c -> if c == '\t' then "^I" else [c])
    render Invisible = map $ concatMap visible
      where
        visible c | c == '\t' || isPrint c = [c]
                  | otherwise              = init . tail . show $ c
    render _ = id

And the numbering function:

    numberLine      = printf "%6d  %s"
    numberAll s     = zipWith numberLine [(1 :: Integer)..] s
    numberSome s    = reverse . snd $ foldl' draw (1,[]) s
      where
        draw (n,acc) s
                | all isSpace s = (n,   s : acc)
                | otherwise     = (n+1, numberLine n s : acc)

And we're done! In the end, our entire implementation is some 89 lines of code, of which 60 are to do with importing modules, or command line argument parsing. The actual heart of the program is fairly tiny in the end.

Let's run the code.

    $ ghc -O Cat.hs -o cat

Check it actually prints its arguments:

    $ ./cat Cat.hs | head
    import System.Console.GetOpt
    import System.IO
    import System.Exit
    import System.Environment
    import Data.List
    import Data.Char
    import Control.Monad
    import Text.Printf

    main = do

Or multiple arguments:

    $ ./cat Cat.hs /usr/share/dict/words | tail
    zymotoxic
    zymurgy
    Zyrenian
    Zyrian
    Zyryan
    zythem
    Zythia
    zythum
    Zyzomys
    Zyzzogeton

Does it number lines:

    $ ./cat -n Cat.hs  | tail
        80  
        81      (_,_,errs)      -> do
        82          hPutStrLn stderr (concat errs ++ usageInfo header flags)
        83          exitWith (ExitFailure 1)
        84  
        85      where header = "Usage: cat [-benstuv] [file ...]"
        86  
        87            set Dollar = [Dollar, Invisible]
        88            set Tabs   = [Tabs,   Invisible]
        89            set f      = [f]
What about non blank lines:
    $ ./cat -b Cat.hs | tail

        72      (_,_,errs)      -> do
        73          hPutStrLn stderr (concat errs ++ usageInfo header flags)
        74          exitWith (ExitFailure 1)

        75      where header = "Usage: cat [-benstuv] [file ...]"

        76            set Dollar = [Dollar, Invisible]
        77            set Tabs   = [Tabs,   Invisible]
        78            set f      = [f]
How about some visibility flags:
    $ cat -eb Cat.hs | tail
    $
        72      (_,_,errs)      -> do$
        73          hPutStrLn stderr (concat errs ++ usageInfo header flags)$
        74          exitWith (ExitFailure 1)$
    $
        75      where header = "Usage: cat [-benstuv] [file ...]"$
    $
        76            set Dollar = [Dollar, Invisible]$
        77            set Tabs   = [Tabs,   Invisible]$
        78            set f      = [f]$
And turning on all the flags:
    $ cat -bnvste Cat.hs | tail
    $
        72      (_,_,errs)      -> do$
        73          hPutStrLn stderr (concat errs ++ usageInfo header flags)$
        74          exitWith (ExitFailure 1)$
    $
        75      where header = "Usage: cat [-benstuv] [file ...]"$
    $
        76            set Dollar = [Dollar, Invisible]$
        77            set Tabs   = [Tabs,   Invisible]$
        78            set f      = [f]$

Nice!

Summary

Well, in the end I didn't get on to exception handling, or the use of bytestring to improve performance further. However, we have implemented (95%) of the unix 'cat' program, including all argument handling and functionality, in about an hour and a half.

Once it typechecked, the code just worked, except for one bug where I originally rendered newline before counting lines (simply because the spec was underspecified). Lesson: you can start writing your unix scripts in Haskell right now. They'll be flexible, clean, and easy to maintain. And most of all, fun to write!

Hopefully next time we'll look into using bytestrings for processing larger volumes of data, and the use of exception handling to deal with unusual errors.

The complete source

And just for reference, there's the complete source:

    import System.Console.GetOpt
    import System.IO
    import System.Exit
    import System.Environment
    import Data.List
    import Data.Char
    import Control.Monad
    import Text.Printf

    main = do
        (args, files) <- getArgs >>= parse
        when (Unbuffered `elem` args) $ hSetBuffering stdout NoBuffering
        mapM_ (cat args) files

    withFile s f = putStr . unlines . f . lines =<< open s
      where
        open f = if f == "-" then getContents else readFile f

    cat [] f = withFile f id
    cat as f = withFile f (newline . number . visible as)
      where
        number  s    = if Blanks `elem` as then numberSome s else ifset Number numberAll s
        newline s    = ifset Dollar (map (++"$")) s
        visible as s = foldl' (flip render) s as
        ifset a f    = if a `elem` as then f else id

    render Squeeze   = map head . group
    render Tabs      = map $ concatMap (\c -> if c == '\t' then "^I" else [c])
    render Invisible = map $ concatMap visible
      where
        visible c | c == '\t' || isPrint c = [c]
                  | otherwise              = init . tail . show $ c
    render _ = id

    numberLine      = printf "%6d  %s"
    numberAll s     = zipWith numberLine [(1 :: Integer)..] s
    numberSome s    = reverse . snd $ foldl' draw (1,[]) s
      where
        draw (n,acc) s
                | all isSpace s = (n,   s : acc)
                | otherwise     = (n+1, numberLine n s : acc)

    data Flag
        = Blanks                -- -b
        | Dollar                -- -e 
        | Squeeze               -- -s
        | Tabs                  -- -t
        | Unbuffered            -- -u
        | Invisible             -- -v
        | Number                -- -n
        | Help                  -- --help
        deriving (Eq,Ord,Enum,Show,Bounded)

    flags =
       [Option ['b'] []       (NoArg Blanks)
            "Implies the -n option but doesn't count blank lines."
       ,Option ['e'] []       (NoArg Dollar)
            "Implies the -v option and also prints a dollar sign (`$') at the end of each line."
       ,Option ['n'] []       (NoArg Number)
            "Number the output lines, starting at 1."
       ,Option ['s'] []       (NoArg Squeeze)
            "Squeeze multiple adjacent empty lines, causing the output to be single spaced."
       ,Option ['t'] []       (NoArg Tabs)
            "Implies the -v option and also prints tab characters as `^I'."
       ,Option ['u'] []       (NoArg Unbuffered)
            "The output is guaranteed to be unbuffered (see setbuf(3))."
       ,Option ['v'] []       (NoArg Invisible)
            "Displays non-printing characters so they are visible."
       ,Option []    ["help"] (NoArg Help)
            "Print this help message"
       ]

    parse argv = case getOpt Permute flags argv of
        (args,fs,[]) -> do
            let files = if null fs then ["-"] else fs
            if Help `elem` args
                then do hPutStrLn stderr (usageInfo header flags)
                        exitWith ExitSuccess
                else return (nub (concatMap set args), files)

        (_,_,errs)      -> do
            hPutStrLn stderr (concat errs ++ usageInfo header flags)
            exitWith (ExitFailure 1)

        where header = "Usage: cat [-benstuv] [file ...]"

              set Dollar = [Dollar, Invisible]
              set Tabs   = [Tabs,   Invisible]
              set f      = [f]

/home :: /haskell :: permalink :: rss

2006-12-17

Programming Haskell: String processing (with a dash of concurrency)

This is part two in a series of tutorials on programming Haskell. You can get up to speed by reading yesterday's introductory article.

Today we'll look more into the basic tools at our disposal in the Haskell language, in particular, operations for doing IO and playing with files and strings.

Administrivia

Before we get started, I should clarify a small point raised by yesterday's article. One issue I forgot to mention was that there are slight differences between running Haskell in ghci, the bytecode interpreter, and compiling it to native code with GHC.

Haskell programs are executed by evaluating the special 'main' function.

    import Data.List

    mylength = foldr (const (+1)) 0
    main = print (mylength "haskell")

To compile this to native code, we would feed the source file to the compiler:

    $ ghc A.hs
    $ ./a.out 
    7

For a faster turnaround, we can run the code directly through the bytecode interpreter, GHCi, using the 'runhaskell' program:

    $ runhaskell A.hs
    7

GHCi, the interactive Haskell environment, is a little bit different. As it is an interactive system, GHCi must execute your code sequentially, as you define each line. This is different to normal Haskell, where the order of definition is irrelevant. GHCi effectively executes your code inside a do-block. Therefore you can use the do-notation at the GHCi prompt to define new functions:

    $ ghci
    Prelude> :m + Data.List

    Prelude> let mylength = foldr (const (+1)) 0

    Prelude> :t mylength
    mylength :: [a] -> Integer

    Prelude> mylength "haskell"
    7

For this tutorial I will be developing code in a source file, and either compiling it as above, or loading the source file into GHCi for testing. To load a source file into GHCi, we do:

    $ ghci    
    Prelude> :load A.hs

    *Main> :t main
    main :: IO ()

    *Main> :t mylength
    mylength :: [a] -> Integer

    *Main> mylength "foo"
    3

    *Main> main
    7

Now, let's get into the code!

IO

As the Camel Book says:

Unless you're using artificial intelligence to model a solipsistic philosopher, your program needs some way to communicate with the outside world.

In yesterday's tutorial, I briefly introduced 'readFile', for reading a String from a file on disk. Let's consider now IO in more detail. The most common IO operations are defined in the System.IO library.

For the most basic stdin/stdout Unix-style programs in Haskell, we can use the 'interact' function:

    interact    ::  (String -> String) -> IO ()

This higher order function takes, as an argument, some function for processing a string (of type String -> String). It runs this function over the standard input stream, printing the result to standard output. A surprisingly large number of useful programs can be written this way. For example, we can write the 'cat' unix program as:

    main = interact id

Yes, that's it! Let's compile and run this program:

    $ ghc -O A.hs       

    $ cat A.hs | ./a.out
    main = interact id

How does this work? Firstly, 'interact' is defined as:

    interact f = do s <- getContents
                    putStr (f s)

So it reads a string from standard input, and writes to standard output the result of applying its argument function to that string. The 'id' function itself has the type:

    id :: a -> a

'id' is a function of one argument, of any type (the lowercase 'a' in the type means any type can be used in that position, i.e. it is a polymorphic function (also called a generic function in some languages)). 'id' takes a value of some type 'a', and returns a value of the same type. There's is only one (non-trivial) function of this type:

    id a = a

So 'interact id' will print to the input string to standard output unmodified.

Let's now write the 'wc' program:

    main    = interact count
    count s = show (length s) ++ "\n"

This will print the length of the input string, that is, the number of chars:

    $ runhaskell A.hs < A.hs
    57

Line oriented IO

Only a small number of programs operate on unstructured input streams. It is far more common to treat an input stream as a list of lines. So let's do that. To break a string up into lines, we'll use the ... 'lines' function, defined in the Data.List library:

    lines :: String -> [String]

The type, once again, tells the story. 'lines' takes a String, and breaks it up into a list of strings, splitting on newlines. To join a list of strings back into a single string, inserting newlines, we'd use the ... 'unlines' function:

    unlines :: [String] -> String

There are also similar functions for splitting on words, namely 'words' and 'unwords'. Now, an example. To count the number of lines in a file:

    main = interact (count . lines)

We can run this as:

    $ ghc -O A.hs

    $ ./a.out < A.hs
    3

Here we reuse the 'count' function from above, by composing it with the lines function.

On composition

This nice code reuse via composition is achieved using the (.) function, pronounced 'compose'. Let's look at how that works. (Feel free to skip this section, if you want to just get things done).

The (.) function is just a normal everyday Haskell function, defined as:

    (.) f g x = f (g x)

This looks a little like magic (or line noise), but its pretty easy. The (.) function simply takes two functions as arguments, along with another value. It applies the 'g' function to the value 'x', and then applies 'f' to the result, returning this final value. The functions may be of any type. The type of (.) is actually:

    (.) :: (b -> c) -> (a -> b) -> a -> c

which might look a bit hairy, but it essentially specifies what types of arguments make sense to compose. That is, only those where:

    f :: b -> c
    g :: a -> b
    x :: a

can be composed, yielding a new function of type:

    (f . g) :: a -> c

The nice thing is that this composition makes sense (and works) for all types a, b and c.

How does this relate to code reuse? Well, since our 'count' function is polymorphic, it works equally well counting the length of a string, or the length of a list of strings. Our littler 'wc' program is the epitome of the phrase: "higher order + polymorphic = reusable". That is, functions which take other functions as arguments, when combined with functions that work over any type, produce great reusable 'glue'. You only need vary the argument function to gain terrific code reuse (and the strong type checking ensures you can only reuse code in ways that work).

More on lines

Another little example, let's reverse each line of a file (like the unix 'rev' command):

    main = interact (unlines . map reverse . lines)
Which when run, reverses the input lines:
    $ ./a.out < B.hs
    rahC.ataD tropmi
    ebyaM.ataD tropmi
    tsiL.ataD tropmi

So we take the input string, split it into lines, and the loop over that list of lines, reversing each of them, using the 'map' function. Finally, once we've reversed each line, we join them back into a single string with unlines, and print it out.

The 'map' function is a fundamental control structure of functional programming, similar to the 'foreach' keyword in a number of imperative languages. 'map' however is just a function on lists, not built in syntax, and has the type:

    map :: (a -> b) -> [a] -> [b]

That is, it takes some function, and a list, and applies that function to each element of the list, returning a new list as a result. Since loops are so common in programming, we'll be using 'map' a lot. Just for reference, 'map' is implemented as:

    map _ []     = []
    map f (x:xs) = f x : map f xs

File IO

Operating on stdin/stdout is good for scripts (and this is how tools like sed or perl -p work), but for 'real' programs we'll at least need to do some file IO. The basic operations of files are:

    readFile  :: FilePath -> IO String
    writeFile :: FilePath -> String -> IO ()

'readFile' takes a file name as an argument, does some IO, and returns the file's contents as a string. 'writeFile' takes a file name, a string, and does some IO (writing that string to the file), before returning the void (or unit) value, ().

We could implement a 'cp' program on files, as:

    import System.Environment

    main = do
        [f,g] <- getArgs
        s     <- readFile f
        writeFile g s

Running this program:

    $ ghc -O A.hs

    $ ./a.out A.hs Z.hs

    $ cat Z.hs
    import System.Environment

    main = do
        [f,g] <- getArgs
        s     <- readFile f
        writeFile g s

Since we're doing IO (the type of readFile and writeFile enforce this), the code runs inside a do-block, using the IO monad. "Using the IO monad" just means that we wish to use an imperative, sequential order of evaluation. (As an aside, a wide range of other monads exist, for programming different program evaluation strategies, such as Prolog-style backtracking, or continutation-based evaluation. All of imperative programming is just one subset of possible evaluation strategies you can use in Haskell, via monads).

In do-notation, whenever you wish to run an action, for its side effect, and save the result to a variable, you would write:

    v <- action

For example, to run the 'readFile' action, which has the side effect of reading a file from disk, we say:

    s <- readFile f

Finally, we can use the 'appendFile' function to append to an existing file.

File Handles

The most generic interface to files is provided via Handles. Sometimes we need to keep a file open, for multiple reads or writes. To do this we use Handles, an abstraction much like the underlying system's file handles.

To open up a file, and get its Handle, we use:

    openFile :: FilePath -> IOMode -> IO Handle

So to open a file for reading only, in GHCi:

    Prelude System.IO> h <- openFile "A.hs" ReadMode
    {handle: A.hs}

Which returns a Handle onto the file "A.hs". We can read a line from this handle:

    Prelude System.IO> hGetLine h
    "main = do"

To close a Handle, and flush the buffer:

    hClose :: Handle -> IO ()

Once a Handle is closed, we can no longer read from it:

    Prelude System.IO> hClose h
    Prelude System.IO> hGetLine h
    *** Exception: A.hs: hGetLine: illegal operation (handle is closed)

We can also flush explicitly with:

    hFlush :: Handle -> IO ()

Other useful operations for reading from Handles:

    hGetChar     :: Handle -> IO Char
    hGetLine     :: Handle -> IO [Char]
    hGetContents :: Handle -> IO [Char]

To write to Handles:

    hPutChar    :: Handle -> Char -> IO ()
    hPutStr     :: Handle -> [Char] -> IO ()
    hPutStrLn   :: Handle -> [Char] -> IO ()
    hPrint      :: Show a => Handle -> a -> IO ()

Some other useful actions:

    hSeek     :: Handle -> SeekMode -> Integer -> IO ()
    hTell     :: Handle -> IO Integer
    hFileSize :: Handle -> IO Integer
    hIsEOF    :: Handle -> IO Bool

An example: spell checking

Here is a small example of combining the Data.Set and List data structures from yesterday's tutorial, with more IO operations. We'll implement a little spell checker, building the dictionary in a Set data type. First, some libraries to import:

    import System.Environment
    import Control.Monad
    import Data.Set

And the complete program:

    main = do
        [s] <- getArgs
        f   <- readFile "/usr/share/dict/words"
        g   <- readFile s
        let dict = fromList (lines f)
        mapM_ (spell dict) (words g)

    spell d w = when (w `notMember` d) (putStrLn w)

Running this program, on its own source, and it reports the following words are not found in the dictionary:

    $ ghc -O Spell.hs -o spell

    $ ./spell A.hs
    Data.Char
    =
    <-
    (map
    toUpper
    n)
    =
    <-
    getLine
    1

Writing the results out

If we wanted to write the results out to a temporary file, we can do so. Let's import a couple of other modules:

    import Data.Set
    import Data.Maybe
    import Text.Printf
    import System.IO
    import System.Environment
    import System.Posix.Temp

Refactoring the main code to separate out the reading and writing phases in to their own function, we end up with the core code:

    main = do
        (f, g) <- readFiles
        let dict = fromList (lines f)
            errs = mapMaybe (spell dict) (words g)
        write errs

    spell d w | w `notMember` d = Just w
              | otherwise       = Nothing

Where reading is now its own function:

    readFiles = do
        [s] <- getArgs
        f   <- readFile "/usr/share/dict/words"
        g   <- readFile s
        return (f,g)

And writing errors out to their own file:

    write errs = do
        (t,h) <- mkstemp "/tmp/spell.XXXXXX"
        mapM_ (hPutStrLn h) errs
        hClose h
        printf "%d spelling errors written to '%s'\n" (length errs) t
Pretty simple! Running this program:
    $ ghc --make -O Spell.hs -o myspell
    [1 of 1] Compiling Main             ( Spell.hs, Spell.o )
    Linking myspell ...

    $ ./myspell Spell.hs
    67 spelling errors written to '/tmp/spell.ia8256'

Extension: using SMP parallelism

Finally, just for some bonus fun ... and hold on to your hat 'cause I'm going to go fast ... we'll add some parallelism to the mix.

Haskell was designed from the start to support easy parallelisation, and since GHC 6.6, multithreaded code will run transparently on multicore systems using as many cores as you specify. Let's look at how we'd parallelise our little program to exploit multiple cores. We'll use an explicit threading model, via Control.Concurrent. You can also make your code implicitly parallel, using Control.Parallel.Strategies, but we'll leave that for another tutorial.

Here's the source, for you to ponder. First some imports:

    import Data.Set hiding (map)
    import Data.Maybe
    import Data.Char
    import Text.Printf
    import System.IO
    import System.Environment
    import Control.Concurrent
    import Control.Monad

The entry point, modified to break the word list into chunks, and then dispatching a chunk to each thread:

    main = do
        (f, g, n) <- readFiles
        let dict = fromList (lines f)
            work = chunk n (words g)
        run n dict work

The 'run' function sets up a channel between the main thread and all children thread ('errs'), and prints spelling errors as they arrive on the channel from the children. It then forks off 'n' children threads on each piece of the work list:

    run n dict work = do
        chan <- newChan
        errs <- getChanContents chan    -- errors returned back to main thread
        mapM_ (forkIO . thread chan dict) (zip [1..n] work)
        wait n errs 0

The main thread then just waits on all the threads to finish, printing any spelling errors they pass up:

    wait n xs i = when (i < n) $ case xs of
        Nothing : ys -> wait n ys $! i+1
        Just s  : ys -> putStrLn s >> wait n ys i

Each thread spell checks its own piece of the work list. If it finds a spelling error, it passes the offending word back over the channel to the main thread.

    thread chan dict (me, xs) = do
        mapM_ spellit xs
        writeChan chan Nothing

     where
        spellit w = do
            when (spell dict w) $
                writeChan chan . Just $ printf "Thread %d: %-25s" (me::Int) w

The 'spell' function is simplified a little:

    spell d w = w `notMember` d
which we could also write as:
    spell = flip notMember

We modify the readFiles phase to take an additional numeric command line argument, specifying the number of threads to run:

    readFiles = do
        [s,n] <- getArgs
        f     <- readFile "/usr/share/dict/words"
        g     <- readFile s
        return (f,g, read n)

We compile this with the GHC SMP parallel runtime system:

    $ ghc -O --make -threaded Spell.hs -o spell

Now, we can run 'n' worker threads (lightweight Haskell threads), mapped onto 'm' OS threads. Since I'm using a 4 core linux server, we'll play around with 4 OS threads. First, running everything in a single thread:

    $ time ./spell test.txt 1 +RTS -N1
    ...
    Thread 1: week:                    
    Thread 1: IO!
    ./spell test.txt 1 +RTS -N1 99% cpu 2.533 total

Ok, now we change the command line flag to run it with 4 OS threads, to try to utilise all 4 cores:

    $ time ./spell 4 +RTS -N4
    ...
    Thread 2: week:                    
    Thread 3: IO!
    ./spell test.txt 4 +RTS -N4 111% cpu 2.335 total

Ok. Good... A little bit faster, uses a little bit more cpu. It turns out however the program is bound currently by the time spent in the main thread building the initial dictionary. Actual searching time is only some 10% of the running time. Nonetheless, it was fairly painless to break up the initial simple program into a parallel version.

If the program running time was extended (as the case for a server), the parallelism would be a win. Additionally, should we buy more cores for the server, all we need to is change the +RTS -N argument to the program, to start utilising these extra cores.

Next week

In upcoming tutorials we'll look more into implicitly parallel programs, and the use of the new high performance ByteString data type for string processing.

/home :: /haskell :: permalink :: rss

2006-12-16

Programming Haskell

Its about time we got some job done in Haskell, eh? Now, one of my favourite programming books as an undergraduate was the Camel Book, "Programming Perl". It was full of lots of practical examples of Perl code, written well. (And I'm grateful to Larry Wall, Tom Christiansen and Randal Schwartz for writing the book that made programming fun).

So what would it look like if we wrote a Haskell tutorial in this style? Let's have at it!

Getting started

Like some languages Haskell can be both compiled and interpreted. The most widely used implementation of Haskell currently is GHC, which provides both an optimising native code compiler, and an interactive bytecode interpreter. I'll be using GHC (or its interactive front end, GHCi, for all code. So grab a copy of GHC now, from your package system, or the GHC home page.

Start up GHCi:

    $ ghci
       ___         ___ _
      / _ \ /\  /\/ __(_)
     / /_\// /_/ / /  | |      GHC Interactive, version 6.6, for Haskell 98.
    / /_\\/ __  / /___| |      http://www.haskell.org/ghc/
    \____/\/ /_/\____/|_|      Type :? for help.

    Loading package base ... linking ... done.
    Prelude> 

The interpreter now waits for your code fragments. The "Prelude" prompt indicates which library modules are in scope, and in this case, only the basic language module, known as the Prelude.

Now we can start running Haskell code.

    Prelude> "G'day, world!"
    "G'day, world!"

    Prelude> putStrLn "G'day, world!"
    G'day, world!

You can compile this code to a native binary using GHC, by writing in a source file:

    main = putStrLn "G'day, world!"

and then compiling the source to native code. Assuming your file is A.hs:

    $ ghc A.hs

This produces a new executable, ./a.out (a.out.exe on windows), which you can run like any other program on your system:

    $ ./a.out 
    G'day, world!

Variables

We can name arbitrary fragments of Haskell using variables. Like so:

    phrase = "G'day, world!"
    main = putStrLn phrase

We don't have to define what type phrase is, as Haskell uses type inference to infer at compile time the types of all expressions in the program. As "G'day, world!" is a string, so must phrase be a string. There are a bunch of basic types of values to play with. Here's a small sample:

    answer      = 42
    pi          = 3.141592653589793238462643383279502884197169399375105820974944592
    avocados    = 6.02e23
    pet         = "Lambda"
    sign        = "I love my " ++ pet
    coat        = "It costs $100"
    hence       = "whence"
    thence      = hence
    moles       = 2.5
    x           = moles * avocados
    c           = '#'
    pair        = (2.5, "lambdas")
    list        = [5,6,4,3,1]
    options     = Just "done"
    failed      = Nothing
    void        = ()

One important thing to remember is that Haskell's variables, like in most functional programming languages, are like variables in mathematics, and are just names for expressions. They're explicitly not mutable boxes, like in most imperative programming languages. As a result, you never need to worry about initialising a Haskell variable, nor do you need to worry about the current value in a variable: it always has the same value, and can always be replaced with its definition. So the following behaves just like it would in maths:

    answer      = 42
    another     = answer + 1
    more        = another + answer
    main        = print more
That is,
    $ ghc A.hs
    $ ./a.out 
    85

Now, since variables are just names for program fragments, you can evaluate Haskell on paper by replacing all names with their definition, until you reach a final value, like so:

    main = print more 
  =>
    main = print (another + answer)
  =>
    main = print ((answer + 1) + answer)
  =>
    main = print ((answer + 1) + 42)
  =>
    main = print ((42 + 1) + 42)
  =>
    main = print (43 + 42)
  =>
    main = print 85
  =>
    85

Having such a simple system for variables allows for a wide range of interesting optimisations, and makes understanding what a program is doing at any point much easier, since you don't have to worry about what state a variable might currently be in. (Of course, some problems need (theadsafe) mutable boxes, and they're available as a library for when you need that).

Collections

Often you need to collect a bunch of values together into some kind of collection. Haskell has many many collection types, but in particular, it has lists and finite maps, which operate much like arrays and hashes of the imperative world.

Lists

A list is just an ordered, um, list of values. They can be nested, and transformed in all sorts of ways, using functions. Assuming your file, A.hs, contains:

    home  = ["couch", "chair", "table", "stove"]

We can play around with this stuff like so:

    $ ghci A.hs

    *Main> home
    ["couch","chair","table","stove"]

    *Main> head home
    "couch"

    *Main> tail home
    ["chair","table","stove"]

    *Main> last home  
    "stove"

    *Main> home !! 2
    "table"

    *Main> reverse home
    ["stove","table","chair","couch"]

    *Main> map reverse home
    ["hcuoc","riahc","elbat","evots"]

Loading in the List library gives us some more functions to use:

    *Main> :m + Data.List

    *Main Data.List> intersperse "#" home
    ["couch","#","chair","#","table","#","stove"]

    *Main Data.List> concat (intersperse "#" home)
    "couch#chair#table#stove"

    *Main Data.List> home \\ ["table","stove"]
    ["couch","chair"]

Finite Maps

Finite maps (or maps) are the lookup tables of purely functional programming. Whenever you'd use some kind of hash in an imperative language, you can replace it with a Map in Haskell. Like hashes, maps can be seen as a table of pairs of keys and values. You can declare a new map:

    import Data.Map

    days = fromList
        [ ("Sun",  "Sunday"     )
        , ("Mon",  "Monday"     )
        , ("Tue",  "Tuesday"    )
        , ("Wed",  "Wednesday"  )
        , ("Thu",  "Thursday"   )
        , ("Fri",  "Friday"     )
        , ("Sat",  "Saturday"   ) ]
You can also convert a map to a list, using (well, duh!) toList:
    *Main> toList days
    [("Fri","Friday"),("Mon","Monday"),("Sat","Saturday"),("Sun","Sunday"),("Thu","Thursday"),("Tue","Tuesday"),("Wed","Wednesday")]
Note that they come out unordered, just like in hashes. If you just want the keys of the map:
    *Main> keys days
    ["Fri","Mon","Sat","Sun","Thu","Tue","Wed"]

    *Main> elems days
    ["Friday","Monday","Saturday","Sunday","Thursday","Tuesday","Wednesday"]

Since maps are a good structure for looking up values, you can search them using the lookup function. This function returns the element, if found:

    *Main> Data.Map.lookup "Tue" days
    "Tuesday"

Since the name 'lookup' is also used by a list function of similar purpose in the Prelude, we use the qualified name here to disambiguate which 'lookup' to use.

On failure

But what happens if the key is not found? (Feel free to skip this section if you don't care about errors yet) lookup will then fail, and how it fails depends on what kind of failure you want. Haskell goes to great lengths to make programming for failure flexible. For example, to fail with an exception:

    *Main> Data.Map.lookup "Thor" days
    *** Exception: user error (Data.Map.lookup: Key not found)

Which is the same as failing with an IO error. We can specify this specifically with a type annotation, to say "fail with an IO error":

    *Main> Data.Map.lookup "Thor" days :: IO String
    *** Exception: user error (Data.Map.lookup: Key not found)

Often you might instead prefer that some special value is returned on failure:

    *Main> Data.Map.lookup "Thors" days :: Maybe String
    Nothing

Maybe you'd just like an empty list:

    *Main> Data.Map.lookup "Thor" days :: [String]
    []
Finally, you can always provide an explicit default value:
    *Main> findWithDefault "Not found" "Thor" days            
    "Not found"
Failure is entirely under your control!

Actions

Now, real programs interact with the outside world. They call functions which do IO, as a side effect, and then return some value. In Haskell, functions with side effects are often called actions, to distinguish them from normal Haskell functions (which behave like mathematical functions: they take inputs and return a result, with no side effects). Programming with side effects is carefully handled in Haskell, again to control the possibility of errors, and all functions which have side effects have a special type: the IO type.

For example, the function to print a string has the following type (and you can ask the interpreter for the type interactively):

    Prelude> :t putStr
    putStr :: String -> IO ()

which tells you that this function takes a String as an argument, does some IO side effect, and returns the null value. It is equivalent to the following C type:

    void putStr(char *);

but with a bit of extra information, namely, that the function does some IO. We would print out some element of our map like so:

    main = print ("Tue in long form is " ++ findWithDefault "Not found" "Tue" days)

    *Main> main
    "Tue in long form is Tuesday"

An example

One of the classic programming puzzles for introducing real world problems is the 'class grade' problem. You have a text file containing a list of student names and their grades, and you'd like to extract various information and display it. In deference to The Camel Book, we'll follow this lead, and start with a file "grades", containing something like this:

    Alonzo 70
    Simon 94
    Henk 79
    Eugenio 69
    Bob 80
    Oleg 77
    Philip 73
    ...

Student's appear multiple times, with entries for each of their subjects Let's read this file, populate a map with the data, and print some statistical information about the results. First thing to do is import some basic libraries:

    import Data.Char
    import Data.Maybe
    import Data.List
    import Data.Map hiding (map)
    import Text.Printf
And now here's the entire program, to read the grades file, compute all the averages, and print them:
    main = do
        src <- readFile "grades"
        let pairs   = map (split.words) (lines src)
            grades  = foldr insert empty pairs
        mapM_ (draw grades) (sort (keys grades))
      where
        insert (s, g) = insertWith (++) s [g]
        split [name,mark] = (name, read mark)

    draw g s = printf "%s\t%s\tAverage: %f\n" s (show marks) avg
      where
        marks = findWithDefault (error "No such student") s g
        avg   = sum marks / fromIntegral (length marks) :: Double

Running it

How do we run this program? There's lots of ways:

Compile it to native code

    $ ghc -O Grades.hs

    $ ./a.out 
    Alonzo  [70.0,71.0]     Average: 70.5
    Bob     [80.0,88.0]     Average: 84.0
    Eugenio [69.0,98.0]     Average: 83.5
    Henk    [79.0,81.0]     Average: 80.0
    Oleg    [77.0,68.0]     Average: 72.5
    Philip  [73.0,71.0]     Average: 72.0
    Simon   [94.0,83.0]     Average: 88.5

Run it in the bytecode interpreter

    $ runhaskell Grades.hs
    Alonzo  [70.0,71.0]     Average: 70.5
    Bob     [80.0,88.0]     Average: 84.0
    Eugenio [69.0,98.0]     Average: 83.5
    Henk    [79.0,81.0]     Average: 80.0
    Oleg    [77.0,68.0]     Average: 72.5
    Philip  [73.0,71.0]     Average: 72.0
    Simon   [94.0,83.0]     Average: 88.5

Execute it interactively

    $ ghci Grades.hs
    Prelude Main> main
    Alonzo  [70.0,71.0]     Average: 70.5
    Bob     [80.0,88.0]     Average: 84.0
    Eugenio [69.0,98.0]     Average: 83.5
    Henk    [79.0,81.0]     Average: 80.0
    Oleg    [77.0,68.0]     Average: 72.5
    Philip  [73.0,71.0]     Average: 72.0
    Simon   [94.0,83.0]     Average: 88.5

Make the script executable

Under unix, you can use the #! convention to make a script executable. Add the following to the top of the source file:

    #!/usr/bin/env runhaskell

And then set the script executable:

    $ chmod +x Grades.hs

    $ ./Grades.hs       
    Alonzo  [70.0,71.0]     Average: 70.5
    Bob     [80.0,88.0]     Average: 84.0
    Eugenio [69.0,98.0]     Average: 83.5
    Henk    [79.0,81.0]     Average: 80.0
    Oleg    [77.0,68.0]     Average: 72.5
    Philip  [73.0,71.0]     Average: 72.0
    Simon   [94.0,83.0]     Average: 88.5

Next week: more IO!

/home :: /haskell :: permalink :: rss

2006-12-15

More on syntax

Yesterday's article, about programming language syntax in an ideal world, provoked a small storm of controversy over on reddit.

Unfortunately, since there were a couple of other Haskell articles on the same page, the article's argument, that languages of the future should look like pseudocode, was somewhat lost. A number of readers assumed the article was a piece of Haskell advocacy, which wasn't the intention. Instead, it was a plea for the syntax of future languages to be optimised for writing by humans, and not for parsing by compilers. That is:

The languages of the future should look like pseudocode

I'll cache the original example of "future syntax" here, so you can keep it in mind:

    fib 0 = 0
    fib 1 = 1
    fib n = fib (n-1) + fib (n-2)

Note that this is an argument for syntax, independent of whatever language you want to use. And it ignores all efficiency issues in current languages (that's a job for the compiler designers of the future).

Of course, programmers being very practical people took this as an opportunity to demonstrate how their language of choice (and being reddit, this means lisp) already provides this "future syntax". Well, sort of:

    (define fib
      (match-lambda
        [0 0]
        [1 1]
        [n (+ (fib (- n 1)) (fib (- n 2)))]))
    (define (fibo x)
      (cond
        ((= 0 x) 0)
        ((= 1 x) 1)
        (else (+ (fibo (- x 1))
                 (fibo (- x 2))))))
    (define (fibo x)
      (case x ((0 1) 1)
              (else (+ (fibo (- x 1)) (fibo (- x 2))))))
    (defmethod fib ((n (eql 0))) 0)
    (defmethod fib ((n (eql 1))) 1)
    (defmethod fib (n)
          (+ (fib (- n 1)) (fib (- n 2))))
    (define (fibo x)
       (if (< x 2)
           x
         (+ (fibo (- x 1)) (fibo (- x 2)))))
    (let ((fibs (make-hash-table)))
       (defun fib (n)
          (if (< n 2) n
              (or (gethash n fibs)
              (setf (gethash n fibs)
                    (+ (fib (- n 1)) (fib (- n 2))))))))

Others focused on improving the particular languages I cited. The best contribution on this issue came from Greg Buchholz, who illustrated neat implementations in a variety of languages, far more flattering than the examples I used (which, incidentally, were not my creation, they were wikipedia's ....)

Perhaps unsurprisingly (in hindsight) plenty of readers assumed the plea for future syntax was a plea for "Haskell now!", and talked at length about efficiency concerns, and the lack of real world code/IO in the examples:

    all of these are monstrously inefficient...
    > let fibs = 1 : 1 : zipWith (+) (fibs) (tail fibs)
    > take 10 fibs
    [1,1,2,3,5,8,13,21,34,55]

    That's surely more efficient
    And kinda memory eating :-)

    You'd better choose something like

    fib = fib_iter 1 1
      where
        fib_iter x y n | n == 1 = x
                       | otherwise = fib_iter y (x + y) (n - 1)
    there's a much more efficient way to do it in Hs using the language's
    lazyness (the ability to recursively define infinite lists):

    fibs = 1 : 1 : zipWith (+) (fibs) (tail fibs)

And, this being the interblargh, some people missed the point entirely:

    This code doesn't have any resemblance to everyday programming however -
    which generally involves trawling through databases, string matching, and I/O.
    What is the point of showing examples which do not reflect reality of
    most(?) programing tasks?
    I've been a ASM/C++/C# developer for over a year now. But seriously, how
    stupid is this? You are going to tout a language strength due to how
    efficiently it can work out the Fibonacci sequence?
    That is not a real-world example, and if you were to use a real-world
    example, this code would be horrific.

So what can be made of all this? I think that it can be objectively stated that current programming languages, for the most part, don't look like ideal pseudocode languages, with syntax often bogged down in language features, and parser hints. Overuse of keywords (defun,define,def,let rec), extraneous typing hints (like $x), and parser hints (';' and '{' '}'), all obscure the underlying code. Step back for a second: you're 10 years old, and you're hoping to become a cool hacker. What syntax do you wish you could use?

Now Python, Haskell and the Lisp family seem to be the closest to where we want to go, but they're not there yet. Even in Haskell, where years were spent sorting out the syntax, and as a result has some nice features for human programmers (pattern matching on data structures, and lovely whitespace), you can still end up writing awkward code (and this is entirely my own fault):

    withGS :: (g -> (g -> LB ()) -> LB ()) -> ModuleT (GlobalPrivate g p) LB ()
    withGS f = withMS $ \state writer ->
      f (global state) $ \g -> writer $ state { global = g }

    readGS :: ModuleT (GlobalPrivate g p) LB g
    readGS = global `fmap` readMS

    writeGS :: g -> ModuleT (GlobalPrivate g p) LB ()
    writeGS g = withGS (\_ writer -> writer g)

    forkLB :: LB a -> LB ()
    forkLB f = (`liftLB` f) $ \g -> do
                forkIO $ do
                    timeout (15 * 1000 * 1000) g
                    return ()

    liftLB :: (IO a -> IO b) -> LB a -> LB b
    liftLB f = LB . mapReaderT f . runLB

So finally, if you're in any way involved in the design of syntax for a new language, or fixing up the syntax of existing languages, please think hard about what you would have wished to program in, had you never been exposed to the languages of today. Programming languages don't have to be ugly and noisy! Concentrate on what the human author of the code would like to write: code that is aesthetically pleasing and even beautiful, and don't compromise your syntax, for the short term gain of making parsing easier.

/home :: /haskell :: permalink :: rss

2006-12-14

On syntax

Forget about what programming languages you know. What syntax do you wish you could write in? (Names have been changed to protect the innocent):

This?

    function generate_fibonacci_sequence( $length ) {
        for( $l = array(1,1), $i = 2, $x = 0; $i < $length; $i++ )
                $l[] = $l[$x++] + $l[$x];
    
        return $l;
    }

What about this?

    sub fibo
    {
        my ($n, $a, $b) = (shift, 0, 1);
        ($a, $b) = ($b, $a + $b) while $n-- > 0;
        $a;
    }

Getting better?

    def fib(n):
        if   n < 2: return 1
        else      : return fib(n - 1) + fib(n - 2)
Hmm, isn't this going backwards?
    (define fibo
     (lambda (x)
       (if (< x 2)
         x
         (+ (fibo (- x 1)) (fibo (- x 2))))))

Finally:

    fib 0 = 0
    fib 1 = 1
    fib n = fib (n-1) + fib (n-2)

If the programming languages of the future don't look like the last example, I think we've failed. Programming languages are notation for describing solutions to problems. If that notation is verbose, clunky or full of special cases it should be abandoned in favour of a more concise notation. Real languages should look like pseudcode!

My suspicion is that in 50 years time we'll look back on the syntax of today's programming languages and think them highly bizarre, like looking back at Frege's logic notation (pdf):

Syntax has come along way since the 1960s, but there's still room for much improvement, looking at the widely used languages employed today. The designers of many (not all) popular languages seem to forget that syntax should be optimised for writing by humans, not for parsing by machines.

/home :: /haskell :: permalink :: rss

Type declarations made mechanical

Haskell uses type inference, so explicit type declarations are rarely required. This:

    import Control.Monad.Fix
    fibs = fix ((1:) . scanl (+) 1)
    main = print (take 20 fibs)

Is just as good as:

    import Control.Monad.Fix

    fibs :: [Integer]
    fibs = fix ((1:) . scanl (+) 1)

    main :: IO ()
    main = print (take 20 fibs)

Running this:

    $ runhaskell A.hs
    [1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987,1597,2584,4181,6765]

Now, once programs reach a decent size, the advantage of type declarations appears: it functions as machine-checkable documentation. So people new to your code can more quickly work out what the code is doing.

Contrast this (real world) Haskell code from a network client:

    accessorMS decompose f = withMS $ \s writer ->
      let (t,k) = decompose s in f t (writer . k)

And with type annotations:

    accessorMS :: (s -> (t, t -> s))
               -> (t -> (t -> LB ()) -> LB a)
               -> ModuleT s LB a

    accessorMS decompose f = withMS $ \s writer ->
      let (t,k) = decompose s in f t (writer . k)

So at least you know have some idea of what that code does.

There's an intuition here: type declarations are good, but they can be mechanically inferred. Let's automate that then! Here's a quick script I use every day. It just passes your top level declaration to ghci, and asks it to infer the type. The resulting type signature is spliced back in to your code:

    #!/bin/sh

    # input is a top level .hs decls

    FILE=$*
    DECL=`cat`
    ID=`echo $DECL | sed 's/^\([^ ]*\).*/\1/'`
    echo ":t $ID" | ghci -v0 -cpp -fglasgow-exts -w $FILE
    echo $DECL

Save this as an executable shell file in your path. Now you can call this from your editor. For example, from vim, you'd use the following .vimrc:

    :map ty :.!typeOf %

Hitting :ty while the cursor is positioned on top of a top level declaration takes your code from:

    writePS who x = withPS who (\_ writer -> writer x)

to this:

    writePS :: forall p g. String -> Maybe p -> ModuleT (GlobalPrivate g p) LB ()
    writePS who x = withPS who (\_ writer -> writer x)

I can't emphasise enough how useful this is. So, steal this code and improve your productivity!

/home :: /haskell :: permalink :: rss

Type inference by hand

IPI asked this how #haskell:

How can I mentally infer the type of the composition of these functions?
    a :: [[Char]] -> [([Char],Int,Int)]
    b :: [Int] -> [(Int,Bool)]
    c :: [([Char],Int,Int)] -> ([Char],Int)
    d :: [(Int,Bool)] -> [[Char]]

That is, what is the type of:

    c . a . d . b

The first thing I'd do is rename the tricky types:

    a :: [String] -> [(String,Int,Int)]
    b :: [Int] -> [(Int,Bool)]
    c :: [(String,Int,Int)] -> (String,Int)
    d :: [(Int,Bool)] -> [String]

and then introduce some type synonyms:

    type A = [String]
    type B = [(String,Int,Int)]
    type C = [(Int,Bool)]

Now we can rename the complex types to a more symbolic form:

    a :: A -> B
    b :: [Int] -> C
    c :: B -> (String,Int)
    d :: C -> A

and now we can see how to combine the pieces. It's a little jigsaw puzzle!

    b :: [Int] -> C

so there's only one option to compose with b, namely

    b :: [Int] -> C
    d :: C -> A

composing these, and we hide the intermediate result:

    d . b :: [Int] -> A

now, we need something that takes an A:

    a :: A -> B

So the result of something composed with a, will be a value of type B:

    a . d . b :: [Int] -> B

and finally, something that takes a B:

    c :: B -> (String,Int)

leading to:

    c . a . d . b :: [Int] -> (String,Int)

and we're done!

/home :: /haskell :: permalink :: rss

Theory and practice: where Haskell is heading

Seen on the haskell cafe:

One of the most exciting aspects of Haskell is that pragmatic interest in the language has been growing steadily without academic interest in it declining in any way. As a result, we have a language that represents an interesting mixture of good and useful, although it is not entirely clear yet how long this nice balance will hold.
We have had lots of languages that were intended to be well-designed (good, beautiful, ..), but never much used in practice, and we have also had lots of languages that were intended to be pragmatic (practical, useful, ..), without much interest in theoretical beauty. But how many languages are there where the two aspects have converged, with both communities still actively interested in the result?

Its exciting working in Haskell right now, with new practical tools, libraries and jobs appearing daily, and yet at the same time the theory guys keep cranking out new abstractions for us: applicative functors, software transactional memory, stream fusion...

We have the technology!

/home :: /haskell :: permalink :: rss

2006-12-11

The lambda revolution

This article describes the entire process of writing, packaging and releasing a new Haskell library. If you love programming Haskell, and want to keep doing so, you'll write some libraries and release them. The lambda revolution isn't going to happen without your code!

Creating Data.DList, it took about 60 minutes. People often need O(1) append, so about time we had a library for it. Here's the entire transcript of creating a new Haskell project and releasing it to a public server. I then created a web page for the project, and announced it on the haskell@ mailing list. I used the following tools:

Ready? Here we go!


Script started on Mon Dec 11 13:47:40 2006

$ cd dons/src
$ mkdir dlist
$ cd dlist

$ mkcabal
Project name: dlist
What license ["GPL","LGPL","BSD3","BSD4","PublicDomain","AllRightsReserved"] ["BSD3"]:
What kind of project [Executable,Library] [Executable]: Library
Is this your name? - "Don Stewart " [Y/n]:
Is this your email address? - "<dons@cse.unsw.edu.au>" [Y/n]:
Created Setup.lhs and dlist.cabal

$ cp ~/dons/src/yi/Yi/Lexers.hs .
$ ls
Lexers.hs   Setup.lhs   dlist.cabal

$ mkdir Data
$ mv Lexers.hs Data/DList.hs

$ vim Data/DList.hs
$ vim dlist.cabal

$ darcs init
$ darcs add dlist.cabal Setup.lhs Data/ Data/DList.hs
$ darcs record --all
Shall I record this change? (1/?)  [ynWsfqadjkc], or ? for help: a
What is the patch name? Initial import of dlist package
Do you want to add a long comment? [yn]n
Finished recording patch 'Initial import of dlist package'

$ runhaskell Setup.lhs configure --prefix=/home/dons
Setup: Warning: No exposed modules or executables in this package.
Setup: Error: Non-empty library, but empty exposed modules list. 
Cabal may not build this library correctly

$ vim dlist.cabal

$ runhaskell Setup.lhs configure --prefix=/home/dons
Configuring dlist-0.1...
configure: /home/dons/bin/ghc-pkg
configure: Dependency base-any: using base-2.0
configure: Using install prefix: /home/dons
configure: Binaries installed in: /home/dons/bin
configure: Libraries installed in: /home/dons/lib/dlist-0.1/ghc-6.6
configure: Private binaries installed in: /home/dons/libexec
configure: Data files installed in: /home/dons/share/dlist-0.1
configure: Using compiler: /home/dons/bin/ghc
configure: Compiler flavor: GHC
configure: Compiler version: 6.6
configure: Using package tool: /home/dons/bin/ghc-pkg
configure: Using ar found on system at: /usr/bin/ar
configure: Using haddock found on system at: /home/dons/bin/haddock
configure: No pfesetup found
configure: Using ranlib found on system at: /usr/bin/ranlib
configure: Using runghc found on system at: /home/dons/bin/runghc
configure: Using runhugs found on system at: /home/dons/bin/runhugs
configure: Using happy: /home/dons/bin/happy
configure: Using alex: /home/dons/bin/alex
configure: Using hsc2hs: /home/dons/bin/hsc2hs
configure: No c2hs found
configure: Using cpphs: /home/dons/bin/cpphs
configure: No greencard found
Preprocessing library dlist-0.1...
Building dlist-0.1...

$ runhaskell Setup.lhs build
Preprocessing library dlist-0.1...
Building dlist-0.1...
Data/DList.hs:17:2: parse error on input `DList'

$ vim Data/DList.hs +17

$ runhaskell Setup.lhs build
Preprocessing library dlist-0.1...
Building dlist-0.1...
[1 of 1] Compiling Data.DList       ( Data/DList.hs, /usr/obj/cabal/Data/DList.o )
/usr/bin/ar: creating /usr/obj/cabal/libHSdlist-0.1.a

$ runhaskell Setup.lhs install
Installing: /home/dons/lib/dlist-0.1/ghc-6.6 & /home/dons/bin dlist-0.1...
Registering dlist-0.1...
Reading package info from ".installed-pkg-config" ... done.
Saving old package config file... done.
Writing new package config file... done.

$ darcs amend-record
Mon Dec 11 14:20:56 EST 2006  Don Stewart <dons@cse.unsw.edu.au>
  * Initial import of dlist package
Shall I amend this patch? [yNvpq], or ? for help: y
hunk ./Data/DList.hs 15
-module Data.DList
+module Data.DList (
Shall I add this change? (1/?)  [ynWsfqadjkc], or ? for help: y
hunk ./dlist.cabal 10
+Exposed-modules:     Data.DList
Shall I add this change? (2/?)  [ynWsfqadjkc], or ? for help: y
Finished amending patch:
Mon Dec 11 14:21:53 EST 2006  Don Stewart <dons@cse.unsw.edu.au>
  * Initial import of dlist package

$ runhaskell Setup.lhs haddock
Preprocessing library dlist-0.1...
Running Haddock for dlist-0.1...
Warning: cannot use package base-2.0:
   HTML directory /home/dons/share/ghc-6.6/html/libraries/base does not exist.
/usr/obj/cabal/tmp/Data/DList.hs:"/usr/obj/cabal/tmp/Data/DList.hs": 37:1: parse error in doc string: [TokPara]

$ vim Data/DList.hs +37

$ runhaskell Setup.lhs haddock
Preprocessing library dlist-0.1...
Running Haddock for dlist-0.1...
Warning: cannot use package base-2.0:
   HTML directory /home/dons/share/ghc-6.6/html/libraries/base does not exist.

$ w3m -dump dist/doc/html/index.html 
 dlist-0.1:  Contents Index
dlist-0.1:
Differences lists: lists supporting efficient append
Modules
show/hideData
Data.DList
Produced by Haddock version 0.8

$ mkdir tests
$ vim tests/Properties.hs
$ vim tests/Parallel.hs

$ cd tests
$ runhaskell Properties.hs 
2: singleton                : OK, 1000 tests.
1: model                    : OK, 1000 tests.
2: snoc                     : OK, 1000 tests.
1: append                   : OK, 1000 tests.
$ cd ..

$ darcs add tests
$ darcs add tests/Parallel.hs tests/Properties.hs
$ darcs whatsnew -s
M ./Data/DList.hs -1 +1
A ./tests/
A ./tests/Parallel.hs
A ./tests/Properties.hs

$ darcs record --all
...
Shall I record this change? (6/?)  [ynWsfqadjkc], or ? for help: y
What is the patch name? Add QuickChecks
Finished recording patch 'Add QuickChecks'

$ darcs setpref test "cd tests ; runhaskell Properties.hs"
Changing value of test from '' to 'cd tests ; runhaskell Properties.hs'
changepref test
cd tests ; runhaskell Properties.hs
Shall I record this change? (1/?)  [ynWsfqadjkc], or ? for help: y
What is the patch name? pre   run tests on each commit
Do you want to add a long comment? [yn]n
Running test...
2: singleton                : OK, 1000 tests.
2: snoc                     : OK, 1000 tests.
1: model                    : OK, 1000 tests.
2: append                   : OK, 1000 tests.
Test ran successfully.
Looks like a good patch.
Finished recording patch 'run tests on each commit'

$ darcs whatsnew -s
No changes!

$ darcs tag
What is the version name? 0.1   dlist 0.1
Finished tagging patch 'TAG dlist 0.1'

$ runhaskell Setup.lhs sdist
Building source dist for dlist-0.1...
*** Exception: LICENSE: copyFile: does not exist (No such file or directory)

$ cp ~/pqc/LICENSE .
$ vim LICENSE 

$ darcs add LICENSE
$ darcs record
addfile ./LICENSE
Shall I record this change? (1/?)  [ynWsfqadjkc], or ? for help: y
...
Shall I record this change? (2/?)  [ynWsfqadjkc], or ? for help: y
What is the patch name? add license
Do you want to add a long comment? [yn]n
Running test...
2: singleton                : OK, 1000 tests.
2: snoc                     : OK, 1000 tests.
1: model                    : OK, 1000 tests.
2: append                   : OK, 1000 tests.
Test ran successfully.
Looks like a good patch.
Finished recording patch 'add license'

$ vim README
$ darcs add README
$ darcs record --all
What is the patch name? Add a readme
Do you want to add a long comment? [yn]n
Running test...
2: singleton                : OK, 1000 tests.
2: snoc                     : OK, 1000 tests.
1: model                    : OK, 1000 tests.
2: append                   : OK, 1000 tests.
Test ran successfully.
Looks like a good patch.
Finished recording patch 'Add a readme'

$ ls
Data        LICENSE     README      Setup.lhs   _darcs      dist        dlist.cabal tests

$ darcs dist -d dlist-0.1. 
Created dist as dlist-0.1.tar.gz

$ darcs whatsnew -s
No changes!

$ darcs put pill00.cse.unsw.edu.au:/home/chakcvs/darcs/dlist/
Finished applying...

Script done on Mon Dec 11 15:04:54 2006

/home :: /haskell :: permalink :: rss

Quick interpreters with the Reader monad

Rolling your own interpreters in Haskell is one of the joys of using this language, due to the support for:

  • algebraic data types
  • pattern matching on data
  • monadic environments
  • higher order functions

Here's a quick, complete interpreter for a simple math language of binary operators, with support for variables. It uses the Reader monad for a natural embedding of lexical scoping. Writing interpreters in Haskell is just so easy, no wonder Pugs was written in it.

import qualified Data.Map as M
import Control.Monad.Reader

--
-- A syntax tree type for simple math, with variables
--
data Exp = IntE Int
         | OpE  Op Exp Exp
         | VarE String
         | LetE String Exp Exp

type Op = Int -> Int -> Int

--
-- The interpreter
--
eval (IntE n)       = return n
eval (OpE op e1 e2) = liftM2 op (eval e1) (eval e2)

eval (VarE x)       = do
    env <- ask
    return $ maybe (error "undefined variable") id (M.lookup x env)

eval (LetE x e1 e2) = do
    env <- ask
    v   <- eval e1
    local (M.insert x v) (eval e2)

--
-- Run the interpreter
--
main = print $ runReader (eval test) M.empty

--
-- A simple text expression:
--
--      let x =
--          let y = 5 + 6
--          in y / 5
--      in x * 3
-- 
-- ==>  6
--
test = LetE "x" (LetE "y" (OpE (+) (IntE 5) (IntE 6))
                      (OpE div y (IntE 5)))
                (OpE (*) x (IntE 3))
    where x = VarE "x"
          y = VarE "y"

Let's run the test program:

$ runhaskell A.hs
6

Adding a front end (with Parsec), and then implementing the rest of Ruby, is left as an exercise for the reader.

/home :: /haskell :: permalink :: rss

2006-12-10

The computational view of monads

I originally wrote this article as a comment to the programming subreddit, however I've found myself referring to it a few times since then, and reproduce it here for posterity. It attempts to motivate monads for people coming from imperative, stateful languages (already living in IO).


In Haskell, a purely functional language, we use monads to provide a library-based approach to sequential, imperative evaluation: the famous IO monad. No wonder Haskell people talk about it a lot: library-based imperative programming in a purely functional language is fun.

But monads don't stop there! In languages with builtin sequential evaluation the IO monad must seem boring or silly -- no wonder it all seems a puzzle.

This is where it gets interesting. Since monads provide an abstraction over evaluation order, they can be used in other languages (and Haskell) to implement *non*-sequential evaluation order, as a library, also!

So instead of special language support for, say, exceptions and callcc, you just implement monads as a library, and get:

Perhaps this gives a flavour for non-Haskell people for why they're so useful. Continuations as a library! Non-deterministic evaluation as a library!

So just as its well known that with continuations you can implement threaded control or exceptions as a library, with monads you can implement continuations themselves as a library, along with all the other fun toys. Then, using monad transformers, you can compose separate monads, providing, say, exception handling over a state encapsulation: you can specify precisely what programming language concepts a particular program requires. Here is:

Users of other languages are likely, and rightly so, to be less interested in the solution to IO based on monads, and far far more interested in things like library-based continuations or exceptions (see for example monadic delimited continuations in OCaml).

Hope that gives a hint towards what this monad stuff is all about.

/home :: /haskell :: permalink :: rss

2006-12-07

Playing with the Devil's Bullets

    Now, George was a good straight boy to begin with, but there was bad blood in him
    Someway he got into the magic monads and that leads straight to Devil's work, 
    Just like marijuana leads to heroin
    You think you can take them monads or leave 'em, do you?
    Just save a few for your bad days

    Well, now, we all have those bad days when you can't hack for shit.
    The more of them monads you use, the more bad days you have without them
    So it comes down finally to all your days being bad without the monads
    It's magics or nothing

    Time to stop chippying around and kidding yourself,
    Kid, you're hooked, heavy as lead

/home :: /haskell :: permalink :: rss

The Devil's Bullets: William Burroughs, Tom Waits and Haskell.

Reading "Why Functional Programming is a Silver Bullet", and for some reason I'm reminded of the Robert Wilson, Tom Waits and William S. Burroughs play, "The Black Rider". The Haskell promise can be elegantly phrased:

    You can never go hunting
    With just a flintlock and a hound
    You won't go home with a bunting
    If you blow a hundred rounds

    It takes much more than wild courage
    Or you'll hit just the tattered clouds
    You must have just the right bullets
    And the first one's always free

    You must be careful in the forest
    Broken glass and rusty nails
    If you're to bring back something for us
    I have bullets for sale

    Why be a fool when you can chase away
    Your blind and your gloom
    I have blessed each one of these bullets
    And they shine just like a spoon

    To have sixty silver wishes
    Is a small price to pay
    They'll be your private little fishes
    And they'll never swim away

    I just want you to be happy
    That's my only wish
    I'll fix your wagon and your musket
    And the spoon will have his dish

    And I shudder at the thought of your
    Poor empty hunter's pouch
    So I'll keep the wind from your barrel
    And bless the roof of your house
    
Feel free to replace bullets with lambdas. Now, how to describe the addictive property of Haskell programming:
    Now, George was a good straight boy to begin with, but there was bad blood in him
    Someway he got into the magic bullets and that leads straight to Devil's work, 
    Just like marijuana leads to heroin
    You think you can take them bullets or leave 'em, do you?
    Just save a few for your bad days

    Well, now, we all have those bad days when you can't shoot for shit.
    The more of them magics you use, the more bad days you have without them
    So it comes down finally to all your days being bad without the bullets
    It's magics or nothing

    Time to stop chippying around and kidding yourself,
    Kid, you're hooked, heavy as lead
    

Haskell gives you access to the Devil's bullets.

(Lyrics by Tom Waits/William Burroughs, from The Black Rider)

/home :: /haskell :: permalink :: rss

2006-12-05

Building GHC in under *8* minutes

In response to my last post, Simon Marlow suggested building the first GHC stage with -O, hoping that an optimised GHC in turn will be faster compiling the libraries and second stage. Here's the result:

With the following build.mk:

SRC_HC_OPTS     = -H64m -Onot -fasm
GhcStage1HcOpts = -O -fasm
GhcStage2HcOpts = -Onot -fasm
GhcLibHcOpts    = -Onot -fasm
GhcLibWays      =
SplitObjs       = NO

On a 2 processor, 4 core linux machine, running with -j10:

make -j10 > /dev/null  1015.77s user 155.19s system 249% cpu 7:49.50 total

Great! A new GHC build record (I think). Can't wait till the 16 core machine arrives...

/home :: /haskell :: permalink :: rss

2006-12-03

Building GHC in under 10 minutes

People often complain at how long the GHC build process can be. This is even more of an issue for developers working on libraries, or the compiler itself, who need to quickly check their code typechecks and links inside the glorious maze that is GHC's builds system.

With the fairly recent GNU make -j fixes, and a decent multicore machine, you can cut your build times dramatically. Here's a transcript, with some hints on getting fast turnarounds from GHC builds:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
stepping        : 8
cpu MHz         : 2800.184
cache size      : 2048 KB
bogomips        : 5604.39

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
stepping        : 8
cpu MHz         : 2800.184
cache size      : 2048 KB
bogomips        : 5600.47

$ uname -msr
Linux 2.6.15.3-general i686

$ bzip2 -dc ghc-6.6-src.tar.bz2 | tar xf -
$ cd ghc-6.6
$ cat ../build.mk
SRC_HC_OPTS     = -H64m -Onot -fasm
GhcStage1HcOpts = -Onot -fasm
GhcStage2HcOpts = -Onot -fasm
GhcLibHcOpts    = -Onot -fasm
GhcLibWays      =
SplitObjs       = NO

$ mv ../build.mk ./mk/
$ autoreconf
$ ./configure
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking target system type... i686-pc-linux-gnu
Canonicalised to: i386-unknown-linux
...
config.status: creating config.mk
config.status: creating include/HsRegexPosixConfig.h
config.status: include/HsRegexPosixConfig.h is unchanged

$ time nice gmake -j10 > /dev/null
ghc: 25107068 bytes, 5 GCs, 125160/125160 avg/max bytes residency (1 samples), 63M in use, 0.00 INIT (0.00 elapsed), 0.05 MUT (0.53 elapsed), 0.02 GC (0.02 elapsed) :ghc
...
/usr/bin/ar: creating libHSghc.a
make -j10 > /dev/null  1299.72s user 163.13s system 250% cpu 9:44.08 total

$ compiler/stage2/ghc-inplace --version
The Glorious Glasgow Haskell Compilation System, version 6.6

And we're done! So, use GNU make's -j flag, disable optimisations, use the native code generator, and get as many cores as you can.

/home :: /haskell :: permalink :: rss

2006-11-21

Old photographs

A newly discovered photograph of my great great grandfather's sister, emigrated to Australia in 1830. The photo is pre-1890, I'm guessing.


/home :: /random :: permalink :: rss

2006-11-19

More QuickChecks

On #haskell, Tony Morris asked how he could improve the following function:
replaceAll :: Eq a => [a] -> [a] -> [a] -> [a]
replaceAll [] _ xs = xs
replaceAll _ [] xs = xs
replaceAll _ _ []  = []
replaceAll xs ys zs'@(z:zs) = if xs `isPrefixOf` zs' then ys ++ drop (length xs) zs' else z : replaceAll xs ys zs
A quick clean up to use guards instead, and add some lightweight automated testing followed:
import Data.List
import Text.Printf
import Test.QuickCheck

replaceAll :: Eq a => [a] -> [a] -> [a] -> [a]
replaceAll [] _ xs = xs
replaceAll _ [] xs = xs
replaceAll _ _ []  = []
replaceAll xs ys zs'@(z:zs)
    | xs `isPrefixOf` zs' = ys ++ drop (length xs) zs'
    | otherwise           = z : replaceAll xs ys zs

--
-- testsuite
--

prop_length xs ys zs =
    length xs == length ys ==>
        length (replaceAll xs ys zs) == length zs
    where _ = xs :: [Int]

run = mapM_ (\(s,a) -> printf "%-25s: " s >> a) [("length", test prop_length)]

{-

*Main> :l A.hs
*Main> run
length                   : OK, passed 100 tests.

-}
It's great how you can so quickly roll together nice resuable (higher order + polymorphic = reusable), and well specified code in Haskell.

/home :: /haskell :: permalink :: rss

2006-11-12

More refactoring

jcreigh wrote on his RPN calculator in Haskell. Here's some refactoring for fun (adding StateT and readline, mostly).
import Data.Char
import Data.Maybe
import System.Console.Readline
import System.Exit
import System.IO
import Control.Monad.State
import qualified Data.Map as M

type Stack      = [Double]
type Operator   = Stack -> Stack

forever a = a >> forever a
io        = liftIO

main = runStateT (forever repl) []

repl = do
    m <- io $ readline "> "
    case m of
        Nothing -> io $ exitWith ExitSuccess
        Just l  -> do modify $ flip eval (parse l)
                      io . print =<< get

eval = foldl (flip id)

parse = map parse' . words

parse' t | Just op <- M.lookup t table = op
         | otherwise                   = (read t :)

table = M.fromList
    [("+"    , binaryOp (+)         )
    ,("-"    , binaryOp (-)         )
    ,("*"    , binaryOp (*)         )
    ,("/"    , binaryOp (/)         )
    ,("**"   , binaryOp (**)        )
    ,("log"  , unaryOp log          )
    ,("sqrt" , unaryOp sqrt         )
    ,("dup"  , \(x:xs)   -> x:x:xs  )
    ,("swap" , \(a:b:xs) -> b:a:xs  )
    ,("pop"  , tail                 )]

unaryOp  f (a:s)   = f a : s
binaryOp f (a:b:s) = f b a : s

/home :: /haskell :: permalink :: rss

Response to "Getting real with Haskell"

Magnus wrote about his beginning steps in Haskell. Here's a quick refactor of the code, for fun, mostly:
import Data.Either
import MissingH.Time.ParseDate
import System.Environment
import System.Locale
import System.Time

diffAndToString s e = formatTimeDiff defaultTimeLocale "%R" diff
    where diff = normalizeTimeDiff $ diffClockTimes (toClockTime s) (toClockTime e)

calcDiff start end = case (parse start, parse end) of
    (Nothing, _)     -> Left "start time"
    (_, Nothing)     -> Left "end time"
    (Just s, Just e) -> Right $ diffAndToString s e
  where parse = parseCalendarTime defaultTimeLocale "%C%y%m%d-%H%M"

resultString [s,e] = either ("Bad "++) id (calcDiff s e)

main = putStrLn . resultString =<< getArgs

/home :: /haskell :: permalink :: rss

2006-11-05

Music in the Domain

Caught Little Birdy, Sarah Blasko, along with Tex Perkins and Tim Rogers at Legs 11. Lots of fun in the park, with music, wine and food -- the most relaxed gig I've ever been to. The new Little Birdy album, Hollywood, has some real Yeah Yeah Yeah-ish sounds, and even some Chrissie Amphlett moments. On the topic of sound, my music of the moment must be Deerhoof though, for sheer crazy, noisy fun.

/home :: /music :: permalink :: rss

2006-11-02

@pl-ising Stefan Holdermans' Code

Stefan made a little joke about how sometimes Haskell code can be non-obvious. Although his code seems pretty tame, I wondered what it would look like @pl-ified by lambdabot. The results were surprisingly good:
import Control.Arrow
import Data.List

order = map snd
      . sortBy ((. fst)
      . compare . fst)
      . zipWith ((id ***) . flip (,)) [toEnum 0..]
      . sortBy ((. snd) . compare . snd)
      . zip [0..]
Looks about the same, right? Yes! I thought lambdabot would explode this, and it didn't :) The end.

/home :: /haskell :: permalink :: rss

2006-10-19

Back from San Francisco

Back from San Francisco, after spending a week there for the Google Summer of Code Summit. I'll write more about that in the next few days. Getting back, and I notice that it's really feeling like summer in Sydney now: it's 30-odd degrees, and our cactus is flowering for the first time. Nina Cat thinks its good weather for sleeping, as usual.

/home :: /random :: permalink :: rss

Tshirt

And of course, the real reason we all participated in the Google Summer of Code: the tshirts! Mine finally arrived, the day after I get back from the GSoC summit, oh well :)

/home :: /haskell :: permalink :: rss

2006-10-04

Roll your own network code in Haskell

This week a medium length tutorial on network programming in Haskell. We build a concise IRC bot, and toss some custom control structures and a monad into the mixture. A small taste:

-- The 'Net' monad, a wrapper over IO, carrying the bot's immutable state.
type Net = ReaderT Bot IO
data Bot = Bot { socket :: Handle }

-- Set up actions to run on start and end, and run the main loop
main :: IO ()
main = bracket connect disconnect loop

listen :: Handle -> Net ()
listen h = forever $ do
    s <- io (hGetLine h)
    io (putStrLn s)
    when (ping s) $ pong s

Read the tutorial.

/home :: /haskell :: permalink :: rss

2006-09-27

Love of Diagrams

Listening to the new Love of Diagrams self-titled EP. I'd caught them opening for Sonic Youth back in July, and they just blew me away -- the energy, the swirling noise and angular sounds, so I was looking forward to this release. And now they've signed to Matador (home of Interpol), so hopefully there's a full album around the corner.

/home :: /music :: permalink :: rss

2006-09-23

Mirror of Simple Unix Tools for Haskell

Unfortunately a power failure has meant that haskell.org is down. So here's a mirror of yesterday's 'unix tools tutorial'.

The mirror.

/home :: /haskell :: permalink :: rss

2006-09-22

Haskell for Unix

Today a small tutorial for beginner Haskell programmers, wondering what to do with the language. We implement a range of common unix tools as Haskell one liners, using a "just show me the code" philosophy. The examples should help give a flavour of the beauty and expressiveness of Haskell programming. Here's a preview of complete programs:

-- The 'cat' program
cat     = input id

-- Sort a file
srt     = input sort

-- Reverse a file (tac)
tac     = input reverse

-- remove duplicate lines from a file (like uniq)
uniq    = input nub

Read the tutorial.

/home :: /haskell :: permalink :: rss

2006-09-21

A 5 minute introduction to testing Haskell with QuickCheck

Following an interesting query on the haskell-cafe mailing list, I've put up a very quick introduction to testing Haskell code using the QuickCheck testing framework.

The tutorial.

/home :: /haskell :: permalink :: rss

2006-09-20

Efficient wiki editing with W3M and Vim

I find I spend quite some time editing wiki pages, such as the Haskell.org wiki. However, browser-based editing leaves a lot to be desired, generally making the editing process more inefficient and difficult than it needs to be. I suspect that these inefficiencies hamper contributions, especially in longer texts. The main problem with wiki editing is the browser-based editor forms aren't proper editors, so you're just wasting your time using them.

Key missing features are:
  • Syntax highlighting
  • Regular expressions substitutions
  • Spell checking
  • Programming tool support (particular for a PL wiki like Haskell.org)

We can fill these holes, and improve efficiency, by using the text-based browser w3m along with your favourite text editor, which for me is Vim. w3m is a simple, fast, terminal browser, which uses an external editor by default, for editing online forms.

Syntax highlighting for wiki files, with embedded Haskell, and Vim is available here. Save this to your ~/.vim/syntax directory, and enable it dynamically when editing online with :setf haskellwiki.

Using your favourite editor from w3m is as simple as setting: editor: vim -c "setf haskellwiki" in the w3m configuration screen (type o in w3m to bring this up). This also enables syntax highlighting by default, including Haskell fragments embedded in "<haskell>" tags.

And of course, vim being vim, you have access to unix tools via the shell (such as aspell), programming tools like lambdabot (pointfree refactoring of wiki code, anyone?) and more.

More details on text editor support for wiki editing, from a variety of browsers. Thanks to jgrimes for advice on this.

Finally, for the future, I'd like to investigate a generic mechanism for exporting wiki pages to a local revision control system, such as darcs. For large wiki subsections I work on, I currently save files locally in a new darcs repository, edit them, then upload the subsection in a single pass when complete. This is tedious -- power users should always have a revision control back end to the wiki, for pushing and pulling changes directly. This is the way this blog is published: developed in a local darcs repository, and then pushed to a public repository, which blosxom then renders. Ideally, preparing wiki articles too should involve only the editor and a revision control system, for publishing.

/home :: /net :: permalink :: rss

2006-09-11

Persistent irc declarations, at last

One of the more oft-requested lambdabot features finally got implemented today: persistent value bindings. So lambdabot behaves more like a communal Haskell session. Let's define a useful combinator:
lambdabot> let eq f g = \x -> f x == g (x::[Int])
Defined.
Which we can refer to as a local binding:
lambdabot> > L.eq head last [1..10]
 False
Or pass to QuickCheck:
lambdabot> check head `L.eq` last
 Exception: Prelude.head: empty list
Telling us our property isn't quite right. Let's fix that:
lambdabot> let eqnotnull f g = \x -> (not (null x)) ==> f x == g (x::[Int])
Defined.
Which is much more useful:
lambdabot> check head `L.eqnotnull` last
 Falsifiable, after 0 tests: [2,0]
Or even something True:
lambdabot> check minimum `L.eqnotnull` (head . sort)
 OK, passed 500 tests.
I like having the machine figure stuff like this out for me.

/home :: /haskell :: permalink :: rss

2006-09-10

Syntax highlighting for the planet

planet.haskell.org now supports syntax highlighting of Haskell code using Malcolm Wallace's HsColour tool. To colorise your Haskell fragments, build and install HsColour, and run it with -css. The following source:
-- main :: IO ()
main = print $ case Just e of
        Nothing -> error "no return"
        Just x  -> x
    where e = map (+1) [1..100]
Handed to HsColour, and then prune the head and body tags, leaving a css-ified pre markup:
$ HsColour -css A.hs | sed '1,/body/d;s,</body>,,;$d'

-- main :: IO ()
main = print $ case Just e of
        Nothing -> error "no return"
        Just x  -> x
    where e = map (+1) [1..100]
Which uses the css tags defined here. Happy typesetting!

/home :: /haskell :: permalink :: rss

QuickCheck for the IRC masses

Lambdabot's plugin list keeps growing. We now have a QuickCheck plugin. QuickCheck is a rather spiffy unit test generator/framework, used particuarly for checking invariants in Haskell libraries. The lambdabot plugin makes it available from irc or the command line, to check properties of Haskell code. Ok, something basic:
lambdabot> check True
 OK, passed 100 tests.

A bit more complex:

lambdabot> check False == True
 Falsifiable, after 0 tests:

A list property:

lambdabot> check \s -> (reverse . reverse) s == id (s :: [Int])
 OK, passed 100 tests.

Or for partial functions:

lambdabot> check \s -> (head . sort) s == minimum (s :: [Int])
 Exception: Prelude.head: empty list

Ah, so better specify the precondition:

lambdabot> check \s -> not (null s) ==> (head . sort) s == minimum (s :: [Int])
 OK, passed 100 tests.

Or an invalid property:

lambdabot> check \s -> not (null s) ==> (head . sort) s == maximum (s :: [Int])
 Falsifiable, after 2 tests:
    [1,0]

We can hook this into the free theorem's plugins, just for fun:

lambdabot> free filter
$map f . filter (g . f) = filter g . $map f

lambdabot> check \f g s -> (map (f::Int->Int) . filter (g . f)) s ==
                           (filter g . map f) (s::T)
OK, passed 100 tests.

And we're already using it to answer language questions! Live from #haskell:

16:46  glguy:: ?type rem
16:46  lambdabot:: forall a. (Integral a) => a -> a -> a
16:46  glguy:: ?type mod
16:46  lambdabot:: forall a. (Integral a) => a -> a -> a
16:46  glguy:: synonyms?
16:51  dons:: ?check \a b -> a `rem` b == a `mod` (b :: Int)
16:51  lambdabot::  Exception: divide by zero
16:51  glguy:: heh
16:51  dons:: ?check \a b -> b > 0 ==> a `rem` b == a `mod` (b :: Int)
16:51  lambdabot::  Falsifiable, after 8 tests:
16:51  lambdabot:: -3
16:51  glguy:: ohhh negative numbers
16:51  dolio:: Ah, there you go. Dividing by negatives is different.
16:51  dons:: ?check \a b -> (a > 0 && b > 0) ==> a `rem` b == a `mod` (b :: Int)
16:52  lambdabot::  OK, passed 100 tests.
16:52  dons:: ?check \a b -> (a < 0 && b > 0) ==> a `rem` b == a `mod` (b :: Int)
16:52  lambdabot::  Falsifiable, after 6 tests:
16:52  lambdabot:: -2
16:52  dons:: QuickCheck++

/home :: /haskell :: permalink :: rss

2006-09-09

More free theorems for lambdabot

Hot on the tail of Andrew Bromage's free theorems plugin for Lambdabot, we now have two free theorem generators, thanks to Janis Voigtlaender and Sascha Boehme's FT tool.

lambdabot> ft map
 forall T1,T2 in TYPES. forall h1 :: T1 -> T2, h1 strict.
  forall T3,T4 in TYPES. forall h2 :: T3 -> T4, h2 strict.
   forall f1 :: T1 -> T3.
    forall g1 :: T2 -> T4.
     (forall x2 :: T1.
      h2 (f1 x2) = g1 (h1 x2))
     ==> forall x1 :: [T1].
        map h2 (t1 f1 x1) = t1 g1 (map h1 x1)

lambdabot> ft foldl1
 forall T1,T2 in TYPES. forall h1 :: T1 -> T2, h1 strict.
  forall f1 :: T1 -> T1 -> T1.
   forall g1 :: T2 -> T2 -> T2.
    (forall x2 :: T1.
     forall x3 :: T1.
      h1 (f1 x2 x3) = g1 (h1 x2) (h1 x3))
    ==> forall x1 :: [T1].
       h1 (t1 f1 x1) = t1 g1 (map h1 x1)

/home :: /haskell :: permalink :: rss

2006-09-04

Rewriting Haskell Strings

Yay. All done! The new paper describing the stream fusion system used in Data.ByteString is finished. Check it out :)

/home :: /haskell :: permalink :: rss

2006-08-28

More stream fusion

Some benchmarks for the Data.ByteString unstable branch, implementing the new stream fusion system to remove intermediate arrays. GHC does a really good job generating array traversal code using streams. Here's the difference between fps 0.7 and the unstable branch, for a range of fusible array functions:
Yay for rewrite rules! Here's a comparison of the strict and lazy ByteString types, indicating the complexity difference for lazy functions that manipulate on the spine of the lazy ByteString:

/home :: /haskell :: permalink :: rss

2006-08-26

Theorems for free and undoing monads in lambdabot

Two new fun plugins for lambdabot this week. The first, by Andrew Bromage, is @free, a free theorems generator. Given a function's type, the plugin returns theorems that hold for that function:

lambdabot> free sortBy :: (a -> a -> Ordering) -> [a] -> [a]
g x y = h (f x) (f y) => map f . sortBy g = sortBy h . map f

lambdabot> free foldl :: (a -> b -> a) -> a -> [b] -> a
f . h x = k (f x) . g => f . foldl h y = foldl k (f y) . map g

lambdabot> free filter :: (a -> Bool) -> [a] -> [a]
g x = h (f x) => map f . filter g = filter h . map f

lambdabot> free fmap :: (a -> b) -> F a -> F b
g . h = k . f => $map_F g . fmap h = fmap k . $map_F f

lambdabot> free sequence_ :: [M a] -> M ()
g y = y => mapM g (sequence_ x) = sequence_ (map (mapM f) x)

lambdabot> free unfoldr :: (b -> Maybe (a, b)) -> b -> [a]
 (((
    g (fst y) = fst z && f (snd y) = snd z
  ) =>
    p y = z
  ) =>
    mapMaybe p (h x) = k (f x)
  ) =>
    map g . unfoldr h = unfoldr k . f

Cool, huh? Free properties for your code! (Lots of properties checkable with QuickCheck perhaps..)

The second new plugin, written by Spencer Janssen, is @undo, a desugarer for do-notation, meaning we can plug monadic syntax into other lambdabot plugins:

lambdabot> undo do x <- f ; y <- g ; return (mapM_ k $ x ++ y)
f >>= \ x -> g >>= \ y -> return (mapM_ k $ x ++ y)

which we can then feed to the pointfree plugin:

lambdabot> compose pl undo do x <- f ; y <- g ; return (mapM_ k $ x ++ y)
(`fmap` g) . (mapM_ k .) . (++) =<< f

or find the type of:

lambdabot> compose type compose pl undo \f g k -> do x <- f ; y <- g ; return (mapM_ k $ x ++ y)
 forall a (m :: * -> *) b (m1 :: * -> *).
        (Monad m, Monad m1) =>
            m1 [a] -> m1 [a] -> (a -> m b) -> m1 (m ())

Haskell is fun!

/home :: /haskell :: permalink :: rss

What mp3 players should be like

My new M-Cody M20 mp3 player arrived today from the UK. Mmm... design.
MCody M20

Its tiny, smaller than a nano ipod, and when idle is completely black -- the OLED screen disappears. 2G is plenty, and it also has an FM tuner with presets, as well as a text file viewer.

The sound is great (after ditching the headphones it came with, and plugging in some good Sennheiser ones). The touch sensitive case took an hour or so to really get the handle of, and it seems quite intuitive now.

Had a bit of trouble at the start using it under OpenBSD (I managed to mangle the partition table on the device, and couldn't produce a vfat fs anyway), but under Linux things work perfectly, appearing like a usb stick.

So, all in all: tiny, beautiful, functional.

/home :: /hardware :: permalink :: rss

Haskell on 32 processors and beyond

Roman Leshchinskiy mentioned yesterday that he managed to get GHC running SMP Data Parallel Haskell on a 32 cpu Sun E6900, with memory, 80G real and 500G swap (a bargain at somewhere around $250,000 US for this midrange server)

Its fun watching GHC Haskell scale so well..

An aside: DPH uses stream fusion to automatically fuse concurrent array code. In Data.ByteString we adapted the rewrite rules and code, to instead fuse code over flat, unboxed Word8 arrays, for fast strings.

/home :: /haskell :: permalink :: rss

2006-08-07

more default defaults

In normal Haskell you can use what is possibly the most obscure Haskell98 feature, the default keyword, to modify how type defaulting works, and resolve ambiguities -- but only for numeric classes. The result is less type signatures, as below:
default(Double)

main = print $ 2 + 1
In ghci, this defaulting has been extended to the Eq, Ord and Show classes, so we can write, in ghci:
> reverse []
[]
and ghci will find the right Show instance. Now, in lambdabot, we can evaluate user's code in the irc channel, but often the user has to add extra types to have the right Show instance found:
dons:: > reverse []
lambdabot:: Add a type signature
dons:: > reverse [] :: [()]
lambdabot:: []
This is annoying. What would be really nice is if the defaulting mechanism used by ghci was available to userland Haskell, by a -fglasgow-extension of the default keyword mechanism.

Update Simon has now kindly added a new flag to GHC, "-fextended-default-rules", which does exactly this. Lambdabot won't be second class anymore :)

/home :: /haskell :: permalink :: rss

2006-08-04

Stream fusion

Duncan Coutts and I have been furiously hacking away at a new array fusion system for Data.ByteString, using an improved set of loop combinators based on hylomorphisms. Some initial results (raw) are available.
  • Old system, which people are now using.
  • New system, based on stream fusion, using latest ghc patches, which we hope to have out in the next couple of weeks.
int-e on #haskell kindly graphed the results. For fuseable pipelines, this gives some indication of the speed ups people are likely to see in ghc 6.6. The numbers compare fuseable functions that traverse the byte strings in various directions, with and without accumulators. And once again QuickCheck has been hugely helpful in getting our rewrite rules correct, and spotting flaws in our reasoning. GHC seems to do a much better job generating tight, strict loop with the new combinators :)

/home :: /haskell :: permalink :: rss

2006-08-03

main :: IO a

Noticed today that in the ghc 6.5 branch, the following program is valid:
main :: IO String
main = return "hello, world"
This works for any return type, in Show, it seems. (Except, notably, ()).
$ runhaskell A.hs 
"hello, world"

/home :: /haskell :: permalink :: rss

About

Real World Haskell

RWH Book Cover

Archives

Recommended

Blog Roll

Syndicate