Haskell Tutorial

Adapted from A Gentle Introduction to Haskell and Haskell-Tutorial by Jerry James and Vicki Allan

This tutorial is intended as a self-help guide. If you get lost using it, take note of your problems and let me know so I can make it clearer.

Getting Started

If you are using HUGS (recommended), then familiarize yourself with the environment by working through the examples in the getting started document.

Introduction

Haskell is a “typeful” programming language (the term typeful was coined by Luca Cardelli). Types are pervasive, and the newcomer is best off becoming well aware of the full power and complexity of Haskell's type system from the outset.

Hello, World!

Create a file called HelloWorld.hs (hs stands for “haskell script”).  It contains the following code:

{- HelloWorld.hs
   This is a multiline comment.
-}
-- This is a single-line comment.
 
hello :: IO ()
hello = putStr "Hello, world!"

Load this file into your interpreter. Then type hello at the prompt to evaluate it. Does it do what you expect?

We can learn several bits of Haskell syntax from this example. First, we now know how to write both multiline and single-line comments. The first line of actual code is the equivalent of a C++ function prototype: it gives the types of the parameters and the return value of the hello function. In this case, there are no parameters, and IO () is the return type. We will explain later what this means. Finally, the function itself is defined with an equals sign. Unlike assignment in C++, this is a definition: it binds the name hello to the expression on the right once and for all. The fact that there are no parameters is reflected in the fact that nothing occurs before the equals sign other than the function name.

Now edit this file and remove the type declaration for hello. Load the revised version into your interpreter and invoke the hello function again. It still works! Haskell does not need most type declarations. It can figure out the appropriate types itself using type inference. Nevertheless, type declarations often contribute toward good code documentation.

Types

Every expression has an associated type:

42 :: Integer

'a' :: Char

[1,2,3] :: [Integer]       -- A list of integers

('b',4) :: (Char,Integer)  -- A tuple in which the first component is a Char
                           -- and the second is an Integer

What should the type of a function be?  It should at least convey the fact that it takes values of one type (say T1) and returns values of (possibly) some other type (say T2).  In Haskell, this is written as T1 -> T2 and we say that such a function maps values of type T1 to values of type T2.  If there is more than one input, the notation is extended with more arrows.
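
As a small illustration of the arrow notation, here is a sketch of one function with a single input and one with three inputs. The names isCapital and addThree are invented for this example and are not part of the Prelude; how such definitions are written is explained in the next section.

```haskell
-- One input: maps a Char to a Bool.
isCapital :: Char -> Bool
isCapital c = c >= 'A' && c <= 'Z'

-- Three inputs: one arrow per input, with the result type last.
addThree :: Integer -> Integer -> Integer -> Integer
addThree x y z = x + y + z
```

Typing addThree 1 2 3 at the interpreter prompt should give 6.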

Writing Functions in Haskell

In a functional language, everything (even a program) is a function. Thus, writing a program is actually writing a function. You can write many functions in a single file; that is, you can have many programs in a single file. Each function then may use other functions to calculate its result. There are many predefined functions available like in other programming languages. These functions are included in the Haskell Prelude and in the Haskell Libraries.

Writing a function consists of two parts. First, we have to define the types for the arguments and the result. A function may have no arguments, such as our hello function above. This is called a constant function.

The simplest case of a definition has the form:

name :: type

name = expression

 

as in the example

pi :: Float

pi = 3.1415926535

 

:: is read “is of type”.  The second line says “pi is defined to be 3.1415926535”.

Other functions may have one or more arguments. However, each function has a single result, e.g.:

doubleIt :: Integer -> Integer

The name of our new function is doubleIt. The :: separates the name of the function from the type definition. The parameter types are separated from each other and from the return type by ->. Therefore, the declaration above reads “doubleIt is a function with one Integer parameter, returning Integer.”

The second part of writing a function is the implementation, e.g.:

doubleIt x = 2*x

This line is read “doubleIt applied to an unknown variable x is defined to be the value 2*x”.

Note that the parameters are not in parentheses, as you may be used to from learning other languages. To test the functionality of our doubleIt function, we type doubleIt 2 at the Haskell interpreter prompt. What is the result?

Exercise 1: Write a Haskell function to find the circumference of a circle, given the radius.  Hint: if you use the pi definition described above, it conflicts with the pi declared in the Prelude. You can avoid this conflict by typing the following (at the top of your file):

module Try where

import Prelude hiding(pi)

Let us take another example. Like many programming languages, Haskell has a predefined data type for Boolean values. An important operator for Boolean values is the logical and. Although Haskell already provides that function, we will implement it again. One solution would be to write it in the following way:

and1 :: Bool -> Bool -> Bool
and1 a b = if a == b then a else False

The type indicates there are two input parameters and one result, but the notation is a bit strange.  In mathematical notation we would write and1 : (Bool × Bool) -> Bool.  In Haskell's notation, every type except the last is the type of an input, and the last type is the type of the result.

Here we use a branch (if-then-else) to split the calculation of the result. If the parameters are equal (either both True or both False), we return the value of one of the parameters. If the parameters are not equal, we return False. Another method is to use pattern matching and give several equations for the function. The system tries each equation in order, looking for a match, and uses the first one that matches. The logical and has one special case where the result is True. In all other cases, the result is False. We know which parameters the special case has. We can write these parameters on a separate line and give the result. For all other cases we write an additional line:

and2 :: Bool -> Bool -> Bool
and2 True True = True
and2 x y = False

This says: if the two input parameters match “True True”, return True; otherwise the parameters will match “x y”, so return False. In our example, we did not use the values of the parameters for calculating the result of the second line. It is not necessary to know the values of the parameters because we have a constant result. In Haskell, we can write _ instead of providing names for these parameters. The advantage is that the reader still knows the number of parameters, but also knows that the result is independent of the values of these parameters. The second line would then look like this:

and2 _ _ = False
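
The same technique works for other Boolean functions. As a sketch, here is logical implication written with one special case and a wildcard line for everything else. The name implies is chosen for this example; it is not a Prelude function.

```haskell
-- Logical implication: False only when the first argument is True
-- and the second is False.
implies :: Bool -> Bool -> Bool
implies True False = False
implies _    _     = True
```

Because the equations are tried in order, the wildcard line must come last. If it came first, it would match every call and the special case would never be reached.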

Exercise 2: Write a Haskell script (called fib) which computes Fibonacci numbers.  Recall fib 0 = 0, fib 1 = 1, and for larger n, fib n = fib (n-1) + fib (n-2).

Exercise 3: Write a Haskell script (called aver) which computes the average of three integers (return an integer).  Note that `div` performs integer division.

Exercise 4: Write a Haskell script (xor) which returns True if exactly one of its arguments is True, and False otherwise.

Guards

Guards (or conditions) are used to give alternatives.   Guards can be used instead of an if-then-else. Guards indicate various cases in the computation.

 

max :: Int -> Int -> Int

max x y

   | x > y = x

   | otherwise = y

 

If the first condition is satisfied, the result is x.  Otherwise, the result is y.  Note the guards are evaluated in order.  The first guard that is true controls the result.
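
A function may have more than two guards. The following sketch (the name sign is made up for this example) classifies an integer as positive, zero, or negative:

```haskell
-- Classify an integer by its sign; guards are tried from top to bottom.
sign :: Int -> Int
sign n
   | n > 0     = 1
   | n == 0    = 0
   | otherwise = -1
```

For a negative argument, the first two guards fail and the otherwise guard supplies the result.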

 

Exercise 5: Try writing a function max3 with multiple guards which computes the maximum of three values.

 

Layout

A script contains a series of definitions, one after another.  How is it clear when one ends and another begins?  It is all done with indentation.  Formally, a definition is ended by the first piece of text which lies at the same indentation as, or to the left of, the start of the definition.

 

Exercise 6: Try the following function (with strange indentation) to see how Haskell enforces the indentation rule.

 

  max4 :: Int -> Int-> Int

  max4 x y

   | x > y = x

| otherwise = y

 

The reader will note that we have capitalized identifiers that denote specific types, such as Integer and Bool, but not identifiers that denote values, such as doubleIt. This is not just a convention: it is enforced by Haskell's lexical syntax. In fact, the case of the other characters matters, too: foo, fOo, and fOO are all distinct identifiers.

Exercise 7: Try the following function definition to see how Haskell tells you it doesn’t like the capitalization.

And1 :: Bool -> Bool -> Bool
And1 a b = if a == b then a else False

 

Layout is the usual way to structure a script, but Haskell also supports explicit block beginning and ending markers (curly braces) and statement separators (semicolons).
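
As a sketch of the difference, the two functions below compute the same value. The first relies on layout; the second marks its block of local definitions with braces and semicolons, so the indentation inside the braces no longer matters. The names hypLayout and hypBraces are invented for this example, and the where keyword used here is explained in a later section.

```haskell
-- Local definitions separated by layout (indentation).
hypLayout :: Float -> Float -> Float
hypLayout a b = sqrt total
   where aa = a * a
         bb = b * b
         total = aa + bb

-- The same local definitions with explicit braces and semicolons.
hypBraces :: Float -> Float -> Float
hypBraces a b = sqrt total
   where { aa = a * a; bb = b * b;
      total = aa + bb }
```

Both hypLayout 3 4 and hypBraces 3 4 should evaluate to 5.0.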

Characters

Characters in Haskell are written in single quotes, such as 'a'.  The built-in function fromEnum converts a value of an enumerated type (such as Char) to an integer.  So, for example, fromEnum 'A' returns 65.  toEnum converts from an integer back to an enumerated type (such as Char).  The following example shows two definitions which together convert a lower case character to upper case.

 

 

offset :: Int

offset = fromEnum 'A' - fromEnum 'a'

 

toUpper :: Char -> Char

toUpper ch = toEnum(fromEnum ch + offset)
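
Note that toUpper as written shifts every character, including digits and letters that are already upper case. A possible refinement, sketched here under the hypothetical names caseOffset and toUpperSafe, uses a guard so that only lower case letters are changed:

```haskell
-- Distance between the upper case and lower case alphabets in the
-- character encoding (negative, because 'A' comes before 'a').
caseOffset :: Int
caseOffset = fromEnum 'A' - fromEnum 'a'

-- Shift only lower case letters; leave every other character unchanged.
toUpperSafe :: Char -> Char
toUpperSafe ch
   | ch >= 'a' && ch <= 'z' = toEnum (fromEnum ch + caseOffset)
   | otherwise              = ch
```

With this version, toUpperSafe 'q' gives 'Q', while toUpperSafe '7' gives '7' back unchanged.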

 

Where and if-then-else

Let us take a more complex example. Consider quadratic equations of the form ax^2 + bx + c = 0. The equation has two (possibly equal) roots. Recall that the formula for solving quadratic equations is:

x = (-b ± sqrt(b^2 - 4ac)) / (2a)

Let us write a function that computes the roots. It will need to take 3 values as inputs and produce two values as outputs. We know how to provide multiple parameters, but how do we return multiple values? We pack them into a tuple. Here is the type definition of our function (indicating three input parameters of type Float and two output parameters of type Float):

roots :: Float -> Float -> Float -> (Float, Float)

Note that the type definition of the return value (a tuple) is a set of parentheses enclosing a comma-separated list of types.

The next step is the definition of the function roots:

 
  roots a b c = (x1, x2)
      where x1 = e + sqrt d / (2 * a)
            x2 = e - sqrt d / (2 * a)
            d = b * b - 4 * a * c
            e = - b / (2 * a)
        

The first line indicates that the inputs are named a, b, and c and the result of the function is the pair (x1, x2). Both values x1 and x2 are defined by local definitions (in the where clause). The local definitions calculate parts of the result.  Notice that the local definitions do not require type declarations (as would be required in C++).   In general, it is useful to create a local definition for a partial calculation if the result of that calculation is needed more than once. The results of the local definitions d and e, for example, are needed for both parts of the solution (x1 and x2). The solutions themselves could have been written directly in the tuple. However, defining them as local definitions improves readability, because the symmetry of the solution is clearly visible, and efficiency, because the shared values d and e do not need to be computed twice.

This function works for real roots of a quadratic equation, but not for imaginary roots. We can test that as follows:

Main> roots 1.0 2.0 1.0
(-1.0,-1.0)
Main> roots 1.0 1.0 1.0
(
Program error: argument out of range

(Note that this is the HUGS response; GHCi instead gives (NaN,NaN) as the response to the second invocation of roots, where NaN means not a number.) The problem here is that a negative real number is being passed to the sqrt function.

One way to address the problem is to detect it and tell the user what is really going on. Try this version of the roots function:

 
  roots a b c = if d < 0 then
                error "Cannot solve for complex roots"
                else (x1, x2)
                    where x1 = e + sqrt d / (2 * a)
                          x2 = e - sqrt d / (2 * a)
                          d = b * b - 4 * a * c
                          e = - b / (2 * a)
        

This version shows the use of Haskell's if-then-else construct. The value of an if-then-else is the value of the then part or the value of the else part, depending on whether the if part is true or false, respectively.

Yet another approach which combines guards and where returns a list of roots.

roots   a b c | delta < 0  = error "complex roots"

                | delta == 0 = [-b/(2*a)]

                | delta > 0  = [-b/(2*a) + radix/(2*a),

                                -b/(2*a) - radix/(2*a)]

                  where

                  delta = b*b - 4*a*c

                  radix = sqrt delta

 

The first equation uses the builtin error function, which causes program termination and printing of the string as a diagnostic.

Where clauses may occur nested, to arbitrary depth, allowing Haskell programs to be organized with a nested block structure. Indentation of inner blocks is compulsory, as layout information is used by the parser.
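
A minimal sketch of a nested where clause follows. The function ringArea and its local names are made up for this example; it computes the area between two concentric circles, and the inner where belongs to the local function circle only.

```haskell
-- Area of a ring between two concentric circles.
ringArea :: Float -> Float -> Float
ringArea outer inner = big - small
   where
   big   = circle outer
   small = circle inner
   circle r = pi * sq
      where sq = r * r   -- nested where: sq is visible only inside circle
```

The name sq cannot be used in the definitions of big or small, because it is local to circle.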

 

Exercise 8: Write a function addPair:: (Int,Int) -> Int which returns the sum of the elements of the pair.

Exercise 9: Write a function minAndMax :: Int -> Int ->(Int,Int) which accepts two integers and returns a tuple containing the (min, max) of the numbers.

Exercise 10: Write a function big3 which takes three numbers and returns the largest of the three.  Use your function minAndMax to solve this.  (There are easier ways of writing big3, but the point is to be able to use a function returning a tuple.)

Built in Operators

In addition to symbolic operators such as + and *, two-argument functions such as mod or div (integer division) can be placed in back quotes and used as infix operators.  For example, x `div` 3.   They can also be applied in the usual prefix form without the back quotes. For example, div x 3.
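
The following sketch shows both forms side by side. The constant names are invented for this example; div and mod are the Prelude functions.

```haskell
-- Prefix use: the function name comes before its arguments.
quotientPrefix :: Int
quotientPrefix = div 17 5

-- Infix use: the same function, written between its arguments
-- in back quotes.
quotientInfix :: Int
quotientInfix = 17 `div` 5

remainderInfix :: Int
remainderInfix = 17 `mod` 5
```

Both quotients evaluate to 3, and the remainder evaluates to 2.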

 

User Defined Operators

The functions we define are prefix operators because they occur before their arguments.  Infix operators are used between their arguments (such as 4+3).  They can be written before their arguments (like normal function calls) by enclosing the operator in parentheses.  Thus,

Prelude> (+) 4 5

9

 

We can create user-defined infix operators by using operator symbols (! # $ & * + . /  ? \ ^  |  : - ~).

 

For example, we can define a new operator &&& as an infix min function

(&&&) :: Int -> Int -> Int

x &&& y

    | x > y = y

    | otherwise = x

 

Exercise 11: Create an infix function # which returns the average of the two Float arguments.  Note / performs floating point division.

Predefined Types, Constructors, and Classes

Here is a list of the types that all Haskell implementations support natively (i.e., excluding library-defined types, like Complex):

()

This is the unit datatype, with only one value, namely ().

Bool

Possible values are True and False.

Char

Literals are written as in C, namely 'a', 'b', etc.

String

Literals are written as in C, namely "A string".

Int

Fixed-precision integers. The Haskell specification requires that these cover at least the range -2^29 to 2^29 - 1. In practice, Hugs uses 32-bit Ints, and GHC uses the machine word size (32 or 64 bits).

Integer

Arbitrary-precision integers.

Float

Single-precision IEEE floating point values (same as a C float).

Double

Double-precision IEEE floating point values (same as a C double).

In addition, Haskell has several builtin type constructors.

Lists

The empty list is written []. New elements are added to the front of the list with the : operator; e.g., 1 : [2, 3, 4]. Lists are concatenated with the ++ operator; e.g., [1, 2, 3] ++ [4, 5, 6].

Tuples

Fixed number of elements, separated by commas, enclosed in parentheses.

All Haskell implementations support tuples up to size at least 15.

Algebraic datatypes

These are created with the data keyword. In fact, the unit type is defined as an algebraic datatype. We will say more about these later.
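
The list and tuple notation above can be tried out with a few definitions like these. This is only a sketch, and all of the names are chosen for illustration; fst is the Prelude function that returns the first component of a pair.

```haskell
-- Adding an element to the front of a list.
consed :: [Integer]
consed = 1 : [2, 3, 4]

-- Concatenating two lists.
joined :: [Integer]
joined = [1, 2, 3] ++ [4, 5, 6]

-- List syntax is shorthand for repeated use of : ending in [].
spelledOut :: [Integer]
spelledOut = 1 : 2 : 3 : []

-- A tuple may mix types; a list may not.
pair :: (Char, Integer)
pair = ('b', 4)

firstOfPair :: Char
firstOfPair = fst pair
```

Here consed is [1,2,3,4], joined is [1,2,3,4,5,6], spelledOut is [1,2,3], and firstOfPair is 'b'.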

Finally, Haskell supports a number of standard type classes. Think of these as being similar to Java interfaces. They guarantee the existence of certain operators on elements of the class.

Integral

Type class containing the Int and Integer types.

RealFloat

Type class containing the Float and Double types.

RealFrac

Type class containing RealFloat. The library-defined Ratio type is also a member of this class.

Floating

Type class containing RealFloat. The library-defined Complex type is also a member of this class.

Real

Type class containing Integral and RealFrac.

Fractional

Type class containing RealFrac and Floating.

Num

Type class containing Real and Fractional.

We will see some other important type classes later.

 

Algebraic types

We have seen base types of Int, Float, Bool, Char.

We have seen composite types such as tuples (t1,t2,t3,…,tn), list types [t1], and function types (t1 -> t2 -> … -> tn) where t1, t2, …, tn are themselves types.

 

However, there are still things we are missing such as

  • The types for the months January… December.
  • The type whose elements are either a number or a string.  (a house will either have a number or name, say)
  • A type of trees.

 

All of these can be modeled as algebraic types.

 

The simplest form of an algebraic type is an enumerated type.

data Season = Spring | Summer | Fall | Winter

data Weather = Rainy | Hot | Cold

data Ordering = LT | EQ | GT  -- already defined in the Prelude

 

A more complicated algebraic type (the product type) allows each constructor to carry values of other types.

data Student = USU String  Int
data Address = None | Addr String
data Age = Years Int
data Shape = Circle Float | Rectangle Float Float
 

Thus, a Student is formed from a String (call it st) and an Int (call it x), and the element of type Student formed from them is USU st x.

 

showStudent :: Student -> String

showStudent (USU name cred) = name ++ "--" ++ (show cred)

*Main> showStudent (USU "Vic" 45)

"Vic--45"
 
The function show is necessary to turn cred into a string (necessary for concatenation).

 

Pattern matching is used to define functions by cases:

isRound:: Shape -> Bool

isRound (Circle _) = True

isRound (Rectangle _ _ ) = False

area:: Shape -> Float

area (Circle r) = pi*r*r

area  (Rectangle h w) = h*w

 

The general form of the algebraic type is

 

data TypeName
   = Con1 T11 ... T1n
   | Con2 T21 ... T2m
   | ...

Each Coni is a constructor which may be followed by zero or more types.  We build elements of TypeName by applying these constructor functions to arguments.

 

Algebraic types can also be recursive:

 

 

data Expr = Lit Int | Add Expr Expr | Sub Expr Expr

data BST = Nil | Node Int BST BST

printBST:: BST -> [Int]

printBST Nil = []

printBST (Node x left right) = (printBST left) ++ [x]++ (printBST right)

 

Main> printBST (Node 5 (Node 3 Nil Nil) Nil)

[3,5]
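
The Expr type can be processed with the same kind of structural recursion. The evaluator below is a sketch; the function name eval is an assumption of this example, not something defined earlier in the tutorial.

```haskell
data Expr = Lit Int | Add Expr Expr | Sub Expr Expr

-- Evaluate an expression tree by recursion on its structure:
-- one equation per constructor.
eval :: Expr -> Int
eval (Lit n)     = n
eval (Add e1 e2) = eval e1 + eval e2
eval (Sub e1 e2) = eval e1 - eval e2
```

For example, eval (Sub (Add (Lit 2) (Lit 3)) (Lit 1)) computes (2 + 3) - 1 and gives 4.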

Exercise 12: Using the definitions above, write a function forecast :: Season -> Weather which accepts the season and predicts the weather forecast.  The only tricky thing is that the system needs to know how to turn instances of the enumerated type into a string.  This is accomplished  by:
 
instance Show Season where
  show Spring = "Sp"
  show Summer = "Su"
  show Fall = "Fa"
  show Winter = "Wn"
 
instance Show Weather where
  show Rainy = "Rainy"
  show Hot = "Hot"
  show Cold = "Cold"
 
When the string to be shown is simply the name of the constructor, a simpler way is to add the line deriving (Show) to the type definition:

data Season = Spring | Summer | Fall | Winter

  deriving (Ord,Eq,Show)
 
This not only makes Season an instance of Show, but also of Ord (its values can be ordered, in the order the constructors are listed) and of Eq (its values can be compared for equality).
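
A short sketch of what the derived instances provide. The helper names isBeforeFall and label are invented for this example.

```haskell
data Season = Spring | Summer | Fall | Winter
  deriving (Ord, Eq, Show)

-- Ord supplies < and >, comparing constructors in the order in which
-- they are listed in the data definition.
isBeforeFall :: Season -> Bool
isBeforeFall s = s < Fall

-- Show supplies show, which here yields the constructor name.
label :: Season -> String
label s = "Season: " ++ show s
```

With these definitions, isBeforeFall Spring is True, isBeforeFall Winter is False, and label Summer is "Season: Summer".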
 
Exercise: Write the code to insert a value into the binary search tree BST. A constant function definition (which returns a tree) may be helpful in your testing.
aTree = (Node 5 (Node 3 Nil Nil) (Node 10 Nil Nil))

Type synonyms

For convenience, Haskell provides a way to define type synonyms, i.e., names for commonly used types. Type synonyms are created using type declarations and are like typedefs in C. Examples include:

 

type String = [Char]
type Person = (Name, Address)
type Student2 = (String,Int)
type Score = (String,Int)
type Name   = String

 

This definition of String is part of Haskell, and in fact the literal syntax "hello" is shorthand for the list of characters below:

 
['h','e','l','l','o']
 
 
When you use these definitions, Hugs gets confused because String is already defined in the Prelude.  The message Hugs prints is something like: Error “fact.hs” Ambiguous type constructor occurrence “String”.
*** Could refer to: Fact.String Hugs.Prelude.String
 
To redefine a Prelude type, after the module line, include the line
 
import Prelude hiding (String)
 
If there are several things that are redefined, the line might look like
import Prelude hiding (String,min,max)
 
Why use Student over Student2?  First, it is more readable; second, it cannot be confused with other types (such as Score).  However, Student2 has the advantage of being in a basic form, so built-in functions can be used.
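
Both points can be seen in a small sketch. The names studentName, aScore, and confused are made up for this example.

```haskell
type Student2 = (String, Int)
type Score    = (String, Int)

-- fst is a Prelude function on pairs, so it works on Student2 directly.
studentName :: Student2 -> String
studentName s = fst s

aScore :: Score
aScore = ("Quiz 1", 90)

-- Both synonyms name the same pair type, so a Score is accepted
-- wherever a Student2 is expected; the type checker cannot tell
-- them apart.
confused :: String
confused = studentName aScore
```

The last definition is accepted even though a quiz score is not a student, which is exactly the kind of confusion the data version Student rules out.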

 

 
Exercise 13: Using the definitions above, write a function getName :: Person -> Name which accepts a person as input and returns just the name.  Note that an address is not just a string but looks like
Addr "100 East 241 South, Richmond".  Thus, the call to your function looks like
    getName ("Robert Johnson", Addr "241 Highway 89")

 

Recursion, or “Who stole loops and counters?”

Recursive functions are often specified with patterns. That is, the function is written with more than one equation, and the Haskell runtime system does pattern-matching on the parameters to figure out which equation is applicable. For example, here is the factorial function written in Haskell.

 
  fact 0 = 1                   -- base case
  fact n = n * fact (n - 1)    -- induction step
        

When this function is called, Haskell checks whether the parameter is 0. If so, it matches the first equation and evaluates to 1. Otherwise, it matches the second equation. Functions can be written with any number of equations.
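
Here is another function in the same style, together with a guarded variant of factorial. Both are sketches with made-up names (sumTo and factSafe). The variant addresses one weakness of fact as written: for a negative argument the first equation never matches, so the recursion never reaches the base case.

```haskell
-- Sum of the integers from 0 to n, written with two equations.
sumTo :: Integer -> Integer
sumTo 0 = 0
sumTo n = n + sumTo (n - 1)

-- Factorial that rejects negative input instead of recursing forever.
factSafe :: Integer -> Integer
factSafe n
   | n < 0     = error "factSafe: negative argument"
   | n == 0    = 1
   | otherwise = n * factSafe (n - 1)
```

Evaluating sumTo 4 gives 10 and factSafe 5 gives 120.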

Exercise 14: Suppose we have to raise a to a power n.  If n is even (2*m, say), then

   a^n = a^(2*m) = (a^m)^2

Otherwise, if n is odd (2*m+1, say), then

   a^n = a^(2*m+1) = a * (a^m)^2

Use this fact to define a recursive function pow :: Int -> Int -> Int.

Lists

We encountered tuples in the quadratic equation solver. Tuples can have 2, 3, 4, or more elements in them, but the number of elements is fixed. Also, the elements of a tuple can have different types. For example, the first element of a pair may be a name (a string) and the second the age (a number).