The Scala Essentials

George Francis Jr
8 min read, Sep 6, 2017

As Li Haoyi noted, Scala is a complex programming language that offers a multitude of APIs to solve the same problem. When I first started my Scala journey, I was often confused about which API was the correct one to use for a given task. Did my code use idiomatic Scala techniques? Was it readable? After spending a few years studying the language and interviewing top Scala developers, I’ve finally cracked this code. Here, I break down a few essential tools and tips for becoming a master Scala developer.

1. Ammonite REPL: Read. Evaluate. Print. Loop. The holy grail of all tools in Scala land. A REPL is an interactive interpreter that lets you experiment with the Scala language. Experiment, experiment, experiment! Open your favorite text editor, write small, pure functions and test out your algorithms immediately with this tool! Before you get started, here is how you set up a playground to try out the concepts that we will go over. Make sure Java 1.8 and Scala 2.12 are installed and execute the commands below in a terminal (macOS, Unix, Linux).

[Figure: example output of the commands above, captioned “Setting up REPL for the 1st time”]

With Ammonite you can dynamically import your favorite APIs! I have pre-configured a few imports for you. Check out predef.sc and add your own initialization code!

[Figure: Dynamically import a library using Ivy]

[Figure: a snapshot of the REPL in action]

2. Case Classes: Store immutable data. Our first line of defense in the war for data integrity. Tip: Fail fast! Create well-defined business requirements. Use require() statements to create strongly typed case classes that only accept valid data! require() statements run as part of the constructor, before a usable instance is ever returned. If the data does not meet our requirements, a java.lang.IllegalArgumentException is thrown immediately. Try out the code below in the REPL:
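A minimal sketch of the fail-fast idea, assuming a hypothetical Person case class (the name, fields and age limits are illustrative, not taken from the original snippet). Wrapping the constructor call in Try simply makes both outcomes easy to inspect.

```scala
import scala.util.Try

// Hypothetical example: Person, name and age are illustrative names only.
case class Person(name: String, age: Int) {
  // require() runs in the constructor body, so invalid data never yields an instance.
  require(name.trim.nonEmpty, "name must not be empty")
  require(age >= 0 && age <= 150, s"age must be between 0 and 150 but was $age")
}

// Valid data: the instance is created.
val valid = Try(Person("Ada", 36))

// Invalid data: require() throws java.lang.IllegalArgumentException,
// which Try captures as a Failure.
val invalid = Try(Person("Ada", -1))
```

Calling Person("Ada", -1) directly, without Try, would throw the exception straight into the REPL.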

[Figures: REPL output. Failure scenario: invalid data. Success scenario: valid data]

3. Sealed Traits: Scala’s version of Algebraic Data Types (ADTs). ADTs allow us to define a fixed set of possible values for a given type. Scala has an abundance of ADTs: Option (Some OR None), Try (Success OR Failure) and List (head :: tail OR Nil), just to name a few. Strictly speaking, the parent type is the sum type (logical or) of its children, but this article loosely calls the children themselves sum types. For example, Success and Failure are the sum types of Try. The constructor arguments of a child type form a product type (logical and). Scala provides a useful feature called pattern matching for ADTs. Because a sealed trait can only be extended in the file that declares it, the Scala compiler can check that a match block covers all of the possible sum types and warn you if you missed a case! The compiler also checks the constructor arguments used in each pattern. Tip: Turn warnings into errors by using the following compiler flag: -Xfatal-warnings. Let’s modify the previous example:
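A minimal sketch of a sealed trait with an exhaustive match. The AgeGroup hierarchy is an assumption made for illustration; only the Adult case is named in the original output captions.

```scala
// Hypothetical ADT: AgeGroup, Child and Teen are illustrative names.
sealed trait AgeGroup
case class Child(age: Int) extends AgeGroup
case class Teen(age: Int) extends AgeGroup
case class Adult(age: Int) extends AgeGroup

def describe(group: AgeGroup): String = group match {
  case Child(age) => s"child aged $age"
  case Teen(age)  => s"teen aged $age"
  // Deleting the next line makes the compiler warn that the match
  // may not be exhaustive, because Adult is no longer covered.
  case Adult(age) => s"adult aged $age"
}
```

With -Xfatal-warnings enabled, the missing case becomes a compile error instead of a warning.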

[Figures: REPL output. Version 1 (missing sum type): Adult is missing. Version 2: complete set of sum types]

4. Regex w/extractors: A regular expression is an invaluable tool for parsing formatted strings. However, matching and extracting data with regex APIs proves to be a daunting task for many developers. The Java approach typically involves creating a regular expression that defines matching groups and writing verbose logic to check for matches and extract data based on the group index (Java 7 introduced named groups). In Scala, a regular expression doubles as an extractor, so we can bind each capture group to a name directly in a match block, and even nest these patterns! Tips: Write a pure function that accepts a string and returns an Option[T]. Keep your regular expressions simple. If you are parsing a delimited string, write a generic expression that focuses on capturing the column data between the delimiters. Use Try blocks to attempt to instantiate the case class. See the example below.
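A minimal sketch of these tips, assuming a hypothetical comma-delimited "name,age" input and a hypothetical User case class. The pattern only captures the column data between the delimiter; validation is left to Try.

```scala
import scala.util.Try
import scala.util.matching.Regex

// Hypothetical record type and input format ("name,age"); both are assumptions.
case class User(name: String, age: Int)

// Generic pattern: capture whatever sits on either side of a single comma.
val Row: Regex = """([^,]*),([^,]*)""".r

// Pure function: String in, Option[User] out.
def parseUser(line: String): Option[User] = line match {
  // The Regex extractor binds each capture group to a name, by position.
  // Try turns a failed conversion (for example a non-numeric age) into None.
  case Row(name, age) => Try(User(name.trim, age.trim.toInt)).toOption
  // The whole line must match the pattern, otherwise we land here.
  case _              => None
}
```

For example, parseUser("Ada,36") yields Some(User("Ada", 36)), while parseUser("Ada,abc") and parseUser("no delimiter") both yield None.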

[Figures: REPL output and test case output]

5. Higher Order Functions (HOF): Functions that take a function (or lambda expression) as input and/or return a function as output. The HOFs provided by the Scala collections API (Array, List, Set, etc.) abstract the for loop out of the equation. These HOFs loop over the collection internally, applying the function f to each element. The idea is that developers can focus on transforming the data without worrying about mundane for loop logic and the null pointer exceptions that hand-written loops can cause. map is one of the most commonly used HOFs. It accepts a unary function (one input) f that transforms the data. map is 1:1: if you have a list with 10 elements and you pass a function to map, for example someList.map(f), you will get a new list with 10 elements. filter is another HOF; it accepts a predicate function p, which returns a Boolean. someList.filter(p) gives us a new list that is the same size as or smaller than the original. As a side note, Apache Spark was mostly developed in Scala! map, flatMap, filter, etc. work the same way in Spark’s RDD API as they do in Scala. The only difference is that Spark scales the operations across nodes in a cluster.

Let’s try out the example below.
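A minimal sketch of map and filter on a List, plus a function that returns a function. The helper name multiplyBy is an assumption made for illustration.

```scala
val numbers = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// map is 1:1, so the result has the same number of elements as the input.
val squares = numbers.map(x => x * x)

// filter keeps only the elements for which the predicate returns true.
val evens = numbers.filter(x => x % 2 == 0)

// A higher order function can also return a function (hypothetical helper).
def multiplyBy(factor: Int): Int => Int = x => x * factor

// The returned function is passed straight to map.
val tripled = numbers.map(multiplyBy(3))
```

None of these calls modify numbers; each one returns a new list.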

[Figure: REPL output]

6. Monads: Container types (data wrappers). Many of Scala’s built-in algebraic data types double as monads. The monads covered here have one sum type that represents the happy path and one that represents the edge case. For example, Option -> Some is a container for data that is not null. Option -> None denotes the case where data is missing (null in Java). List -> :: represents the case where the list contains a head and a tail (at least 1 element). List -> Nil denotes the case where the list is empty.

Let’s experiment with the Try monad. The Try monad has two sum types: Success & Failure. Success is a wrapper containing the result of the Try block. Failure is a wrapper containing the exception that was thrown.
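A minimal sketch of the two outcomes, assuming a hypothetical parseInt helper that wraps a string-to-Int conversion in a Try block.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper: the conversion may throw, so it runs inside a Try block.
def parseInt(s: String): Try[Int] = Try(s.trim.toInt)

// Pattern match on the two sum types of Try.
def report(result: Try[Int]): String = result match {
  case Success(n)  => s"parsed $n"
  case Failure(ex) => s"failed with ${ex.getClass.getSimpleName}"
}

val ok  = report(parseInt("42"))   // "parsed 42"
val bad = report(parseInt("4x2"))  // "failed with NumberFormatException"
```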

[Figures: REPL output, Version 1 and Version 2]

What happens if we chain map calls on the Success and Failure instances of the monad? What about when f(x) returns another monad? Let’s Try it out :-)
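A minimal sketch of both questions, using hypothetical inputs. The helper percentOf is an assumption made for illustration; it returns a Try itself so that map produces a nested monad.

```scala
import scala.util.Try

val success: Try[Int] = Try("10".toInt)
val failure: Try[Int] = Try("ten".toInt)

// Each map runs only on Success; a Failure passes through untouched.
val doubledTwice = success.map(_ * 2).map(_ * 2)  // Success(40)
val stillFailure = failure.map(_ * 2).map(_ * 2)  // the original Failure

// Hypothetical helper that returns another Try (division may throw).
def percentOf(n: Int): Try[Int] = Try(100 / n)

// map wraps the Try returned by percentOf in a second Try.
val nested: Try[Try[Int]] = success.map(percentOf)  // Success(Success(10))
```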

When the Try instance is a Success (the happy path), map applies f(x). When it is a Failure (the edge case), map does nothing and returns the Failure instance. These statements hold true no matter how many times we chain map calls on a monad. Monads transform data while handling edge cases gracefully :-) Monads have two requirements: unit and flatMap (aka bind). Unit is a function that takes the data and wraps it with the monad. For example, Some(x) or Some.apply(x) satisfies the unit requirement for the Option monad. flatMap accepts a unary function f(x) that itself returns a monad, conditionally unwraps the monad and returns f(x) only if the monad instance does not represent the edge case. Let’s write our own map and flatMap functions for the Option monad to better understand what is happening behind the scenes:

sealed abstract class Option[+A] {
  ...
  ...
  ...
  def map[B](f: A => B): Option[B] =
    this match {
      case Some(x) => Some(f(x)) // Perform f(x) and wrap the result
      case None    => None       // Edge case
    }
  def flatMap[B](f: A => Option[B]): Option[B] =
    this match {
      case Some(x) => f(x) // No wrapping after performing f(x)
      case None    => None // Edge case
    }
}
case class Some[+A](a: A) extends Option[A] {...}
// None carries no data, so it is a single case object rather than a case class
case object None extends Option[Nothing] {...}

In the code above, notice how map unwraps the monad, performs f(x) and then re-wraps the result in the Option monad. flatMap unwraps the monad, performs f(x) and returns the result as is, because f(x) is already an Option. This concept of unwrapping proves to be essential when dealing with 2 or more monads of the same type.

Let’s rewrite the previous example using flatMap.
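A minimal sketch of flatMap avoiding the nesting, assuming two hypothetical helpers (parse and divide100By) that each return a Try.

```scala
import scala.util.Try

// Hypothetical helpers: each step may fail, so each returns a Try.
def parse(s: String): Try[Int] = Try(s.trim.toInt)
def divide100By(n: Int): Try[Int] = Try(100 / n)

// flatMap returns the inner Try directly: Try[Int], not Try[Try[Int]].
val flat: Try[Int] = parse("10").flatMap(divide100By)  // Success(10)

// The second step fails: the Failure holds an ArithmeticException.
val divByZero = parse("0").flatMap(divide100By)

// The first step fails: divide100By never runs.
val unparsable = parse("ten").flatMap(divide100By)
```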

Unwrapping nested monads w/flatMap

Consider the following scenario.

If today is Sunday and the television is available (not in use), I will watch Game of Thrones.

import java.time.LocalDate

trait Sunday
trait Television
trait GameOfThrones
def getSunday(localDate: LocalDate): Option[Sunday] = {...}
def getTelevision(sunday: Sunday): Option[Television] = {...}
def watchGameOfThrones(television: Television): GameOfThrones = {...}
val gameOfThrones: Option[GameOfThrones] =
  getSunday(localDate)
    .flatMap(sunday => getTelevision(sunday)) // unwrap Sunday, return Option[Television]
    .map(tv => watchGameOfThrones(tv))        // unwrap Television, re-wrap the result

FlatMapping/mapping 2 or more monads tends to make the code harder to reason about. Scala has a really useful monad assistant called the for comprehension. The for comprehension abstracts the flatMap/map calls for us, making the code much easier to read! Let’s rewrite the logic above:

val gameOfThrones: Option[GameOfThrones] = for {
  sunday <- getSunday(...)
  tv <- getTelevision(sunday)
} yield watchGameOfThrones(tv) // yield a value, not the GameOfThrones trait itself
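To make the equivalence concrete, here is a small self-contained sketch that can be pasted into the REPL. The price and stock lookups are hypothetical data made up for illustration; the second function shows roughly what the compiler generates from the first.

```scala
// Hypothetical data: the maps and item names are made up for illustration.
val prices: Map[String, Double] = Map("book" -> 12.5, "pen" -> 1.5)
val stock: Map[String, Int]     = Map("book" -> 3)

// for comprehension: reads top to bottom and stops at the first None.
def stockValue(item: String): Option[Double] = for {
  price    <- prices.get(item)
  quantity <- stock.get(item)
} yield price * quantity

// Roughly the flatMap/map chain that the for comprehension above expands to.
def stockValueDesugared(item: String): Option[Double] =
  prices.get(item).flatMap(price => stock.get(item).map(quantity => price * quantity))
```

Every generator except the last becomes a flatMap call, and the last one becomes the map call that produces the yielded value.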

Here is a more practical example to try in the REPL:

[Figures: REPL output. Success scenario output. Failure scenario output]

In the example above, we wrote code simulating the workflow of purchasing and receiving a product from Amazon. We wrapped our long-running mock web service calls in Futures. Future is a monad that represents an asynchronous computation. In order for our package to be delivered, the purchaseSession and shippedItem Futures must both complete successfully. Notice that the mock web service functions did not explicitly return Futures; instead, we wrapped the function calls in Future monads. This simplifies testing, as you can test the external APIs in isolation without being forced to test the concurrency logic. Tip: Avoid statements that wait for a Future to complete, as this blocks the current thread, preventing it from processing other requests and essentially negating the benefits of asynchronous execution. Remember that f(x) in the flatMap/map calls only gets executed for the positive sum type of the monad/ADT. In the case of Futures, that means only on Success.
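A minimal sketch of such a workflow, under stated assumptions: the case classes, the purchase and ship functions and the tracking code format are all hypothetical stand-ins for the mock web services; only the names purchaseSession and shippedItem come from the text.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical mock services: plain functions that know nothing about Futures,
// so they can be tested in isolation.
case class PurchaseSession(orderId: Int)
case class ShippedItem(orderId: Int, trackingCode: String)

def purchase(item: String): PurchaseSession =
  if (item.nonEmpty) PurchaseSession(1)
  else throw new IllegalArgumentException("no item selected")

def ship(session: PurchaseSession): ShippedItem =
  ShippedItem(session.orderId, "TRACK-" + session.orderId)

// The calls are wrapped in Futures here, at the call site.
// ship only runs if the purchase Future completes with Success.
def deliver(item: String): Future[String] = for {
  purchaseSession <- Future(purchase(item))
  shippedItem     <- Future(ship(purchaseSession))
} yield s"delivered ${shippedItem.trackingCode}"
```

In the REPL, deliver("book").foreach(println) registers a callback that prints the message once the Future completes, without blocking the current thread. deliver("") completes with a Failure, so the callback never runs.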
