This is the third installment in a series that parallels Matt Brubeck’s guided tour through making a browser engine so simplistic that its design can be captured in a handful of blog entries. The first entry offered a minimal set of types for a document object model. The second entry described a basic parser that could reify that model from an HTML document. This entry repeats those two exercises for a different kind of document that browsers must understand: a cascading style sheet.
As usual, you should read our fearless leader’s post prior to reading this one. After all, Matt works for Mozilla, while I’m just someone who thought it would be fun to follow along.
CSS Model
Cascading Style Sheets (CSS) is a declarative language specific to the domain of associating elements in some markup language with properties describing their format and layout. The structure of the language, at the largest scale, is just a list of rules.
1 2 3 |
|
One rule might look like this:
1
|
|
Each rule is made of a comma-delimited list of selectors followed by a declaration block, the former being everything prior to the opening curly-brace, and the latter being a semicolon-delimited list of declarations bounded by curly-braces.
1 2 3 4 |
|
We’re going to confine ourselves to a well-trafficked subset of selector syntax, but we’re going to allow a place to extend our model later should we decide to support selector combinators or selector namespaces. We’ll do this by defining an enum
– which, for now, will only have one case:
1 2 3 |
|
That case encompasses the familiar:
…a simple selector can include a tag name, an ID prefixed by ‘#’, any number of class names prefixed by ‘.’, or some combination of the above.
1 2 3 4 5 |
|
Each declaration in the semicolon-delimited list is just a name/value pair separated by a colon.
1 2 3 4 |
|
The values in these pairs, however, can take many forms. Again, we aim to ignore the devil hiding in the mountain of details so we limit ourselves to simple keywords, pixel-unit values, and colors.
1 2 3 4 5 6 7 8 9 |
|
Now that we have some types we can use to represent simple stylesheets, we move on to making a corresponding parser.
CSS Parser
Our quick-and-dirty parser implementation looks very much like the HTML parser in the last post, with a handlful of functions that invoke the consumeWhile()
function with various predicates.
1 2 3 4 |
|
This is a good place to mention a little bit of follow-up from the last post. Recall that I ran into a little hurdle when I tried to test character-set membership using a Swift Character
type, and ended up creating bridging the gap through NSString
, which was a bit of a kluge. It turns out that this hack only became necessary because I was insisting on using the old NSCharacterSet
class. If I had given up on that API, I might have ended up with some predicates that looked a lot more like their Rust counterparts in the Robinson code.
After playing around in a Swift Playground, I figured out that ranges of Characters actually do work after all. I just had to use ...
instead of ..
, and replace Rust’s |
operator with commas.
1 2 3 4 5 6 |
|
That said, I would have had to write some extra code either way, because Matt’s code gets to lean on Rust’s std::char::is_whitespace
method. Swift’s Character
doesn’t have such a method (yet) so either I create one of my own – similar to the validIdentifierChar()
method above – or I set up the Rube Goldberg bridge to NSCharacterSet
.
Now we can look at our method to parse a SimpleSelector
. You may have noticed that I declared its fields as var
above. There was a reason for that: the way the following method imitates its Rust counterpart requires it. Notice how we allocate the SimpleSelector
before entering the loop, and mutate its fields as we encounter characters in the loop. If I really wanted to keep my structure immutable, I could have created more local variables and constructed the SimpleSelector
at the end of the loop.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
The type alias for the Specificity
tuple looks very much like the equivalent in Robinson. I was tempted here to use Swift’s feature that allows us to name the three fields of the tuple. There were a few moments where I had to keep reminding myself which field represented which part of the selector, so I probably should have leveraged that feature.
1
|
|
I wrote the function that computes the specificity of a selector turned out as a Swift “computed property”. There was no particular reason for that, and I’m not yet sure when one should or should not do that.
There is more to say about how this compares to the equivalent method in Robinson, though. Notice how Rust’s std::option
is an actual collection with an iterator that you can query for cardinality. This is something you cannot yet do with Swift’s Optional
. I don’t mind having to substitute some ternary expressions here, but I do like the Rust way.
1 2 3 4 5 6 7 8 9 10 11 |
|
Here is the last pair of methods mentioned in Matt’s post. As with most of what we have seen so far, the Swift versions looks very much like the Rust counterparts. The most obvious difference here is that the break
statement requires an outerLoop
label to signify that it is the while
loop that we want to break out of, and not the switch
. Rust doesn’t have this ambiguity. The reason can be traced back to Rust’s expression-oriented nature: its match
expression returns a value. What value should it return if one were allowed to use break
to escape the match
expression?
I bet that Swift will likely be the last major new language to go down the statement-oriented path. If I could have only one wish to make Swift better, I would wish that it had taken the high road.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
|
Finally, I would like to point out that the closure given to the sort
method above works because I overloaded the >
operator. This is another thing that Rust gives for free. Since every slot in Matt’s Specificity
tuple is occupied by something that implements Ord
, the entire tuple automatically implements Ord
, and can therefore be compared in a manner somewhat similar to how Python allows. In the absence of such linguistic luxury, I implemented this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
In the next post, we will take up the two proton packs we have built and ignore Dr. Egon Spengler’s admonition to never cross the streams.
In the mean time, you may want to check out Martin Tomasi’s project, which is following this same exercise, but is implemented in Java.