A naïve attempt at parsing CSS in JavaScript – Part 1
All I want to be able to do is colour the CSS on this website by understanding when something is a selector, when it is a property and when it is a rule. The CSS specifications have a section on tokenizing and parsing CSS.
For this project, I will jump right over the input byte stream and preprocessing sections and look at tokenization.
The first codeable instruction is to ‘repeatedly consume a token until’ the end of the file is reached. For my project, that means looping through the characters in a string, so I’ll start with that:
const highlight = cssStr => {
  for (let i = 0; i < cssStr.length; i++) {
    // consume characters
  }
  return '';
}
The tokens then need to be consumed. The specification describes how to consume comments, which I interpret like this:
const highlight = cssStr => {
  const state = {};
  for (let i = 0; i < cssStr.length; i++) {
    // Consume comments
    if (!state.isComment && (cssStr[i] === '/') && (cssStr[i+1] === '*')) {
      state.isComment = true;
      state.commentStart = i;
    }
    // The closing */ must come after the opening /*, so "/*/" does not end the comment
    else if (state.isComment && (i > state.commentStart + 2) && (cssStr[i-1] === '*') && (cssStr[i] === '/')) {
      state.isComment = false;
    }
  }
  return;
}
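To check that the flag flips where expected, the same loop can record where each comment starts and ends. This is only a sketch: findComments and the start/end field names are made up for illustration and do not come from the specification.

```javascript
// findComments is a hypothetical helper, not something the specification defines.
const findComments = cssStr => {
  const state = {};
  const comments = [];
  for (let i = 0; i < cssStr.length; i++) {
    if (!state.isComment && (cssStr[i] === '/') && (cssStr[i+1] === '*')) {
      state.isComment = true;
      state.start = i;
    }
    // Only accept a closing */ that sits after the opening /*
    else if (state.isComment && (i > state.start + 2) && (cssStr[i-1] === '*') && (cssStr[i] === '/')) {
      state.isComment = false;
      // end is exclusive, so cssStr.slice(start, end) is the whole comment
      comments.push({ start: state.start, end: i + 1 });
    }
  }
  return comments;
};

// Logs a single range covering the comment: start 0, end 13
console.log(findComments('/* comment */\np { margin-bottom: 1em; }'));
```

A comment that is never closed produces no range here, which is one reason to treat this as a debugging aid rather than a tokenizer.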
Technically, I should be collecting tokens. However, I just want to know where to insert tags into the CSS to colour-code the different types of elements. For example, just focusing on comments, I would want this:
/* comment */
p { margin-bottom: 1em; }
To turn into this:
<span class="comment">/* comment */</span>
p { margin-bottom: 1em; }
Here’s an attempt at that in code:
const highlight = cssStr => {
  const state = {};
  let output = '';
  for (let i = 0; i < cssStr.length; i++) {
    // Consume comments
    if (!state.isComment && (cssStr[i] === '/') && (cssStr[i+1] === '*')) {
      state.isComment = true;
      state.commentStart = i;
      output += '<span class="comment">';
    }
    output += cssStr[i];
    // The closing */ must come after the opening /*, so "/*/" does not end the comment
    if (state.isComment && (i > state.commentStart + 2) && (cssStr[i-1] === '*') && (cssStr[i] === '/')) {
      state.isComment = false;
      output += '</span>';
    }
  }
  return output;
}
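A flag-based loop like this leaves the span open if a comment is never closed. The specification says a comment runs up to the closing */ or to the end of the input, so one alternative is to look ahead with indexOf rather than track a flag. The following is a sketch of that idea under those assumptions, not a replacement for the code above; highlightComments is a made-up name.

```javascript
// Hypothetical variant: finds the end of each comment with indexOf
// instead of tracking an isComment flag.
const highlightComments = cssStr => {
  let output = '';
  let i = 0;
  while (i < cssStr.length) {
    if (cssStr.startsWith('/*', i)) {
      // Start searching after the opening /*, so "/*/" is not a complete comment
      const close = cssStr.indexOf('*/', i + 2);
      // An unterminated comment runs to the end of the input
      const end = close === -1 ? cssStr.length : close + 2;
      output += '<span class="comment">' + cssStr.slice(i, end) + '</span>';
      i = end;
    } else {
      output += cssStr[i];
      i++;
    }
  }
  return output;
};

console.log(highlightComments('/* comment */\np { margin-bottom: 1em; }'));
```

Because the whole comment is copied in one slice, the span is always closed, even when the input ends in the middle of a comment.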
But when I am reading the HTML characters, I’m actually going to be encountering HTML-encoded syntax. That is, > will look like &gt;. That’s the challenge for tomorrow.