Introducing string literal types in BuckleScript version 8.2
Highlights of our newest changes to the internal representation and how they will benefit our users.
Important: This is an archived blog post, kept for historical reasons. Please note that this information might be outdated.
String literal types in BuckleScript
String literal types were introduced by TypeScript to model JavaScript behavior, it's a relatively new concept since most type systems are runtime encoding agnostic. However, to smooth the user experience when writing bindings to existing JS API, we are introducing string literal types which are unique in several behaviors compared with TypeScript: they support type inference, pattern matching and can be attached to data.
Vanilla string literal types
The notation in Reason for string literal types is like this: `hello
, which will be compiled into "hello".
The difference is that `hello
is given a type so that you can not mix it with other strings.
Take the following code snippet as an example:
RElet encoding = (enc) =>
switch (enc) {
| `utf8 => 0
| `ascii => 1
| `utf16 => 2
};
It will be compiled into
JSfunction encoding(en) {
if (en === "ascii") {
return 1;
} else if (en === "utf16") {
return 2;
} else {
return 0;
}
}
If you pass a random encoding, e.g, encoding (`ucs32)
, you get a type error:
This expression has type [> `ucs32 ] but an expression was expected of type [< `ascii | `utf16 | `utf8 ] The second variant type does not allow tag(s) `ucs32
Another thing you can observe from the generated JS is that since the compiler can guarantee that the input could only be `utf8, `ascii,`utf16
, it will skip the comparison with "utf8" when the first two are compared.
If we add a wild card to match any encoding
RElet encoding = (enc) =>
switch (enc) {
| `utf8 => 0
| `ascii => 1
| `utf16 => 2
| _ => 3
};
It will generate JS as below:
JSfunction encoding(en) {
if (en === "utf8") {
return 0;
} else if (en === "ascii") {
return 1;
} else if (en === "utf16") {
return 2;
} else {
return 3;
}
}
Declaring types for string literal types
Note that all string literal types can be inferred. This is very convenient for you when you are doing development. When things get more stable, it would be nice to give string literal types a name as below:
REtype utf = [
| `utf8
| `utf19
];
type ascii = [
| `ascii
]
You can also embed string literal types directly inside other types without declaring it first:
REtype t = {
encodings : list([ | `utf8 | `ascii ])
}
The cool thing is that you can create union types by simply putting the types together:
REtype encoding = [
| utf
| ascii
]
The compiler even supports sugar over named string literal types:
RElet classify = (enc) =>
switch (enc) {
| #utf => "utf" // string literals belong to utf type
| #ascii => "ascii" // string literals belog to ascii type
};
The compiler would generate well optimized code as below:
JSfunction classify(enc) {
if (enc === "ascii") {
return "ascii";
} else {
return "utf";
}
}
String literal types in bindings
Since string literal types are just strings after type checking, you can use them to bind to js libraries directly without any conversion, as follows:
REtype encoding = [
| `hex
| `utf8
| `ascii
| `latin1
| `ucs2
| `base64
| `binary
| `utf16le
];
[@bs.val] [@bs.module "fs"]
external readFileSync: (string, encoding) => string = "readFileSync";
String literal types attached to data
Since Reason is a typed language, you can not mix data of different types in a collection.
For example, you will get a type error when writing code like this: [ 3, "3" ]
.
The deep reason is that if the compiler allows you to do such things after you box different types of data in a single collection, it is hard to give such collection a type and process it later.
With string literal types, you can do things like this:
RE[ 3 -> `Int , "3" -> `String ]
Note the generated code for 3 -> `Int`, "3"-> `String
would be:
JS{ NAME: "Int", VAL : 3}
{ NAME: "String", VAL : "3"}
And you can also write code to process such collections:
RElet handle = (xs) =>
Belt.List.map(
xs,
(param) => switch(param){
| `Int(n) => n
| `String(s) => String.length(s)
},
);
The generated code would be:
JSfunction handle(xs) {
return Belt_List.map(xs, function (param) {
if (param.NAME === "Int") {
return param.VAL;
} else {
return param.VAL.length;
}
});
}
To conclude, string literal types give users a convenient way to mix data with different types and process it via pattern matching later.
Declaring types for string literal types attached to data
Type inference is great during development. Users can also write down the formal types for string literal types attached to data:
REtype number_or_string = [
| `Int(int)
| `String(string)
];
Further reading
Here we only cover most daily usage of string literal types. For more advanced usage, see here. The type theory is almost the same, however, we adapt it to make sure it is compiled into string literals to match the JS runtime.