diff --git a/src/posts/2020-05-27-unicode-support-in-go.md b/src/posts/2020-05-27-unicode-support-in-go.md
index 7bd0c8a..a337e9d 100644
--- a/src/posts/2020-05-27-unicode-support-in-go.md
+++ b/src/posts/2020-05-27-unicode-support-in-go.md
@@ -5,6 +5,8 @@ date: 2020-05-27
 tags: golang
 summary: "With Go being a relatively modern programming language, first released in 2009, it is not unsurprising that it has great support for Unicode strings. What is surprising is just how far this support goes."
 ---
+_Updated 2020-05-28: Added big list of naughty strings test_
+
 
 With Go being a relatively modern programming language, first released in 2009, it is not unsurprising that it has great support for Unicode strings. What is surprising is just how far this support goes.
 
@@ -126,3 +128,61 @@ print(𝕧𝕒𝕣𝕚𝕒𝕓𝕝𝕖)
 ## In the wild
 
 I've tried to find other examples of these non-Latin Unicode characters being used in real code but have so far come up empty other than Gomega. I had assumed there'd be examples of code written in Russian or Chinese that made use of this but I can't seem to find any. Perhaps having a mix of native language variables and functions mixed with the English build in library functions isn't such a desireable outcome.
+
+
+## Update
+
+After posting this, it was suggested that I try the [big list of naughty strings](https://github.com/minimaxir/big-list-of-naughty-strings) to see how many of them Go can handle. This list is a collection of strings that often cause problems for programs in one way or another.
+
+I put together a [fairly simple test case](https://share.cluster.fun/golangnaughtystringstest.js) that used each string as a variable name and then tested whether the code would build. To ensure as many strings from the list as possible could be attempted, I removed all spaces from the strings.
+
+The results were a bit surprising...
+
+> 72 of the 506 strings are valid variable names in Go
+
+(Note: this number may be higher than it should be because I removed spaces from the strings.)
+
+Of those 72 valid strings, some are what we'd expect, similar to what we covered above:
+
+* `ﷺ`
+* `𝕿𝖍𝖊𝖖𝖚𝖎𝖈𝖐𝖇𝖗𝖔𝖜𝖓𝖋𝖔𝖝𝖏𝖚𝖒𝖕𝖘𝖔𝖛𝖊𝖗𝖙𝖍𝖊𝖑𝖆𝖟𝖞𝖉𝖔𝖌`
+* `田中さんにあげて下さい`
+
+But there are a few that are really surprising:
+
+* `nil`
+* `true`
+* `false`
+
+
+So, it turns out this is a perfectly valid Go program:
+
+```go
+package main
+
+import "fmt"
+
+func main() {
+	nil := "Not a value"
+	false := 55
+
+	if !true() {
+		fmt.Println(false)
+	}
+
+	fmt.Println(nil)
+}
+
+func true() bool {
+	return false
+}
+```
+
+When run, this outputs:
+
+```
+55
+Not a value
+```
+
+Please, please, never do this in your code.
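
For anyone wanting to poke at this without building a program per string: `nil`, `true` and `false` are predeclared identifiers in Go rather than keywords, which is why they can be shadowed. Below is a rough sketch of a cheaper check using the standard library's `go/token` package. It is not the build-based test described above, and `isValidName` is a hypothetical helper name; it only asks whether a string is lexically an identifier, so a few edge cases (such as `_`, which cannot be declared with `:=` on its own) would behave differently in a real build.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// isValidName is a hypothetical helper (not from the original test).
// It strips spaces, as the post's test did, and then asks go/token
// whether what remains is a Go identifier. Keywords are rejected.
func isValidName(s string) bool {
	name := strings.ReplaceAll(s, " ", "")
	return token.IsIdentifier(name)
}

func main() {
	candidates := []string{
		"nil", "true", "false", // predeclared identifiers, not keywords
		"func",                   // a real keyword
		"田中さんにあげて下さい", // Unicode letters are allowed
		"1; DROP TABLE users",    // starts with a digit, contains ';'
	}
	for _, c := range candidates {
		fmt.Printf("%q identifier=%t keyword=%t\n",
			c, isValidName(c), token.IsKeyword(c))
	}
}
```

Swapping `"func"` in for `nil` in the program above would fail to compile, since keywords are reserved in a way the predeclared names are not.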