Updated unicode post with naughty strings test
parent 9066d93dc8
commit 5ae4a3d034
@@ -5,6 +5,8 @@ date: 2020-05-27
tags: golang
summary: "With Go being a relatively modern programming language, first released in 2009, it is not surprising that it has great support for Unicode strings. What is surprising is just how far this support goes."
---

_Updated 2020-05-28: Added big list of naughty strings test_

With Go being a relatively modern programming language, first released in 2009, it is not surprising that it has great support for Unicode strings. What is surprising is just how far this support goes.

@@ -126,3 +128,61 @@ print(𝕧𝕒𝕣𝕚𝕒𝕓𝕝𝕖)
## In the wild

I've tried to find other examples of non-Latin Unicode characters being used in real code but, other than Gomega, have so far come up empty. I had assumed there'd be code written in Russian or Chinese that made use of this, but I can't seem to find any. Perhaps mixing native-language variable and function names with the English built-in library functions isn't such a desirable outcome.

## Update

After posting this, it was suggested that I try the [big list of naughty strings](https://github.com/minimaxir/big-list-of-naughty-strings) to see how many of them Go can handle. This list is a collection of strings that often cause problems for programs in one way or another.

I put together a [fairly simple test case](https://share.cluster.fun/golangnaughtystringstest.js) that used each string as a variable name and then tested whether the code would build. So that as many strings from the list as possible could be attempted, I removed all spaces from the strings.

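
A rough way to try this kind of check without invoking the compiler is the standard library's `go/token.IsIdentifier` (available since Go 1.13). The sketch below is not the linked test case: `isValidName` and the sample inputs are made up for illustration, and it only answers the syntactic question of whether a string is an identifier, which may not match a full build in every case.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// isValidName is a hypothetical helper. It strips spaces, as the post
// describes, then reports whether what is left is syntactically a Go
// identifier (letters, digits and underscores, not starting with a digit,
// and not a keyword).
func isValidName(s string) bool {
	return token.IsIdentifier(strings.ReplaceAll(s, " ", ""))
}

func main() {
	// Illustrative inputs only; they are not copied from the naughty strings list.
	samples := []string{"nil", "田中さんにあげて下さい", "func", "1/0", "hello world"}
	for _, s := range samples {
		fmt.Printf("%q valid=%t\n", s, isValidName(s))
	}
}
```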

The results were a bit surprising...

> 72 of the 506 strings are valid variable names in Go

(Note: This number may be higher than it should be due to removing spaces from strings)

Of those 72 valid strings, some are what we'd expect, similar to those covered above:

* `ﷺ`
* `𝕿𝖍𝖊𝖖𝖚𝖎𝖈𝖐𝖇𝖗𝖔𝖜𝖓𝖋𝖔𝖝𝖏𝖚𝖒𝖕𝖘𝖔𝖛𝖊𝖗𝖙𝖍𝖊𝖑𝖆𝖟𝖞𝖉𝖔𝖌`
* `田中さんにあげて下さい`

But there are a few that are really surprising:

* `nil`
* `true`
* `false`

So, it turns out this is a perfectly valid Go program:

```go
package main

import "fmt"

func main() {
	// nil and false are predeclared identifiers, not keywords,
	// so they can be redeclared as local variables.
	nil := "Not a value"
	false := 55

	// true is the function declared below, not the built-in constant.
	if !true() {
		fmt.Println(false)
	}

	fmt.Println(nil)
}

func true() bool {
	// main's local false is out of scope here, so this is the built-in false.
	return false
}
```

When run, this outputs:

```
55
Not a value
```
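
The reason this compiles, as far as the language spec goes, is that `nil`, `true` and `false` are predeclared identifiers rather than keywords, so an inner scope is free to redeclare them. A minimal sketch of how that shadowing is scoped (the `shadowed` function is a hypothetical name used only here):

```go
package main

import "fmt"

// shadowed redeclares true inside a nested block and reports what the
// name evaluates to inside and outside that block.
func shadowed() (inner, outer bool) {
	{
		true := false // new local variable named true, visible only in this block
		inner = true  // reads the local variable, so inner is false
	}
	outer = true // back to the predeclared constant, so outer is true
	return inner, outer
}

func main() {
	inner, outer := shadowed()
	fmt.Println(inner, outer) // prints "false true"
}
```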
Please, please, never do this in your code.