More on first character start vertical patterns within the Voynich

By Sherri Mastrangelo, 29 September 2025.

This is a follow-up to: “The 490 - a pattern?”

In the previous post, I shared samples of vertical patterns within the starting characters of some paragraphs.

I analyzed many of the text-heavy pages: folios 1 - 57r, 58r, 58v, 65v - 66v, 75r - 87v, 90r - 96v, and 103r - 116r. This ignores the wheel on 57v, the zodiac section, some of the other wheels and foldouts, and the heavily labeled root section. In total I used 174 pages, with 3,784 lines of text (note that number for my later probability equations). When a folio had a separate vertical column in front of the text paragraphs, such as 76r, that column was ignored.

I identified 18 unique starting characters for all the lines.

The longest string of characters that repeats elsewhere is 7 characters, with two different pairs of seven-character strings. The first pair can be found on 14r and 104r, and the second pair on 105r and 106r, as pictured below:

The next highest string length is 6 characters, with 22 different matching pairs found, though with a heavy amount of overlap between sets. For example, the string of seven above on 14r also includes the string of six found on 15r, “0 9 0 4 9 0”, as well as the string of five “9 0 4 9 0” counted in the data.
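For anyone who wants to repeat this kind of search, here is a minimal sketch of how repeated vertical strings could be found automatically. It assumes each page has already been reduced to a list of first characters, one per line, and that a string never runs across a page break. The function name `repeated_windows` and the toy pages are hypothetical, made up for illustration, and are not my transcription.

```python
from collections import defaultdict

def repeated_windows(pages, k):
    """Return every k-character vertical string that occurs more than once.

    pages: dict mapping a folio label to the list of first characters
           on that page, one per line, top to bottom.
    Result: dict mapping each repeated string (as a tuple) to a list of
            (folio, starting line index) locations.
    """
    seen = defaultdict(list)
    for folio, column in pages.items():
        # Slide a window of k lines down the page's first column.
        for start in range(len(column) - k + 1):
            window = tuple(column[start:start + k])
            seen[window].append((folio, start))
    # Keep only the strings found in more than one place.
    return {w: locs for w, locs in seen.items() if len(locs) > 1}

# Hypothetical toy data, NOT taken from the manuscript.
toy_pages = {
    "A": list("0904902"),
    "B": list("4090490"),
    "C": list("2849"),
}

hits = repeated_windows(toy_pages, 5)
for window, locations in hits.items():
    print(" ".join(window), locations)
```

In the toy data, one shared string of six ("0 9 0 4 9 0") shows up as two overlapping strings of five, which is the same kind of overlap described above.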

Of the 174 pages I analyzed, I found vertical patterns of 5 or more characters among the first characters on 88 pages, or a little over 50% of the pages. Again, with overlap. Here’s a sample:

You’ll notice I prefer not to work with EVA characters. I also made judgements about the characters, as some are more obvious than others. The intent was to look for patterns.

So does any of this mean anything? Are repeating patterns a natural consequence of having so many similar “words” in the manuscript - or at least so many similar prefixes? Or of a cipher? Or perhaps a lazy scribe, copying from previous pages as he made up the text?

Let’s look to probability equations - especially the Poisson distribution (same birthday) model. This is where ChatGPT+ comes in clutch. For this, my first prompt was: “Assume a manuscript of 3,784 lines of text is in a grid of various column length. Let’s look at the first column, which has 18 unique possible characters. How likely is it for a vertical pattern of seven characters to have an identical seven string pattern elsewhere in the column?”

This gave me a 1.16% chance, or about 1 in 86, that at least one 7-character vertical sequence repeats. Seems pretty reasonable actually. So how likely are two different pairs of seven string length, as shown in the image above?

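That’s about 0.0067%, or 1 in 15,000, which suggests the starting characters are not random nonsense, at least compared with a model where all 18 characters are equally likely and independent of each other.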
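Both figures can be checked with a short birthday-style estimate. This is a sketch under stated assumptions, not necessarily the exact calculation ChatGPT ran: it treats all 3,784 lines as one continuous column, assumes each of the 18 characters is equally likely and independent, and uses a Poisson approximation for the number of matching pairs. The function name `repeat_probabilities` is my own.

```python
import math

def repeat_probabilities(n_lines=3784, alphabet=18, k=7):
    """Poisson (birthday-style) estimate for repeated k-character strings.

    Assumes one continuous column of n_lines characters, each drawn
    uniformly and independently from `alphabet` symbols.
    """
    windows = n_lines - k + 1                 # possible starting positions
    pairs = windows * (windows - 1) // 2      # pairs of positions to compare
    lam = pairs / alphabet ** k               # expected number of matching pairs
    p_at_least_one = 1 - math.exp(-lam)
    p_at_least_two = 1 - math.exp(-lam) * (1 + lam)
    return lam, p_at_least_one, p_at_least_two

lam, p_one, p_two = repeat_probabilities()
print(f"expected matching pairs: {lam:.5f}")
print(f"at least one repeat:  {p_one:.2%}  (about 1 in {1 / p_one:.0f})")
print(f"at least two repeats: {p_two:.4%}  (about 1 in {1 / p_two:,.0f})")
```

Under these assumptions the estimate lands on roughly 1.16% for one repeated pair and roughly 0.0067% for two. Changing `k` or `alphabet` shows how quickly the odds move, and a less even spread of starting characters would make repeats more likely than this model predicts.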

Here’s what that can mean:

So “not random”. Which could mean a language, or a cipher. Or could still mean a scribe or two copying from other pages. It really doesn’t tell us much, does it? What do you think?
