How Pure Bible Search Performs Its Searches
If you’ve used other Bible search programs, whether commercial or
free/open-source, you’ve probably discovered that the search method of King
James Pure Bible Search is very different.
Most Bible search programs treat the text as a large blob of unknown text, as if it
were some arbitrary, even changing, document. They require you to type in a
word or phrase, click ‘search’, and then wait while the program scans the entire
database for it. Along the way they employ special matching algorithms to match
similar things, because they assume you’ve probably typed the phrase wrong: the
program itself doesn’t know what’s in the text and what isn’t.
But in reality, the King James Bible text is neither unknown nor arbitrarily
changing. King James Pure Bible Search takes advantage of this and treats the
text as a known list of words, indexed by their exact position and word forms.
For the character set, King James Pure Bible Search treats all letters, the hyphen,
and the apostrophe as characters that can compose a word. It treats regular
Arabic numerals the same way, but the King James Bible doesn’t have any numbers
written as numerals; they are all written as words. It’s the “Holy Word of God”,
not the “Holy Numbers”.
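The character-set rule above can be illustrated with a short sketch. This is not the program's actual code: the names WORD_RE and split_words are hypothetical, and the sketch assumes plain ASCII letters, digits, hyphen, and apostrophe, with everything else acting as a separator.

```python
import re

# Assumption: a word is a run of ASCII letters, digits, hyphens, and
# apostrophes; spaces and all other punctuation separate words.
WORD_RE = re.compile(r"[A-Za-z0-9'-]+")


def split_words(text):
    """Return the words of `text` in reading order."""
    return WORD_RE.findall(text)


words = split_words("In the beginning God created the heaven and the earth.")
print(words)
```

Under this rule a hyphenated name such as Beth-el or a possessive such as LORD's stays a single word, while the trailing period of a verse is dropped.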
Using this character set, all 12838(*) unique words (excluding case) were
extracted. A concordance was created mapping each word to its exact position in
the text. For example, using Genesis 1:1, we have the following unique words and
index mappings:
and : 8
beginning : 3
created : 5
earth : 10
God : 4
heaven : 7
In : 1
the : 2, 6, 9
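As a rough illustration, the Genesis 1:1 mapping above can be rebuilt in a few lines. This is a minimal sketch, not the program's own implementation; build_concordance is a hypothetical name, and positions are assumed to be 1-based as in the listing.

```python
from collections import defaultdict


def build_concordance(words):
    """Map each word to the 1-based positions where it occurs."""
    concordance = defaultdict(list)
    for position, word in enumerate(words, start=1):
        concordance[word].append(position)
    return dict(concordance)


genesis_1_1 = "In the beginning God created the heaven and the earth".split()
concordance = build_concordance(genesis_1_1)

# Print the entries alphabetically, ignoring case, like the listing above.
for word in sorted(concordance, key=str.lower):
    print(word, ":", ", ".join(map(str, concordance[word])))
```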
This is what is stored in the database, except that words differing only in case
are combined into a single entry. This minimizes the number of comparisons:
words are first matched in lowercase, and their original case is compared
only if the user happens to be doing a case-sensitive search.
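One way to picture this case handling is sketched below. The storage layout is an assumption made for illustration (each lowercase key keeps its positions together with the original spelling), and build_folded_concordance and find are hypothetical names, not the program's API.

```python
from collections import defaultdict


def build_folded_concordance(words):
    """Map each lowercase word to a list of (position, original spelling)."""
    folded = defaultdict(list)
    for position, word in enumerate(words, start=1):
        folded[word.lower()].append((position, word))
    return dict(folded)


def find(folded, query, case_sensitive=False):
    """Look up by lowercase first; compare exact case only when requested."""
    hits = folded.get(query.lower(), [])
    if not case_sensitive:
        return [position for position, _ in hits]
    return [position for position, original in hits if original == query]


# Genesis 1:1 followed by the opening of 1:2, with punctuation removed.
text = ("In the beginning God created the heaven and the earth "
        "And the earth was without form and void")
folded = build_folded_concordance(text.split())

print(find(folded, "and"))                       # [8, 11, 17]
print(find(folded, "and", case_sensitive=True))  # [8, 17]
print(find(folded, "And", case_sensitive=True))  # [11]
```

The exact-case comparison runs only over the few entries that already matched in lowercase, which is the saving the paragraph above describes.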
As the database is loaded into memory, an inverse table is created mapping
position back to individual words:
1: in
2: the
3: beginning
4: god
5: created