
Reading and Transcribing Parish Registers.


There are two key aspects to converting parish register information into a record for FreeREG. The first is how to read the register and the second is how to transcribe the information for FreeREG.

These matters are not as straightforward as they might appear at first glance. Firstly, the registers are in most cases several hundred years old and have not been kept in the best of environments, so they are often faded, covered in water marks and occasionally damaged. Secondly, they will have been photographed, for few of us get to actually handle the original documents today. The process of photography has introduced issues of focus and alignment that make reading more difficult. (The registers are usually small bound documents that do not want to lie flat for their pictures to be taken!) Thirdly, the original records were written with different instruments, usually a quill pen of some form, which gives distinct characteristics to the letters and words. Fourthly, the education and writing skills of the vicar or other person writing the record vary significantly; sometimes there will be good penmanship, other times one wonders. Lastly, the alphabet and the graphics used for that alphabet are different from those in use today and have changed over the past 500 years. All of these factors mean that there will be significant uncertainties in what you see and can interpret from a register.

As a result, we have developed a set of transcription rules designed to let you convey the maximum amount of information into the record and to allow the search engine to extract the best solution from the information base. (Just one word of caution: the search engine is not currently able to make use of all of the transcription rules, but it does remain under development.) So let's explore these two aspects in reverse order.

Transcription Rules.

For those of you who have transcribed for other projects, the first and last rule for transcribing is well known. You transcribe what you read, errors and all! It is to be left to the researcher to make a correction or adjustment for his or her own purpose. Your job, as job one, is to tell it as it is; nothing more, nothing less. (It is worth noting that this is a change from the original specification for FreeREG, where transcribers were encouraged to convert the old texts into modern usage.) In the next section, on reading the registers, we will try to assist you in extracting the maximum amount of useful information.

Uncertain Character Format (UCF)

Some common types of uncertainty that you are likely to encounter in your first few batches of transcription, and the technique to use for each of them, are given in the table below. The section after the table describes each of the formats.

Uncertainty                                | Which Uncertain Character Format to Use
Can't tell if it's an l or a t             | Use the [lt] style of UCF
Can tell how many letters I can't read     | Use the _ style of UCF, one _ for each letter
I think I can read the letter              | Use the [x_] style of UCF, where x is what you think the letter is
It's 2 or 3 letters I can't read           | Use the _{2,3} style of UCF
Don't know how many letters I can't read   | Use the * style of UCF
Not sure if that's a letter or an ink blob | Use the _{0,1} style of UCF
Not sure of the word transcribed           | Use the ? style of UCF

_ (Underscore): A single uncertain character. It could be anything, but it is definitely one character. Repeat it once for each uncertain character.
* (Asterisk): Several adjacent uncertain characters. A single * is used when there are 1 or more adjacent uncertain characters. It is not used immediately before or after a _ or another *.
    Note: If it is clear there is a space, then * * is used to represent 2 words, neither of which can be read.
[abc]: A single character that could be any one of the contained characters and only those characters. There must be at least two characters between the brackets.
    For example, [79] would mean either a 7 or a 9, whereas [C_] would mean a C or some other character.
{min,max}: Repeat count. The preceding character occurs somewhere between min and max times. max may be omitted, meaning there is no upper limit. So _{1,} is equivalent to *, and _{0,1} means that it is unclear whether there is any character at all.
?: Sometimes all of the characters have been read but you remain uncertain of the word. In this case append a ? to the end of the word, e.g. RACHARD? The most frequent place where a ? is used is in transcriptions that have been donated from other systems and are being converted for entry into FreeREG.

Note: The _{min,max} format is more precise, but using a single * is preferable to spending a long time trying to decide which min and max values to use.
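For those who like to check their work mechanically, the rules described above can be sketched as a small Python check. This is only an illustration and is not part of FreeREG: the function name ucf_problems is made up for this example, and the requirement that min be no larger than max is our assumption rather than a stated rule.

```python
import re


def ucf_problems(text):
    """Return a list of rule violations found in a UCF string (empty if none)."""
    problems = []

    # Rule: * is not used immediately before or after a _ or another *.
    if re.search(r"\*[_*]|_\*", text):
        problems.append("* placed directly next to _ or another *")

    # Rule: [abc] must contain at least two characters.
    for members in re.findall(r"\[([^\]]*)\]", text):
        if len(members) < 2:
            problems.append("[" + members + "] has fewer than two characters")

    # Rule: {min,max} needs a numeric min; max may be omitted.
    for body in re.findall(r"\{([^}]*)\}", text):
        match = re.fullmatch(r"(\d+),(\d*)", body)
        if match is None:
            problems.append("{" + body + "} is not of the form {min,max}")
        elif match.group(2) and int(match.group(1)) > int(match.group(2)):
            # Assumption (not stated in the rules): min should not exceed max.
            problems.append("{" + body + "} has min greater than max")

    return problems


print(ucf_problems("SMI[lt]H"))  # no problems reported
print(ucf_problems("W**"))       # two asterisks side by side
print(ucf_problems("* *"))       # two unreadable words: allowed
```

An empty list means that none of the three checks found anything to complain about; it does not mean the transcription itself is right.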

Technical note: Although UCF has many similarities to regular expressions (e.g. Perl, Unix), it is not identical, and in particular there is no escape mechanism.

Reading a Register

Your first reaction on looking at a register, especially one of the older ones, may well be to ask yourself, "How am I ever going to make sense of this?" Your second reaction may be to ask yourself, "Why am I doing this?" Your third reaction might be to throw up your hands and walk away. Please don't. You are engaged in one of the most important activities designed to help all of us research our forebears. So please bear with it. In the following sections we will try to help you make sense of what you see. After a while you will come to recognize that old writing and surprise yourself at how good you have become. Also, don't forget that you can use the Uncertain Character Format to deal with the problem entries and move on.

The Alphabet and its Graphical Representation

One of the biggest issues is how to read 16th century writing. I am no expert, but there are several resources on the internet that you may want to have a look at.

National Archives Tutorial
Scottish Handwriting
Genealogy Handwriting
Old Handwriting

In the following paragraphs I will highlight some of the issues as I see them. You may then want to go and have a look at some of those other resources. The following table gives an excellent rendition of the early alphabet and how people of different backgrounds wrote their text.


The first important thing to notice is that there were no separate characters for u and v. From the 1630s onwards, printers started to use the u letter-form (or 'graph') to denote the vowel, and the v graph to denote the consonant. Before this time there was only one recognized letter of the alphabet, which could be written or printed in two ways. This is why the letter w is called 'double-u' and not 'double-v'. Printers before the 1630s used v initially (at the start of a word) and u medially. Practice in manuscript was never this consistent, with u and v graphs being used for both consonant and vowel, both initially and medially. Ambiguities caused by this system can make life difficult. It is important that you don't lose information by deciding too soon whether a u or v graph you encounter is the vowel or the consonant. Your job in transcribing is to report exactly what is there in the register, so u and v forms must be distinguished from each other where possible and not silently or unconsciously brought into line with modern practice.

A different case is the letters i and j. As late as the nineteenth century, some still insisted that j was just a variant form of the letter i, which could represent both a vowel and a consonant. But many tried to use the j form for a consonant and the i for a vowel. You may even find j suffixed to a name, such as 'Walterj'. Again, your job is to record what you see, which will in most cases be a letter i or j.

The 's' is especially problematic. It has both long 's' and short 's' forms. The long 's' is usually clear at the start of a word, e.g. (Samuel). But don't get the long 's' and 'f' mixed up within a word, e.g. (Bush) and (Harrison). Normally the 'f' will have a cross stroke, even if it's hardly noticeable, and the context will make it clear whether it is a long 's' or an 'f'.

The terminal 's' tends to fall between the two forms. See for example, , and . Also look at the capital 'H' in the last case.

Within a word the double 's' is written with a long 's' followed by a short 's', looking like an 'fs'.

Remember that in secretary hand the lowercase 'c' looks exactly like a modern day 'r'.

The lower case 'e' tends to not have a central stroke, so can look more like a 'c', or an 'o' if it is biting with the next letter. See for example,

Also note the use of double 'f' which is a capital 'F'. See for example, and . It would be easy to mistake these as a modern 'H'.

There are two forms of lower case 'r', the '2' shaped one which occurs after 'o', and the long 'r' which descends below the line. The long 'r' can consist of no more than a single down stroke, with no horizontal stroke at all. This can make it quite hard to distinguish, particularly when combined with a preceding 'e'.

Note the use of 'es' for the genitive, rather than an apostrophe and 's'; for example, 'kinges'. It may look like there is an apostrophe after the 'e', but what you can see is actually part of the letter 'e', called a 'horn'. See for example, (Reynoldes). Note also the nature of the capital 'R' in this example.

The abbreviation sign means that characters have been omitted. It is a dash over the preceding vowel(s), and the context will make it clear which letter(s) are missing. See for example, and . Note also the nature of the capital 'R' in both these examples. In these cases the missing letters are inserted in the transcription. (This is a deviation from normal practice caused by our use of the Uncertain Character Format.) It is advisable, however, to put the original entry, without the abbreviation sign, in the notes.

Hints and Tips.

  1. Please do not put comments such as "illegible", "unreadable", etc. in the Notes field to indicate unclear characters.
  2. Don't forget that possibly the most important information is Surname, Forenames, and Date. So place your effort there and not in the other fields. With as much of this information as possible, many researchers can access a fiche quickly to make a judgement on other uncertain data themselves.
  3. Don't spend too long trying to puzzle out or guess a troublesome entry. Keep in mind that we have several hundred million rows to enter. Better to flag the character or field as uncertain, and go on to enter the next!
  4. Similarly, please don't aim for perfection and then decide to give up the data input as too hard after only a few pages. Do the best you can, and let our data validation and future users correct any mistakes. Don't forget that the errors you catch yourself making will be only a small part of the errors you actually make. We are but human, and even the source data has mistakes.
Most transcribing problems will be solved by a visit to our Transcribers' Knowledge Base. There is also a page of Frequently Asked Questions about the project as a whole.

© 1998-2015 The website, its layout, search engine, and database are copyright by the Trustees of FreeBMD, a charity registered in England and Wales, Number 1096940.