There are many different metrics for measuring text complexity. The two I find most interesting are Lexiles and ATOS Book Levels, because there are online tools that give these measures for many popular children's books. (The Lexile scheme seems to have better coverage of American books and the ATOS one of British books.)
These metrics are largely based on vocabulary and simple measures like sentence length, and don't take into account the age appropriateness of the content or the length of the book. The ATOS rater does give a separate "Reading Interest" age range and a word count, though, and Lexiles include markers such as AD (usually read by adults to children), NC (complex but aimed at younger children) and HL (simple but aimed at older children).
Here are ATOS Book Levels and Lexiles for some books Helen has read or I've read to her:
- The Owl Who Was Afraid of the Dark - BL 3.6 - 550L
- The Cat Mummy - BL 4.1
- Fortunately the Milk - BL 4.3 - 680L
- Ottoline and the Yellow Cat - BL 4.3 - 760L
- Ice Bear (Davies) - BL 4.4 - AD 800L
- Esio Trot - BL 4.4 - 840L
- A Bear Called Paddington - BL 4.7 - 750L
- Matilda - BL 5.0 - 840L
- Swallows and Amazons - BL 5.1 - 800L
- Adventures of Achilles (Lupton/Morden) - BL 5.2 - 770L
- Finn Family Moomintroll - BL 5.2 - AD 770L
- Princess Mirror-Belle and the Magic Shoes - BL 5.3 - 770L
- The Children of Green Knowe - BL 5.3 - 880L
- The Worst Witch Strikes Again - BL 5.8 - 1000L
- She Persisted Around the World - BL 5.9 - NC 1080L
- Poo: A Natural History of the Unmentionable - BL 6.1 - NC 1180L
- The Story of Money (Jenkins) - BL 6.4
- The Hobbit - BL 6.6 - 1000L
You can see that the two metrics offer different orders for some books. As another illustration of this, Harry Potter and the Goblet of Fire and A Wizard of Earthsea are both BL 6.7, but the first is 880L and the second 1150L. There are also some anomalous results where complexity doesn't match age appropriateness at all. Swallows and Amazons and Matilda may have similarly complex language, but the former is much slower-paced. I'm not convinced the language in the Princess Mirror-Belle books is anywhere near as complex as that in The Children of Green Knowe. And none of Ice Bear, She Persisted, Poo and The Story of Money seems to me as difficult as its rating suggests. (Perhaps this is because the whole point of non-fiction books is often to explain new concepts and the vocabulary associated with them, and coming across a new word along with an explanation of its meaning is not at all like coming across a new word mid-narrative.)
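For anyone who wants to see exactly where the two orderings disagree, here is a rough Python sketch that lists every pair of books from the list above where one book has the lower Book Level but the higher Lexile. The `Book` type and the `discordant_pairs` function are my own illustrative names, not part of either rating tool. It uses only the books that have both ratings, drops the AD and NC prefixes, and ignores ties, all of which are simplifying assumptions.

```python
from itertools import permutations
from typing import NamedTuple


class Book(NamedTuple):
    title: str
    book_level: float  # ATOS Book Level
    lexile: int        # Lexile measure, with any AD/NC prefix dropped (assumption)


# Only the books from the list above that have both ratings.
BOOKS = [
    Book("The Owl Who Was Afraid of the Dark", 3.6, 550),
    Book("Fortunately the Milk", 4.3, 680),
    Book("Ottoline and the Yellow Cat", 4.3, 760),
    Book("Ice Bear", 4.4, 800),
    Book("Esio Trot", 4.4, 840),
    Book("A Bear Called Paddington", 4.7, 750),
    Book("Matilda", 5.0, 840),
    Book("Swallows and Amazons", 5.1, 800),
    Book("Adventures of Achilles", 5.2, 770),
    Book("Finn Family Moomintroll", 5.2, 770),
    Book("Princess Mirror-Belle and the Magic Shoes", 5.3, 770),
    Book("The Children of Green Knowe", 5.3, 880),
    Book("The Worst Witch Strikes Again", 5.8, 1000),
    Book("She Persisted Around the World", 5.9, 1080),
    Book("Poo: A Natural History of the Unmentionable", 6.1, 1180),
    Book("The Hobbit", 6.6, 1000),
]


def discordant_pairs(books):
    """Return (a, b) title pairs where a has the lower Book Level but the higher Lexile.

    Pairs that tie on either metric are skipped.
    """
    return [
        (a.title, b.title)
        for a, b in permutations(books, 2)
        if a.book_level < b.book_level and a.lexile > b.lexile
    ]


if __name__ == "__main__":
    for lower_bl, higher_bl in discordant_pairs(BOOKS):
        print(f"{lower_bl}: lower Book Level but higher Lexile than {higher_bl}")
```

Esio Trot and A Bear Called Paddington are one such pair (BL 4.4 against 4.7, but 840L against 750L). A list like this only shows where the metrics disagree; it says nothing about which of them is closer to the truth.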
But the numbers do confirm some of my intuitions. Popular children's books from a century ago are often more difficult than one expects: The Call of the Wild (BL 8.0 - 1120L), The Wind in the Willows (BL 8.4 - 1140L) and Treasure Island (BL 8.3 - 1070L), for example. This may, along with copyrights having lapsed, explain why they are so often abridged.
In any event, these tools should be used with caution, perhaps for comparing books in the same genre and style, or for working out which of an author's books it might be best to start with. They may help parents (or teachers) who are trying to vary what they offer children, perhaps supplementing graded readers, but they can't replace a librarian or an experienced teacher. Apart from any other concerns, these metrics offer no guide to quality!