Lab: International Standard Book Numbers
This lab is adapted from an assignment by Peter-Michael Osera, Samuel Rebelsky, and Nicole Eikmeier at Grinnell College.
Reminder: you and your partner are a team! You should not move on to the next activity until you are both comfortable with the previous one.
Introduction and Setup
In this lab activity, we’ll build on our skills with strings and functions, and combine them with new skills related to loops and lists. Along the way, we’ll introduce the map-filter-reduce paradigm for handling sets of data.
International Standard Book Numbers
Our task this week is to determine the validity of an International Standard Book Number (ISBN). An ISBN is a unique number that identifies a book. You may have seen these numbers inside or on the back of many of the books you read. ISBNs often come with barcodes that make it easy for librarians to scan them.
Anatomy of an ISBN
A 10-digit ISBN has two parts:
- 9 identifier digits, each of which is a digit from 0 to 9.
- 1 check digit. The purpose of the check digit is to confirm the accuracy of the identifier digits. If the check digit does not match the identifier digits, then this may indicate a typo in the 9 identifier digits. The check digit can be either a digit between 0 and 9 or the letter X.
For example, consider the ISBN `1438946685`. In this ISBN, the first 9 digits `143894668` contain information about the book, while the last digit `5` is the check digit. Another possible ISBN is `486910583X`, since the last digit is allowed to be an `X`.
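The two-part structure above can be checked in code. Here is a minimal sketch, assuming Python (which the `%` syntax and list notation used later in this lab suggest). The function name `has_isbn_format` is a hypothetical helper, not something the lab defines. It only checks the shape of the string, not whether the check digit is correct.

```python
def has_isbn_format(text):
    """Return True if text has the shape of a 10-digit ISBN."""
    digits = "0123456789"
    if len(text) != 10:
        return False
    # The first 9 characters must all be digits from 0 to 9.
    for ch in text[:9]:
        if ch not in digits:
            return False
    # The last character may be a digit or the letter X.
    return text[9] in digits + "X"


print(has_isbn_format("1438946685"))  # True
print(has_isbn_format("486910583X"))  # True
print(has_isbn_format("hello"))       # False
```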
The Check Digit
The final digit of the ISBN is the check digit. The check digit is calculated according to a specific algorithm, based on the first 9 digits. Its purpose is to guard against possible typos when recording ISBNs. If the check digit doesn’t match the first 9 digits, then it’s very likely that someone made a typo when recording the ISBN. In this case, we say that the ISBN is corrupted, and it likely needs to be checked and corrected.
Here’s how check digits are calculated. Suppose that the first 9 digits of an ISBN are `143894668`.
- In left-to-right order, multiply the first digit by 10, the second digit by 9, the third digit by 8, and so on. The rightmost digit is multiplied by 2. We then add up those terms, obtaining a sum that I’ll call `s`. In our example, we would do:
  s = 10×1 + 9×4 + 8×3 + 7×8 + 6×9 + 5×4 + 4×6 + 3×6 + 2×8 = 258
- Next, we take this sum `s` and compute `s` modulo 11. Recall that modulo computes the remainder of the division of two numbers. You can compute `s` modulo 11 using the syntax `s % 11`. In our example, `258 % 11 = 5`.
- We then subtract this result from 11 to obtain a result `d`:
  d = 11 - 5 = 6
- Almost there!
  - If `d == 11`, then the `check_digit` is `0`.
  - If `d == 10`, then the `check_digit` is `X`.
  - Otherwise, the `check_digit` is `d`.
In our case, `d = 6`, and so `check_digit` has value `6`. So, the full ISBN would be `1438946686`, with `6` being the `check_digit`.
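The steps above can be sketched as a short Python function. The name `compute_check_digit` and the choice to return the check digit as a string (so that `"X"` and `"6"` have the same type) are assumptions made for this illustration. The function also assumes its input is already a string of exactly 9 digits.

```python
def compute_check_digit(first_nine):
    """Return the check digit, as a string, for a string of 9 digits."""
    s = 0
    weight = 10  # the first digit is multiplied by 10, the last by 2
    for ch in first_nine:
        s += weight * int(ch)
        weight -= 1

    d = 11 - (s % 11)
    if d == 11:
        return "0"
    if d == 10:
        return "X"
    return str(d)


print(compute_check_digit("143894668"))  # 6
```

Running it on `"143894668"` reproduces the worked example: the sum is 258, `258 % 11` is 5, and `11 - 5` is 6.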
Suppose now that we’re given a string like `"203740588X"`. We’d like to figure out whether this string is a correct ISBN, a corrupted ISBN (check digit doesn’t match), or not an ISBN at all.
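One possible way to make this three-way decision is sketched below in Python. The function name `classify_isbn` and the label strings `"valid"`, `"corrupted"`, and `"not an ISBN"` are hypothetical choices for this example. Your own solution may organize the work differently, for instance by splitting it into smaller helper functions.

```python
def classify_isbn(text):
    """Return "valid", "corrupted", or "not an ISBN" for a string."""
    digits = "0123456789"

    # Format check: 9 digits followed by a digit or the letter X.
    if len(text) != 10:
        return "not an ISBN"
    for ch in text[:9]:
        if ch not in digits:
            return "not an ISBN"
    if text[9] not in digits + "X":
        return "not an ISBN"

    # Weighted sum of the identifier digits, with weights 10, 9, ..., 2.
    s = 0
    for i in range(9):
        s += (10 - i) * int(text[i])
    d = 11 - (s % 11)

    # Turn d into the expected check digit.
    if d == 11:
        expected = "0"
    elif d == 10:
        expected = "X"
    else:
        expected = str(d)

    if text[9] == expected:
        return "valid"
    return "corrupted"


print(classify_isbn("203740588X"))  # valid
print(classify_isbn("2037405886"))  # corrupted
print(classify_isbn("hello"))       # not an ISBN
```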
Processing Batches of Data
One of the main powers of computation is to process batches of data. In this part of the lab activity, we’ll practice using the map-filter-reduce paradigm to process data. Put simply:
- Map operations take lists of data and produce lists of the same length containing new information.
- Filter operations take lists of data and produce smaller lists, based on some criterion.
- Reduce operations summarize lists of data into a single value.
Suppose now that you are a librarian and you have received a large batch of ISBNs in a data file. However, you’re not sure which ones are valid! Some of the data are actually not ISBNs at all, while others are ISBNs that may be corrupted in some way. Here’s an example of your input data:
["1438946686",
"5638723730",
"203740588X",
"2037405886",
"hello",
"563ZXY3730",
"ISB NUMBER",
...
]
Some of these are valid ISBNs, some are corrupted ISBNs with incorrect check digits, and others are not ISBNs at all.
Map Operations
A map operation consists of applying the same function to each entry of a list, resulting in a new list of the same length. Let’s write a map operation to classify each ISBN in a list.
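As a rough illustration of the pattern, here is a sketch of a map operation written with a `for` loop in Python. Both `map_list` and `has_ten_characters` are hypothetical names. The second is only a simple stand-in for a full ISBN classifier, so that the example stays short.

```python
def has_ten_characters(text):
    # Hypothetical stand-in for a full ISBN classifier.
    return len(text) == 10


def map_list(function, items):
    """Apply function to every item and collect the results in a new list."""
    result = []
    for item in items:
        result.append(function(item))
    return result


batch = ["1438946686", "hello", "203740588X"]
labels = map_list(has_ten_characters, batch)
print(labels)  # [True, False, True]
```

Note that the output list has exactly one entry per input entry. The same idea can also be written as a list comprehension, `[has_ten_characters(item) for item in batch]`.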
Filter Operations
A filter operation takes a list as input and creates a new list by choosing some items from that list according to a criterion. The easiest way to do this is with an `if` statement. Filter operations almost always result in lists that are smaller than their original inputs.
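Here is a minimal Python sketch of a filter operation built from a loop and an `if` statement. The function name `keep_ten_character_strings` and its criterion (keep strings of length 10) are assumptions chosen to keep the example small. In this lab, the criterion would more likely be whether a string is a valid ISBN.

```python
def keep_ten_character_strings(items):
    """Return a new list containing only the strings of length 10."""
    kept = []
    for item in items:
        if len(item) == 10:  # the criterion
            kept.append(item)
    return kept


batch = ["1438946686", "hello", "563ZXY3730", "ISB NUMBER"]
print(keep_ten_character_strings(batch))
# ['1438946686', '563ZXY3730', 'ISB NUMBER']
```

The original list is left unchanged, and the result is shorter because `"hello"` fails the criterion.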
Reduce Operations
A reduce operation takes a list as input and produces a single value, such as a number or string. A simple example of a reduce operation is computing the length of a list.
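The sketch below shows two small reduce operations in Python: computing the length of a list by hand, and counting how many entries equal a given value. The names `count_items` and `count_matching`, and the label strings in the sample list, are hypothetical and only serve this illustration.

```python
def count_items(items):
    """Compute the length of a list with a loop (same result as len)."""
    count = 0
    for _ in items:
        count += 1
    return count


def count_matching(items, target):
    """Count how many items are equal to target."""
    count = 0
    for item in items:
        if item == target:
            count += 1
    return count


labels = ["valid", "corrupted", "valid", "not an ISBN"]
print(count_items(labels))              # 4
print(count_matching(labels, "valid"))  # 2
```

In both functions, a single running value (`count`) is updated as the loop visits each item, which is what turns a whole list into one summary value.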
© Philip Claude Caplan, Andrea Vaccari, and Phil Chodrow, 2022