Product Classification in the Retail Industry
We encounter barcodes almost everywhere we go. They are on our mail, on books, on our driver’s licenses, on coupons and loyalty cards, and on the things we purchase. We see them so often that we hardly even notice them. But what are barcodes really? And how are they used?
This article dives into different examples of these codes, and what meaning can be derived from them. We’ll be focusing on the retail industry.
Ever seen one of these before?
This is a Universal Product Code (UPC). They exist on almost all products that we buy. Here’s one on a chocolate bar…mmmm.
When used with a scanner, UPCs allow retailers to quickly and accurately identify an item. This has obvious benefits for inventory tracking, and also speeds up checkout times at the point of sale. UPCs have been around since the 1970s.
Do you want to own a UPC? You can easily get one…
GS1 is the not-for-profit organization that controls and issues UPCs. To get a UPC, you must first register as a company with GS1. Registration costs start at $250, with annual fees as low as $50. The more products you sell, the more UPCs you’ll need, and the higher the costs with GS1. For example, a 12-oz can of Coca-Cola needs a different UPC than a 16-oz bottle of Coke, as does a 6-pack of 12-oz cans, a 12-pack, a 24-can case, and so on. It can add up! Here’s a little graphic for determining how many UPCs your company would need.
Essentially, you’ve got three different orange shirts and three different blue shirts. So you need 6 UPCs.
What’s in a UPC?
The UPC is made up of 3 parts:
- Company Prefix: a 6–10 digit number assigned by GS1. The longer the prefix, the fewer digits you have left for identifying individual products/items. For example, a 10-digit Company Prefix leaves only a single digit for identifying products. Smaller companies that only need to identify a single product would be issued a 10-digit Company Prefix, while companies with a lot of products get a shorter prefix. Coca-Cola has a 6-digit Company Prefix (049000).
- Item Reference: a reference to a specific product/item.
- Check Digit: a calculated number used to check for errors in the barcode. This little guy has its own Wikipedia page if you’re curious about the calculation.
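To make the three parts a bit more concrete, here is a minimal Python sketch. It assumes the standard UPC check digit rule (triple the digits in odd positions, add the digits in even positions, and round up to the next multiple of 10). The function names are my own, and the sample UPC is a commonly cited example rather than one taken from this article.

```python
def item_capacity(prefix_length: int) -> int:
    """How many Item References a Company Prefix of this length leaves room for."""
    if not 6 <= prefix_length <= 10:
        raise ValueError("Company Prefix must be 6-10 digits long")
    # 12 digits in total: prefix + item reference + 1 check digit
    return 10 ** (11 - prefix_length)


def upc_check_digit(first_11_digits: str) -> int:
    """Compute the 12th digit of a UPC from its first 11 digits."""
    if len(first_11_digits) != 11 or not first_11_digits.isdigit():
        raise ValueError("expected exactly 11 digits")
    digits = [int(ch) for ch in first_11_digits]
    odd_sum = sum(digits[0::2])   # 1st, 3rd, 5th, ... digit
    even_sum = sum(digits[1::2])  # 2nd, 4th, 6th, ... digit
    total = 3 * odd_sum + even_sum
    return (10 - total % 10) % 10


def is_valid_upc(upc: str) -> bool:
    """True if the string is 12 digits and its last digit matches the check digit."""
    return (
        len(upc) == 12
        and upc.isdigit()
        and upc_check_digit(upc[:11]) == int(upc[11])
    )


print(item_capacity(6))                # 100000 items for a 6-digit prefix
print(item_capacity(10))               # 10
print(upc_check_digit("03600029145"))  # 2
print(is_valid_upc("036000291452"))    # True
```

A scanner can run the same arithmetic to reject a misread: change any single digit and the check no longer matches.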
To complicate matters, the very first digit of a UPC can sometimes categorize the type of UPC we are seeing. This first digit is called the Number System Digit. If this digit is 0, 1, 6, 7, 8, or 9, then we are seeing a standard UPC like in the example above. Here are all the possibilities.
- 0–1, 6–9: Standard UPC made up of Company Prefix, Item Reference, and Check Digit
- 2: Used for products where the price is dependent on weight. Often includes meats, prepared foods, and packaged produce. The format for these is 2-XXXXX-CPPPP-C.
• 2 — the prefix indicating this is a random weight Type 2 UPC.
• XXXXX — 5 digits to indicate the unique item in that store. This is not a universal identifier across retailers.
• CPPPP — a check digit, followed by 4 digits for the price of the item. 0199 is $1.99. 2049 is $20.49.
• C — the final Check Digit. Check out the mushrooms below. They cost $3.15. Do you notice the ‘0315’ digits in the UPC that represent the price?
- 3: Used for drugs intended for human use. This includes both prescription and over-the-counter drugs. These start with a ‘3’ and end with a Check Digit. The 10 digits in the middle are known as the National Drug Code (NDC). The NDC is regulated by the FDA.
- 4: Reserved for use by retailers. These UPCs are not regulated by GS1 and are not universally unique.
- 5: Reserved for coupons, although UPCs prefixed with a ‘5’ have been phased out since 2011.
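The list above maps pretty directly onto a lookup table. Here is a rough Python sketch that classifies a UPC by its Number System Digit and pulls the price out of a Type 2 code, assuming the 2-XXXXX-CPPPP-C layout described above. The names are illustrative, the sample code is made up, and neither check digit is validated here.

```python
from decimal import Decimal

# Number System Digit -> meaning, as listed in the article
UPC_TYPES = {
    "0": "standard", "1": "standard",
    "2": "random weight",
    "3": "drug (NDC)",
    "4": "retailer use",
    "5": "coupon (phased out)",
    "6": "standard", "7": "standard", "8": "standard", "9": "standard",
}


def classify_upc(upc: str) -> str:
    """Return the UPC type based on the first digit."""
    if len(upc) != 12 or not upc.isdigit():
        raise ValueError("expected a 12-digit UPC")
    return UPC_TYPES[upc[0]]


def parse_random_weight(upc: str) -> dict:
    """Split a Type 2 UPC into item, price check digit, price, and check digit."""
    if classify_upc(upc) != "random weight":
        raise ValueError("not a Type 2 (random weight) UPC")
    return {
        "item": upc[1:6],                 # store-specific item number
        "price_check_digit": upc[6],      # not validated in this sketch
        "price": Decimal(upc[7:11]) / 100,
        "check_digit": upc[11],           # not validated in this sketch
    }


# Made-up example: item 12345 priced at $3.15
parsed = parse_random_weight("212345003152")
print(parsed["item"], parsed["price"])  # 12345 3.15
```

Using `Decimal` rather than a float keeps prices like 20.49 exact.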
In addition to getting a UPC from GS1, there are also UPC resellers out there (and also many scams posing as resellers). Legit resellers will own a Company Prefix with GS1, and then sell ownership of some of the Product IDs (Item References) within that Company Prefix. Most large retailers require that you have a UPC issued by GS1 rather than a reseller.
One legit example of UPCs that are not issued directly by GS1 is the set with the company prefix ‘033383’. This company prefix is managed by the produce industry in the US and Canada, and is known as The Generic UPC. There are over 12,000 individual produce designations within this company prefix. These generic UPCs are just like other standard UPCs…there is the 6-digit company prefix (033383), a five-digit item reference, and a check digit. To use them, you must purchase an annual subscription from the Produce Marketing Association.
Global Trade Item Numbers (GTIN)
Just to open up the world of barcodes a little for you, another name for a UPC is GTIN-12. The ‘12’ stands for 12 digits. There are a few other flavors of Global Trade Item Numbers (GTIN), all of them controlled by GS1. GTIN-12 is used to represent the 12-digit UPC barcodes we see in the United States.
GTIN-13: a 13-digit barcode used predominantly outside North America.
- Also known as EAN-13 (European Article Number). Since it’s been widely adopted outside of Europe, it has been renamed the International Article Number, although the EAN abbreviation has stuck.
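- The first three digits identify the GS1 member organization that issued the code, which usually corresponds to a country (it is where the company registered, not necessarily where the product was made). For example 744 is for Costa Rica, and 778–779 are for Argentina. Here is a complete list. After the country code, it is similar to a GTIN-12 in that there is a manufacturer code, a product code, and then a check digit.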
- GTIN-13 codes that start with a ‘0’ are actually just GTIN-12 codes that have been converted to GTIN-13. You can convert a GTIN-12 to a GTIN-13 by just adding a ‘0’ in front of the GTIN-12.
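Adding a ‘0’ in front works because GTIN check digits are computed with weights counted from the right, so a leading zero contributes nothing to the sum. Here is a small Python sketch of that idea. The function names are my own, and the two sample codes are commonly cited examples, not ones from this article.

```python
def gtin_check_digit(body: str) -> int:
    """Check digit for a GTIN body (all digits except the final check digit)."""
    total = 0
    # Walk from the rightmost digit; weights alternate 3, 1, 3, 1, ...
    for i, ch in enumerate(reversed(body)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Validate a GTIN-8, GTIN-12, GTIN-13, or GTIN-14 by its check digit."""
    return (
        code.isdigit()
        and len(code) in (8, 12, 13, 14)
        and gtin_check_digit(code[:-1]) == int(code[-1])
    )


def gtin12_to_gtin13(gtin12: str) -> str:
    """Convert a GTIN-12 (UPC) to GTIN-13 by prefixing a zero."""
    if len(gtin12) != 12 or not gtin12.isdigit():
        raise ValueError("expected a 12-digit GTIN-12")
    return "0" + gtin12


upc = "036000291452"
ean = gtin12_to_gtin13(upc)
print(ean)                                      # 0036000291452
print(is_valid_gtin(upc), is_valid_gtin(ean))   # True True
```

The same validation function covers every GTIN length, which is handy when a system stores them all in one column.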
GTIN-14: a 14-digit barcode often used to label boxes/cartons of product containing the same item. The format for this is the same as GTIN-13, with the addition of a single prefix digit called the ‘Indicator’. This ‘Indicator’ digit is used to designate different levels of packaging. See the image below of a GTIN-14.
GTIN-8: an 8-digit representation of the GTIN-12 or GTIN-13. It almost seems impossible that you could convert a 12-digit UPC to an 8-digit code and back again. But there is a way!
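- The compressed form described here is commonly called UPC-E (strictly speaking, GS1 uses the name GTIN-8 for EAN-8 codes, while UPC-E is a zero-suppressed GTIN-12)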
- Used when there is limited printing space
- The conversion to GTIN-8 only exists/works for GTIN-12 codes that start with a ‘0’
- How this conversion between 12-digit and 8-digit codes works is a bit complicated. Here is a snippet of Python code that shows how to do the conversion programmatically.
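As a rough illustration of that conversion, the Python sketch below assumes the usual UPC-E zero-suppression rules: the last of the six middle digits tells you where the dropped zeros go. The function names are my own. Following the article, it only accepts codes that start with ‘0’. Note that a code also needs enough zeros in the right places to be compressible at all.

```python
def upc_check_digit(first_11: str) -> int:
    """Standard UPC check digit for the first 11 digits."""
    digits = [int(ch) for ch in first_11]
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


def expand_upce(upce: str) -> str:
    """Expand an 8-digit code (0 + 6 digits + check digit) to 12 digits."""
    if len(upce) != 8 or not upce.isdigit() or upce[0] != "0":
        raise ValueError("expected 8 digits starting with 0")
    body, check = upce[1:7], upce[7]
    last = body[5]
    if last in "012":
        middle = body[0:2] + last + "0000" + body[2:5]
    elif last == "3":
        middle = body[0:3] + "00000" + body[3:5]
    elif last == "4":
        middle = body[0:4] + "00000" + body[4]
    else:  # last is 5-9
        middle = body[0:5] + "0000" + last
    upca = "0" + middle + check
    if upc_check_digit(upca[:11]) != int(check):
        raise ValueError("check digit mismatch")
    return upca


def compress_upca(upca: str) -> str:
    """Compress a 12-digit UPC to 8 digits, if its zeros allow it."""
    if len(upca) != 12 or not upca.isdigit() or upca[0] != "0":
        raise ValueError("expected 12 digits starting with 0")
    m, p, check = upca[1:6], upca[6:11], upca[11]  # manufacturer, product
    if m[2] in "012" and m[3:] == "00" and p[:2] == "00":
        body = m[:2] + p[2:] + m[2]
    elif m[3:] == "00" and p[:3] == "000":
        body = m[:3] + p[3:] + "3"
    elif m[4] == "0" and p[:4] == "0000":
        body = m[:4] + p[4] + "4"
    elif p[:4] == "0000" and p[4] in "56789":
        body = m + p[4]
    else:
        raise ValueError("this UPC cannot be zero-suppressed")
    return "0" + body + check


print(compress_upca("042100005264"))  # 04252614
print(expand_upce("04252614"))        # 042100005264
```

The check digit is shared by both forms, so it is carried over unchanged rather than recomputed.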
Price Lookup (PLU) codes exist for the same reason as UPCs, but are used for produce items specifically. Since produce comes from the earth, with many different companies growing the same homogeneous product, there is no company designation on a PLU. The PLU merely designates an exact type of fruit/veggie.
PLUs identify a specific type of produce, and there are 1,461 PLU codes currently defined. Take a look here to help you visualize what these are.
PLU codes are 4–5 digits long. They utilize numbers in the range of 3000–4999. Organic products are prefixed with a ‘9’.
- Anything starting with a ‘9’ indicates an organic product, followed by the 4-digit code for its non-organic counterpart. For example, 3071 is a Granny Smith Apple. 93071 is an Organic Granny Smith Apple.
- Once PLU codes in the 3000–4999 range are exhausted, they will expand to the 83000–84999 range. Unlike the ‘9’, the preceding ‘8’ will have no significance.
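The rules above are simple enough to sketch in a few lines of Python. This is only an illustration: the function name and return shape are my own, and the range checks just follow the numbers quoted above.

```python
def parse_plu(code: str) -> dict:
    """Split a PLU into its base code and an organic flag."""
    if not code.isdigit():
        raise ValueError("PLU codes are numeric")
    organic = False
    base = code
    # A leading 9 on a 5-digit code marks the organic version of a 4-digit PLU
    if len(code) == 5 and code[0] == "9":
        organic, base = True, code[1:]
    if len(base) == 4 and 3000 <= int(base) <= 4999:
        return {"base": base, "organic": organic}
    # Planned expansion range; the leading 8 carries no meaning
    if not organic and len(base) == 5 and 83000 <= int(base) <= 84999:
        return {"base": base, "organic": False}
    raise ValueError(f"{code} is not in a known PLU range")


print(parse_plu("3071"))   # {'base': '3071', 'organic': False}
print(parse_plu("93071"))  # {'base': '3071', 'organic': True}
```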
PLU codes are used globally. But unfortunately, this doesn’t always make them unique identifiers. The reason for this is that some PLUs are designated for “Retailer Assignment”. In fact, of the 1400+ existing PLUs, over 400 of them are “Retailer Assigned.”
- PLUs in the 3170–3270 and 4460–4469 ranges can be assigned to any produce item by the retailer.
- The other “Retailer Assigned” PLUs are earmarked for certain types of produce. For example, 4193–4217 are for retailer assignment of Apples. 4219 is for Apricots. 4275–4278 for Grapes. 4785–4789 for Squash. And many, many more ranges for other things.
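If you’re matching PLUs across retailers, it helps to flag the retailer-assigned ones so you don’t treat them as universal. The Python sketch below does that using only the ranges mentioned above, so the table is deliberately incomplete. The names are hypothetical.

```python
# (first, last, category) using only the ranges quoted in this article.
# A category of None means "any produce item". This list is NOT complete.
RETAILER_ASSIGNED = [
    (3170, 3270, None),
    (4460, 4469, None),
    (4193, 4217, "Apples"),
    (4219, 4219, "Apricots"),
    (4275, 4278, "Grapes"),
    (4785, 4789, "Squash"),
]


def retailer_assigned(plu: int):
    """Return (is_retailer_assigned, category) for a 4-digit PLU."""
    for first, last, category in RETAILER_ASSIGNED:
        if first <= plu <= last:
            return True, category
    return False, None


print(retailer_assigned(4200))  # (True, 'Apples')
print(retailer_assigned(3200))  # (True, None)
print(retailer_assigned(3071))  # (False, None)
```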
You’re probably used to seeing stickers on your produce. Here are a couple of produce items I encountered recently.
One interesting twist here is the barcode on the apples. Or is that two barcodes on the sticker??
In fact, this is known as an omnidirectional stacked barcode, which is a fancy name for a barcode that can be scanned in any direction. Our friends over at GS1 created this barcode and call it the GS1 DataBar. It contains a GTIN-14 code (that’s 14 digits in there). Lots of produce items have these now. The benefit is that it encodes more information than a bare PLU. Specifically, GS1 DataBar encodes the same information as GTIN-13 or GTIN-12 codes (so a company prefix, an item reference, and a check digit). It does not contain packaging info like a lot of other GTIN-14 codes.
Interestingly, the Item Reference in the GS1 DataBar is not necessarily the PLU. The owner of the code is free to assign a different value to the Item Reference if they’d like to. In order to tie the GS1 DataBar and the PLU together, GS1 has created a place where producers can upload their GS1 DataBar code along with its corresponding PLU. Retailers can then download this information from GS1, and then use this to identify the PLU when the item is scanned. GS1 calls this upload tool “DataBar Online”.
The Stock Keeping Unit (SKU) is an alphanumeric string of variable length that is assigned to a product by an individual seller or retailer. So a single UPC could go by many different SKUs, depending on how many places sell it and how those individual sellers categorize their inventory. These are not universal codes.
SKUs exist for internal management of a product. They are not meant to be an externally facing identifier like the UPC.
You may wonder why sellers don’t just always use the UPC when referencing a product. Well, what if they also wanted to create sub-categories for a single UPC, like for date purchased, location purchased, condition, or warehouse location? This is where a SKU comes in…a product will have one UPC, but can have many different SKUs.
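One way to picture the one-to-many relationship is a simple lookup in each direction. The Python sketch below is purely illustrative: the SKU strings are hypothetical and follow no real retailer’s scheme, and the UPC is just a sample value.

```python
from collections import defaultdict

# One UPC can map to many seller-specific SKUs; each SKU maps back to one UPC.
skus_by_upc = defaultdict(set)
upc_by_sku = {}


def add_sku(upc: str, sku: str) -> None:
    """Register an internal SKU for a product identified by its UPC."""
    if sku in upc_by_sku and upc_by_sku[sku] != upc:
        raise ValueError(f"SKU {sku} already belongs to another UPC")
    skus_by_upc[upc].add(sku)
    upc_by_sku[sku] = upc


# Hypothetical SKUs encoding condition and warehouse for the same product
add_sku("036000291452", "TISSUE-NEW-WH1")
add_sku("036000291452", "TISSUE-OPENBOX-WH2")

print(len(skus_by_upc["036000291452"]))   # 2
print(upc_by_sku["TISSUE-OPENBOX-WH2"])   # 036000291452
```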