PHP

Identifying key attributes in ER modeling

In our introduction to ER modeling, we noted some of the implicit constraints of our model, including that there is only one customer #37 and one wine that we refer to as #168. In the model design so far, we haven't considered how to uniquely identify each entity.

Uniqueness is an important constraint. When a customer places an order, you must be able to uniquely identify that customer and associate the unique order with that unique customer. You also need to be able to uniquely identify the wines the customer purchases. In fact, all entities must be uniquely identifiable; this is true for all relational databases.

The next step is to identify the attributes or sets of attributes that uniquely identify an entity. Begin with the customer. A surname (or any combination of names) doesn't uniquely identify a customer. A surname, firstname, initial, and a complete address may work, although there are some cases where children and parents share the same name and address.

A less complicated approach for unique identification-and a common one that's guaranteed to work-is to add an identifier number (ID) attribute to the entity. A short unique identifier also leads to better database performance, as discussed in Chapter 3. Using this approach, assign ID #1 to the first customer, ID #2 to the second customer, and so on. In the model, this new attribute is underlinedto indicate that it uniquely identifies the customer as shown in Figure C-5.

Figure C-5. The customer entity with all attributes; the primary key is shown underlined
figs/wda_ac05.gif

You can take the same approach with wine as for customers-for the same reasons-and add an ID field.

For wineries and regions, the name is most likely unique or, at least, it can be made so. However, for simplicity, you should also use the ID attribute approach to prevent any ambiguity or need for the winestore administrator to create unique names for wineries or regions. The same argument can be applied to grape varieties.

Orders can also be dealt with by a unique ID, as can items and inventory. However, the uniqueness of this ID may be questionable. To illustrate, consider an example. You can number each order across the whole system uniquely, beginning with the system's first order #1. Alternatively, you can combine the customer ID with an order ID and begin each different customer's orders with order ID #1. The combination of customer ID and order ID is still unique, e.g., customer #37, order #1 is different from customer #15, order #1. This latter scheme is an example of a full participation relationship by a weak entity: an order isn't possible without a customer (hence, the term full participation) and the customer ID forms part of the order entity's unique identifier, hence the term weak entity.

You can use the scheme of full participation by a weak entity for orders; the other approach of numbering orders across the whole collection also works well. An advantage of this scheme is that the order number is more meaningful to the user-for example, a user can tell from their order number how many orders they have placed-and the order number provides a convenient counting tool for reporting. Participation is discussed briefly in the next section and weak entities are discussed in more detail later in Section C.2.1.7.

You can follow similar arguments with items. An item can be uniquely numbered across the whole system or can be numbered from #1 within an order. Again, this depends on the participation and, as with orders, we follow the latter approach. The same applies for inventory, which is numbered within a wine.

Because customer and users have a one-to-one relationship, the customer ID can be used to uniquely identify a user. Therefore, the users entity has full participation as a weak entity in the relationship with customer.