Outline
- Extended E/R Model
- E/R Model Design Principals
- In-Class Exercise
- E/R Model Constraints
Introduction
- On Monday, we were describing the Entity-Relationship (E/R) model and E/R diagrams as a high level data model.
- We said this data model was useful in figuring out how to organize our data.
- The idea was once we have an E/R Diagram for our database, we then map it in procedural manner to the relational model and from there do some final sanity checks like is it in 3NF before implementing the result in a database.
- We went over how to draw Entity Sets, Attributes, and Binary and Multiway Relationships.
- We discussed many-one and one-one relationships and how to indicate these in our diagrams.
- We discussed how a relationship might have more than one copy of the same entity and how to indicate this in a diagram.
- Finally, we said how to convert multiway relationships into aset of binary relations involving a newly added entity set.
- We begin today by looking at a common extension to the basic E/R model to allow subclassing of entity sets.
Subclasses in the E/R Model
- Given an entity set there might be a large number of entities in it that share additional properties besides the attributes all members shares.
- If so, it often makes sense to create a sub-class of the entity set with these additional properties.
- We connect the subclass entity set with the original entity set using an isa relationship as we do in the above figure. (Sometimes this is diagrammed by just having a `in` symbol on the edge).
- Isa relationships are always one-one even though we usually don't put the arrows at the end of the edges.
- In the above, Cartoons would have all the attributes that a Movies entity does, but would also participate in the Voices relationship which says star voices the voice of a character.
- We do allow the creation of sub-subclasses using isa relationships, we require the subgraph of E/R relationships to forms a tree. I.e., we don't allow multiple inheritance.
- The book doesn't distinguish our basic E/R model from the model with sub-classes, but many books refer to the model allowing sub-classes as the Extended E/R model.
Design Principles for E/R Diagrams
- We now have many of the basic components of E/R-diagrams.
- Over the next several slides we consider design pricniples which tend to yield better E/R diagrams...
Faithfulness
- The design should be faithful to the underlying concepts of the data.
- Entity sets and attributes should reflect the real-world.
- So a Stars entity set shouldn't have a number-of-cylinders attribute, but an Automobile entity set might.
- If we have a relationship, it should make sense in terms of the real-world entities involved.
- For example, StarsIn should be a many-many relationship because in the real world, a star can be in many movies and a movie can have many stars.
Avoiding Redundancy
- The diagram should only express each item it represents once.
- This will avoid the redundancy anomalies we have already considered.
- There are several new mechanisms for redundancy in E/R diagrams.
- For example:
- If we have an Owns relationship between Movies and Studios.
- We might also give Movies a studioName attribute.
- This could lead to redundancy because:
- It is a repetition of a fact already represented by the Own relationship, and this repetition will persist after we run our E/R to relational model algorithm that we discuss later.
- One could also have an update anomaly where we change the Owns relation but not this attribute or vice versa.
Simplicity Counts
- It is important to try to keep the elements of the design to the smallest number necessary.
- For example, consider the E/R diagram:
- The two relationships Represents, Owns as well as the entity Holdings replace the single relationship owns we had before.
- The fact that Represents is one-one means we could determine from a movie the unique studio that own it, but this is more complicated than just having a single relationship, and would tend to be prone to error.
Choosing the Right Relationship
- In creating an E/R model, we often first try to figure out the entity sets and then the relationships between then.
- If we had `n` entity sets, we could have as many as `(n(n-1))/2` possible binary relationships between them.
- Most of these would probably not be useful and even even be derivable from other existing relationships.
- So we want to pick the right relationships to avoid redundancy anomalies.
- For example, last day we introduced the Contracts 3-way relationship between movies, stars, and studios.
- If we have this relationship, do we still need a StarsIn and a Owns relationship?
- Each may or may not be redundant depending on the intent of the people wishing to store the data.
- I.e., A StarsIn pair can be deduce from a contract for a star to appear in a movie of a studio. On the other hand, it might be possible for a star to appear in a movie without there having been a known contract. Do the people who will use the DB need to model this possibility?
Picking the Right Kind of Element
- Attributes are simpler to implement than entities sets which can be simpler than relationships.
- We may have a choice how to represent things as one of these three types.
- Our default should be choose the simpler element as long as it doesn't lead to anomalies.
- For example, suppose we got rid of the Studios entity set and and just added studioName and studioAddress to Movies.
- This might cause problems because we repeat the studio address for each movie made by a studio. This could cause both redundancy and update issues.
- On the other hand, if the people who are going to use the DB don't need the studio address recorded, then it might be safe just to avoid the Studios entity set.
Replacing an Entity Set with Attributes
- Suppose `E` is an entity set.
- If `E` satisfies the following, it may be safely replaced with attributes belonging to other relations:
- All relationships in which `E` is involved must have arrows entering `E`.
- If `E` has more than one attribute, then no attribute depends on the other attribute. That is, the only key for `E` is all its attributes
- No relationship involves `E` more than once.
- If these conditions are met, we can replace an Entity with attributes belonging to another relation as follows:
- If there is a many-one relationship from `R(F,E)`, then remove `R` and make the attributes of `E` attributes of `F`, renaming as needed if there are conflicts with existing attributes.
- If there is a multiway relationship `R` with an arror to `E`, make the attributes of `E` attributes of `R` instead and delete `E` and the arrow to `E`.
In-Class Exercise
- For each of the conditions of the last slide make a simple E/R diagram which satisfies the condition and a simple diagram that does not.
- Post your solution to the Oct. 17 In-Class Exercise Thread.
Constraints in the E/R Model
- We now look at how the E/R model expresses constraints.
- It can express both key constraint diagrammatically as well as referential constraints...
Keys in the E/R Model
- A key for an entity set `E` is a set `K` of one or more attributes such that, given any two elements `e_1` and `e_2` of `E`, `e_1` and `e_2` cannot have identical values for each of the attributes in `K`.
- Every entity set must have a key.
- In some cases, we will allow the key to live in a different entity set. For example, we might do this with entities arising in isa relationships.
- There will typically be more than one possible key for an Entity set, in which cases, we will usually privilege one as the "primary key".
- When an entity set is in an isa-hierarchy, we require that the root entity set for this tree to have all the attributes needed for a key.
- We use underlines on attributes in an E/R diagram to indicate the primary key of an entity set. For example, title year in Movies in the diagram above.
Referential Integrity
- The arrow notation we have used to indicate many-one and one-one relations, says if that a entity is involved in a most one tuple, but does not require all members of the many side to be involved with any entity of the one side.
- For example, we said that Owns(Movies, Studies) was many one to Studios.
- This says that any movie can be own by at most one studio, but not that every movie must be owned at all.
- If we want to say that every movie is referenced at least once as well, we use a rounded arrow like above.
- This is called a referential constraint.
Degree Constraints
- We can also indicate the cardinality of the participation in a relationship using an inequality on the edge as in the above.
- The above diagram says that at most 10 stars can appear in single movie.
- This kind of constraint is called a degree constraint.