Typically, we supplement the process model with a data model - something geared to very precisely describe all the data elements in a system and the relationships between them.
For modeling the data, we use an entity relation diagram, or ERD. As with the DFDs, there is generally a logical model, produced during the analysis phase, and a physical model produced during the design phase.
In creating an ERD, we first group all the information in the system into logical units. For example, if the system was for booking appointments at an auto repair shop, we might decide that the logical units are
In doing this, we are creating an entity name for the logical group of data, and a list of attributes associated with the entity.
We can then show relationships between different entities by drawing labelled lines between them.
We usually subdivide the entities into three subgroups: the associative entities, independent entities, and dependent entities.
Independent entities can exist on their own: e.g. a client entity can exist independent of the other entities we create.
Associative entities typically exist to store information about many-to-many relationships between two other entities. For example, suppose we have an entity to represent the actual problem in a car, another entity to represent a symptom, and we wish to also represent the severity of a symptom. The severity can't really be made an attribute of symptoms, since it's different depending on which problem is causing the symptom. It also can't be made an attribute of the problem, since then all the problem symptoms would have the same severity ... The solution is to make the severity an entity on its own, one which is "attached" to the relationship between the problem and the symptom:
+--------+ |SEVERITY| +--------+ | +-------+ | +-------+ |PROBLEM|---+-----|SYMPTOM| +-------+ +-------+Dependent entities cannot exist without certain other entities. Many associative entities fal into this category: the idea of severity makes no sense if not related to a symptom.
The attribues are usually listed within the entity block, e.g.
+-----------+ |APPOINTMENT| +-----------+ | start time| | duration | | location | +-----------+
For example, suppose we have a car dealership, and we include one entity called VENDOR and another called CAR:
+------+ |VENDOR| +------+ | | sells | +---+ |CAR| +---+
Relationships have cardinality, which may be one-to-one, one-to-many, or many-to-many, indicating how many different types of entity may hold the relationship to other entities. For example, one mother may have many children, but each child has only one mother. The one-to-many relationship is indicated as shown below:
+------+ |mother| +------+ | | /|\ +-----+ |child| +-----+Observe that a single line is used to indicate one, while a "fork" is used to indicate many.
Relationships also have modality, indicating the minimum number of relationships that must exist for the data to be valid. For example, in our auto-repair case we don't include people who have never had an appointment with our shop, i.e. to be a valid client you must have had at least one appointment -- therefore the modality is 1. Typically we are most interested in tracking whether the modality is 0 or 1 (i.e. does the relationship have to hold in at least one case for the data to be valid?). The modality is usually represented by annotating the relationship with a 0 or 1.
Data Design Questions
Some of the questions you may ask yourself when trying to create a data model include:
(Note: I've expressed these questions in the context of a business system, but similar ideas hold for whatever kind of system you are trying to model.)
Building ERDs
The ERDs are typically established by going through the process model to identify all the relevant entities.
Once the entities have been established we can go through and attach appropriate attributes, and identify relationships between the entities.
This is very much an iterative process: refining our description of exactly what the data elements are, what properties they have, and how they are related.
Some common guidelines when building your ERDs include
The Data Dictionary
Once again, the ERD is a tool to clarify the structure and relationships of data within the system, but the diagram by itself does not convey everything about the data.
We need to include supporting text (often using standard forms for each entity and/or relationship) describing all the relevant details.
It is common practice to supplement your ERDs with a full data dictionary, which is essentially a lookup table with precise information on each data item in the system: its name, where it is produced, where it is used, relationships to other data items, a clear and intuitive text description, and a full set of cross references to appropriate DFDs and ERDs.
Example ERD
The (very simple) ERD below considers the logical data necessary in a prescription billing system. It assumes there are five logical data entities we are interested in: doctors, patients, insurers, treatments, and prescriptions.
In the example shown, each of the relationships is one-to-many (indicated by the 1 or N on the connections representing relationships).
the fields below are pretty sparsely described - this might be the ERD after a "first pass". On subsequent passes the descriptions would become much more detailed.
In terms of supporting text for such an ERD, we might include something like:
Overview:
Our system, designed to maintain prescription billing information, models the interaction of patients with their doctors and insurers, and the associated treatments and prescriptions.
The diagram above provides an intuitive view of the model, with further details provided in the descriptions below. Additional information on the data involved can be found in the Data Dictionary (include page number or link).
Entities:
Our system includes five data entities, representing the doctors, their patients, the insurers of the patients, the patients' treatments, and the associated prescriptions. Each entity is described below:
ID: ERDEnt001 Description: Doctors are provincially registered, practicing physicians, and by the time of treatment each patient must have one doctor as their named physician. Each doctor can have one or more patients.Note/oversimplificiation #1: this doesn't handle cases where patients see multiple specialists and get prescriptions from each
Key attribute: Physician's Provincial Registered ID Other attributes: Physician name, address, phone, email, specializations Relationships: Doctor treats Patient Attribute Constraints: Physician's Provincial Registered ID: a 16-digit number Physician's name: a text string Physician's address: a street address, city, country, and postal/zip code Physician's phone number: a country code, area code, local number, and extension Physician's specializations: a text string
ID: ERDEnt002 Description: The Insurer is any one of a number of insurance agencies providing patients with cost coverage for medical treatments. The system assumes each patient has one such covering agency, and agencies may provide coverage for many patients.Note/oversimplificiation #2: this doesn't handle cases where patients have no coverage, or have coverage under multiple plans
Key attribute: Insurance Carrier ID Other attributes: Carrier name, address, phone number Relationships: Insurer insures Patient Attribute Constraints: Insurance Carrier ID: a 32 character alphanumeric string Carrier Name: a text string Carrier Address: a street address, city, country, and postal/zip code Carrier Phone Number: a country code, area code, local number, and extension
ID: ERDEnt003 Description: The patient is an individual obtaining treatment (involving a prescription) from a Doctor, where a portion of the cost may be covered by their Insurer. Each patient is assumed to have one Insurer and one Doctor. Key attribute: Patient Medical ID Other attributes: Patient name, address, phone, date, insurance plan type Relationships: Doctor treats Patient Insurer Insurer insures Patient Treatment is given to Patient Attribute Constraints: Patient Medical ID: a 10-digit number Patient Name: a text string Patient Address: a street address, city, country, and postal/zip code Patient Phone Number: a country code, area code, local number, and extension Date: Month, day, and year (mm/dd/yyyy) Insurance Plan Type: a text string (name), plus a 3-digit carrier identifier, 7-digit group identifier, and 6-digit client identifier
ID: ERDEnt004 Description: A Treatment is one course of action chosen by a Doctor for a Patient in handling a particular medical issue. A Patient may be undergoing multiple treatments for different medical conditions, and each treatment may be associated with one or more prescriptions. Key attribute: Treatment Code ID Other attributes: Patient name, product name, date, symptoms, description Relationships: Treatment is given to Patient Attribute Constraints: Left as an exercise for the reader
ID: ERDEnt005 Description: A Prescription is a single script prescribed by a Doctor for a Patient in the process of a specific Treatment. Key attribute: Product ID Code Other attributes: Product name, strength, dosage, manufacturer, amount Relationships: Treatment includes Prescription Attribute Constraints: Left as an exercise for the reader
Relationships:
Our system includes four relationships: a doctor treats a patient, a patient is insured by an insurer, a treatment is given to a patient, and a treatment includes a prescription. Each relationship is described below:
ID: ERDRel001 Description: Each Doctor treates one or more Patients, each Patient is treated by exactly one Doctor. Entities: Doctor (treats) Patient Cardinality: One Doctor to one or more Patients
ID: ERDRel002 Description: Each Insurer insures one or more Patients, each Patient is insured by exactly one Insurer. Entities: Insurer (insures) Patient Cardinality: One Insurer to one or more Patients
ID: ERDRel003 Description: each Treatment is given to one Patient, a Patient may be given one or more Treatments. Entities: Treatment (is given to) Patient Cardinality: One or more Treatments to one Patient
ID: ERDRel004 Description: Each Treatment includes one or more Prescriptions, each Prescription is included in exactly one Treatment. Entities: Treatment (includes) Prescription Cardinality: One Treatment includes one or more Prescriptions
Example ERD
As a more complex example, suppose we were to consider an ERD for campus timetabling and room booking along the following lines:
A room can be used for multiple lecture sessions of a course offering at the same time if all are taught by the same instructor, otherwise a room cannot be used for different sections (of any courses) at the same time.
The system we plan on developing will take care of timetabling all the courses and selecting all the rooms each year after a list of course offerings, sections, and instructors is determined.
For the ERD, we are simply concerned with how to effectively model all the information listed above, so the rest of our system can use it efficiently.