Logical Data Model (Relational Schema Diagram)

1. Overview

A Logical Data Model (LDM) — also called a Logical Schema or Relational Schema Diagram — is the intermediate artifact that sits between the abstract conceptual ERD and the concrete physical database schema. It describes what tables, columns, keys, and relationships the database will have, in enough detail to be unambiguous, but without committing to a specific DBMS product, indexing strategy, or storage layout.

An LDM is DBMS-independent. You should be able to hand the same logical model to a PostgreSQL, MySQL, Oracle, SQL Server, or Snowflake implementer and all of them can produce a valid physical schema from it. That neutrality is the whole point — it is the last place where the design is still portable.

What a Logical Data Model captures:

Tables (derived from conceptual entities)
Columns (derived from conceptual attributes) with generic datatypes
Primary keys, foreign keys, unique constraints, NOT NULL constraints
Cardinality of every relationship, with M:N resolved into junction tables
Normalization (typically to 3NF or BCNF)

What a Logical Data Model deliberately omits:

Vendor-specific datatypes (NUMBER(10,2), NVARCHAR(MAX), JSONB)
Indexes (beyond the implied PK/unique indexes)
Partitioning, sharding, tablespaces, filegroups
Denormalization for performance
Storage parameters, collations, compression

In short, the LDM answers the question "what is the structure of the data?" The physical model answers "how do we store and access it efficiently on this specific platform?"

2. The Three Levels of Data Modeling

Data modeling is conventionally split into three progressively more detailed levels. The LDM is the middle one.

┌────────────────────────────────────────────────────────────────────┐
│  CONCEPTUAL                                                         │
│  ───────────                                                        │
│  Audience: business analysts, domain experts, executives            │
│  Contains: entities and relationships, named in business terms      │
│  Omits: attributes (mostly), keys, datatypes, cardinality detail    │
│  Example: "A Customer places Orders. An Order contains Products."   │
└────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────┐
│  LOGICAL  (this page)                                               │
│  ────────                                                           │
│  Audience: data architects, application developers, DBAs            │
│  Contains: tables, columns, PKs, FKs, constraints, cardinality,     │
│            junction tables, generic datatypes, normalization        │
│  Omits: indexes, partitions, vendor datatypes, storage details      │
│  Example: customers(customer_id PK, email UNIQUE NOT NULL, ...)     │
└────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────┐
│  PHYSICAL                                                           │
│  ─────────                                                          │
│  Audience: DBAs, platform engineers, performance engineers          │
│  Contains: vendor datatypes, indexes, partitions, tablespaces,      │
│            collations, materialized views, storage parameters,      │
│            denormalizations for performance                         │
│  Example: CREATE TABLE customers (...) TABLESPACE ts_hot            │
│           PARTITION BY RANGE (created_at) ...                       │
└────────────────────────────────────────────────────────────────────┘

Progression by example — the customer concept at each level:

Conceptual:
    ┌──────────┐
    │ Customer │
    └──────────┘

Logical:
    customers
    ─────────────────────────────
    customer_id    PK
    email          UNIQUE, NOT NULL
    first_name     NOT NULL
    last_name      NOT NULL
    created_at     TIMESTAMP

Physical (PostgreSQL):
    CREATE TABLE customers (
        customer_id  BIGSERIAL PRIMARY KEY,
        email        CITEXT NOT NULL UNIQUE,
        first_name   VARCHAR(50) NOT NULL,
        last_name    VARCHAR(50) NOT NULL,
        created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW()
    ) PARTITION BY RANGE (created_at);

    CREATE INDEX idx_customers_last_name ON customers (last_name);

3. What a Logical Data Model Contains

Tables. Each conceptual entity becomes a table. Names are typically nouns in snake_case. Pick singular (customer) or plural (customers) and apply it consistently across the schema.

Columns. Each conceptual attribute becomes a column. Column names should be descriptive and follow the same casing convention as tables. Avoid reserved words (user, order, date) — prefer app_user, orders, order_date.

Primary keys (PK). Every table must have a primary key. In diagrams it is annotated PK after the column name or shown with an underline. Primary keys are implicitly UNIQUE and NOT NULL.

Foreign keys (FK). Columns that reference the PK of another table. Annotated FK and drawn with a line between tables. The logical model specifies the referenced table and column; referential actions (ON DELETE CASCADE, etc.) are also logical decisions because they encode business rules.

Unique constraints. Mark candidate/alternate keys other than the PK — for example, email on a users table.

NOT NULL constraints. Whether a column is required is a business rule, not a storage decision, so it belongs in the logical model.

Cardinality. For every relationship, both ends are labelled:

1:1 — one row on each side. Rare; often a candidate for merging or a subtype.
1:N — the "many" side carries a foreign key to the "one" side. The most common relationship.
M:N — must be resolved in the logical model by introducing a junction table (also called a bridge, linking, or associative table).

Generic datatypes. The logical model uses datatype categories, not vendor products. More on this in section 11.

4. Transforming an ERD into a Logical Data Model

The LDM is produced by mechanically transforming the conceptual ERD through a set of well-defined rules. The procedure is:

Step 1: Each entity becomes a table.

ERD entity:  ┌──────────┐          LDM table:  customers
             │ Customer │                       ────────────
             └──────────┘                       customer_id PK
                                                ...

Step 2: Each attribute becomes a column. Simple attributes map directly. Composite attributes (e.g. Address = Street + City + State + Zip) are typically flattened into multiple columns (street, city, state, zip) unless they genuinely deserve their own table. Multi-valued attributes (e.g. a customer with multiple phone numbers) must become a separate table to satisfy 1NF.

Step 3: Every table gets a primary key. If the ERD does not name one, introduce a surrogate <table>_id key.

Step 4: 1:N relationships become FKs on the "many" side. A customer has many orders, so orders.customer_id references customers.customer_id — the FK lives on orders.

Step 5: M:N relationships become junction tables. A product can belong to many categories and a category can contain many products — introduce product_categories(product_id, category_id).

Step 6: Weak entities get composite keys. A weak entity depends on a strong entity for identity. Its PK is the strong entity's PK plus a discriminating attribute. Example: order_items(order_id, line_number).

Step 7: Handle generalization / inheritance. Choose one of three strategies (section 6 of the ERD page discusses these in SQL; here we name the patterns):

Single Table (table per hierarchy): one table for the supertype with all subtype columns plus a type discriminator. Cheap, but lots of nullable columns.
Class Table (table per subtype plus a supertype table): a base table for shared columns, separate tables for each subtype. Clean normalization, more joins.
Concrete Table (table per concrete class): no supertype table; each subtype is a standalone table duplicating shared columns. Simple queries per subtype, painful to query across types.

Step 8: Decide what to do with derived attributes. Options:

Drop them from the logical model and compute at query time. Preferred when the derivation is cheap (full_name = first_name || ' ' || last_name) or the source values change often.
Keep them as regular columns and update via application/trigger/stored procedure. Use when the computation is expensive and reads dominate writes.
Defer to physical: implement as a computed/generated column or materialized view at the physical level.

5. Notation

Several notations exist. All of them express the same information — tables, columns, keys, and cardinality — with different graphical conventions.

Crow's Foot (Information Engineering) — the dominant notation. Tables are boxes with a header row for the name and one row per column. PKs are usually shown at the top or underlined; FKs are marked FK. Cardinality is shown by the line endings: a single bar means "one", a fork (crow's foot) means "many", and a circle means "optional" (zero).

  one mandatory ─────────||────
  one optional  ─────────|o────
  many mandatory ────────|{────
  many optional  ────────o{────

Example: one customer has many orders (customer is mandatory, order is optional)

┌──────────────┐                           ┌──────────────┐
│  customers   │||─────────────────────o{ │   orders     │
└──────────────┘  "one customer has      └──────────────┘
                   zero or more orders"

IDEF1X. A government-style notation used in defense, aerospace, and some enterprise data-modeling tools (ERwin). Distinguishes identifying relationships (child's PK includes parent's PK, drawn with a solid line and rounded-corner child box) from non-identifying relationships (dashed line, square-corner box). More precise than Crow's Foot but less common outside formal-methods shops.

UML Data Modeling Profile. Borrows from UML class diagrams: tables as classes, columns as attributes with stereotypes (<<PK>>, <<FK>>), relationships as associations with multiplicity (0..*, 1). Common when the database is one artifact in a broader UML model.

Which to use. Crow's Foot for almost everything. IDEF1X if the organization mandates it or if you are using ERwin. UML if the rest of the design is UML. The information content is identical; the choice is cosmetic and organizational.

6. Worked Example: E-Commerce

Conceptual ERD. Customers place Orders. Each Order contains line items referring to Products. Products belong to one or more Categories. "Product belongs to Category" is many-to-many.

┌──────────┐           ┌───────┐           ┌──────────┐
│ Customer │──places──<│ Order │>──contains─│ Product  │>──classified─< Category
└──────────┘           └───────┘           └──────────┘

Logical Data Model. Applying the transformation rules:

customers, orders, products, categories — one table per entity.
order_items — the line-item concept; a weak entity that depends on orders. Composite PK is (order_id, line_number).
product_categories — junction table resolving the M:N between products and categories. Composite PK is (product_id, category_id).

ASCII schema diagram:

┌────────────────────────┐
│       customers        │
├────────────────────────┤
│ customer_id     PK     │◄──────────┐
│ email   UNIQUE  NN     │           │
│ first_name      NN     │           │
│ last_name       NN     │           │
│ created_at             │           │
└────────────────────────┘           │
                                     │ 1
                                     │
                                     │ N
┌────────────────────────┐           │
│        orders          │           │
├────────────────────────┤           │
│ order_id        PK     │◄──┐       │
│ customer_id FK  NN  ───┼───┼───────┘
│ order_date      NN     │   │
│ status          NN     │   │
│ total_amount    NN     │   │
└────────────────────────┘   │ 1
                             │
                             │ N
┌────────────────────────┐   │       ┌────────────────────────┐
│     order_items        │   │       │       products         │
├────────────────────────┤   │       ├────────────────────────┤
│ order_id    PK,FK  ────┼───┘       │ product_id      PK     │◄──┐
│ line_number PK         │           │ sku   UNIQUE    NN     │   │
│ product_id  FK     NN ─┼──────────►│ name            NN     │   │
│ quantity           NN  │           │ unit_price      NN     │   │
│ unit_price         NN  │           │ description            │   │
└────────────────────────┘           └────────────────────────┘   │
                                                                  │ N
                                                                  │
                                     ┌────────────────────────┐   │
                                     │   product_categories   │   │
                                     ├────────────────────────┤   │
                                     │ product_id  PK,FK  ────┼───┘
                                     │ category_id PK,FK  ────┼───┐
                                     └────────────────────────┘   │
                                                                  │
                                                                  │ N
                                                                  │
                                     ┌────────────────────────┐   │
                                     │      categories        │   │
                                     ├────────────────────────┤   │
                                     │ category_id     PK     │◄──┘
                                     │ name   UNIQUE   NN     │
                                     │ parent_category_id FK  │
                                     └────────────────────────┘

Legend:  PK = primary key     FK = foreign key
         NN = NOT NULL        UNIQUE = unique constraint
         ◄── / ──► = foreign-key reference direction

Equivalent SQL DDL. Generic SQL, no vendor extensions:

CREATE TABLE customers (
    customer_id    INTEGER      PRIMARY KEY,
    email          VARCHAR(255) NOT NULL UNIQUE,
    first_name     VARCHAR(100) NOT NULL,
    last_name      VARCHAR(100) NOT NULL,
    created_at     TIMESTAMP    NOT NULL
);

CREATE TABLE categories (
    category_id         INTEGER      PRIMARY KEY,
    name                VARCHAR(100) NOT NULL UNIQUE,
    parent_category_id  INTEGER,
    FOREIGN KEY (parent_category_id) REFERENCES categories (category_id)
);

CREATE TABLE products (
    product_id   INTEGER       PRIMARY KEY,
    sku          VARCHAR(50)   NOT NULL UNIQUE,
    name         VARCHAR(200)  NOT NULL,
    description  TEXT,
    unit_price   DECIMAL(12,2) NOT NULL,
    CHECK (unit_price >= 0)
);

CREATE TABLE product_categories (
    product_id   INTEGER NOT NULL,
    category_id  INTEGER NOT NULL,
    PRIMARY KEY (product_id, category_id),
    FOREIGN KEY (product_id)  REFERENCES products   (product_id)  ON DELETE CASCADE,
    FOREIGN KEY (category_id) REFERENCES categories (category_id) ON DELETE CASCADE
);

CREATE TABLE orders (
    order_id     INTEGER       PRIMARY KEY,
    customer_id  INTEGER       NOT NULL,
    order_date   TIMESTAMP     NOT NULL,
    status       VARCHAR(20)   NOT NULL,
    total_amount DECIMAL(12,2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id),
    CHECK (status IN ('pending', 'paid', 'shipped', 'delivered', 'cancelled')),
    CHECK (total_amount >= 0)
);

CREATE TABLE order_items (
    order_id    INTEGER       NOT NULL,
    line_number INTEGER       NOT NULL,
    product_id  INTEGER       NOT NULL,
    quantity    INTEGER       NOT NULL,
    unit_price  DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (order_id, line_number),
    FOREIGN KEY (order_id)   REFERENCES orders   (order_id) ON DELETE CASCADE,
    FOREIGN KEY (product_id) REFERENCES products (product_id),
    CHECK (quantity > 0),
    CHECK (unit_price >= 0)
);

Commentary. Notice what this DDL does not contain: no CREATE INDEX statements beyond the implied PK/unique indexes, no partitioning, no vendor-specific types (SERIAL, IDENTITY, JSONB), no tablespace or filegroup declarations. Those decisions belong to the physical model. At the logical level we care only that the data is structurally correct, normalized, and enforces the business rules.

7. Worked Example: Library Management

Scenario. A library has members who borrow books. Each loan is recorded. Books have authors (a book can have several authors, an author can write many books — classic M:N). Each book is published by exactly one publisher.

Relationships:

Publisher → Book: 1:N (a publisher publishes many books; each book has one publisher).
Book ↔ Author: M:N — resolved with a book_authors junction table.
Member → Loan: 1:N (a member has many loans; each loan is for one member).
Book → Loan: 1:N (a book can be loaned many times over its life; a loan is for one book).

ASCII schema diagram:

┌────────────────────────┐          ┌────────────────────────┐
│      publishers        │          │       authors          │
├────────────────────────┤          ├────────────────────────┤
│ publisher_id   PK      │          │ author_id      PK      │
│ name    UNIQUE NN      │          │ first_name     NN      │
│ country                │          │ last_name      NN      │
└──────────┬─────────────┘          │ birth_year             │
           │ 1                      └──────────┬─────────────┘
           │                                   │ N
           │ N                                 │
           ▼                                   │
┌────────────────────────┐                     │
│         books          │                     │
├────────────────────────┤                     │
│ book_id        PK      │◄──┐                 │
│ isbn    UNIQUE NN      │   │                 │
│ title          NN      │   │                 │
│ publisher_id FK NN     │   │                 │
│ publication_year       │   │                 │
└──────────┬─────────────┘   │                 │
           │ 1               │                 │
           │                 │                 │
           │ N               │                 │
┌──────────▼─────────────┐   │   ┌─────────────▼──────────────┐
│        loans           │   │   │      book_authors          │
├────────────────────────┤   │   ├────────────────────────────┤
│ loan_id        PK      │   └──►│ book_id     PK, FK ◄───────│ (FK to books)
│ member_id   FK NN      │       │ author_id   PK, FK ◄───────│ (FK to authors)
│ book_id     FK NN      │       │ author_order                │
│ loan_date      NN      │       └────────────────────────────┘
│ due_date       NN      │
│ return_date            │
└──────────┬─────────────┘
           │ N
           │
           │ 1
           ▼
┌────────────────────────┐
│       members          │
├────────────────────────┤
│ member_id      PK      │
│ email   UNIQUE NN      │
│ first_name     NN      │
│ last_name      NN      │
│ joined_at      NN      │
└────────────────────────┘

Equivalent SQL DDL:

CREATE TABLE publishers (
    publisher_id INTEGER      PRIMARY KEY,
    name         VARCHAR(200) NOT NULL UNIQUE,
    country      VARCHAR(100)
);

CREATE TABLE authors (
    author_id   INTEGER      PRIMARY KEY,
    first_name  VARCHAR(100) NOT NULL,
    last_name   VARCHAR(100) NOT NULL,
    birth_year  INTEGER
);

CREATE TABLE books (
    book_id           INTEGER      PRIMARY KEY,
    isbn              VARCHAR(20)  NOT NULL UNIQUE,
    title             VARCHAR(300) NOT NULL,
    publisher_id      INTEGER      NOT NULL,
    publication_year  INTEGER,
    FOREIGN KEY (publisher_id) REFERENCES publishers (publisher_id)
);

CREATE TABLE book_authors (
    book_id      INTEGER NOT NULL,
    author_id    INTEGER NOT NULL,
    author_order INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (book_id, author_id),
    FOREIGN KEY (book_id)   REFERENCES books   (book_id)   ON DELETE CASCADE,
    FOREIGN KEY (author_id) REFERENCES authors (author_id) ON DELETE RESTRICT
);

CREATE TABLE members (
    member_id   INTEGER      PRIMARY KEY,
    email       VARCHAR(255) NOT NULL UNIQUE,
    first_name  VARCHAR(100) NOT NULL,
    last_name   VARCHAR(100) NOT NULL,
    joined_at   TIMESTAMP    NOT NULL
);

CREATE TABLE loans (
    loan_id     INTEGER   PRIMARY KEY,
    member_id   INTEGER   NOT NULL,
    book_id     INTEGER   NOT NULL,
    loan_date   DATE      NOT NULL,
    due_date    DATE      NOT NULL,
    return_date DATE,
    FOREIGN KEY (member_id) REFERENCES members (member_id),
    FOREIGN KEY (book_id)   REFERENCES books   (book_id),
    CHECK (due_date >= loan_date),
    CHECK (return_date IS NULL OR return_date >= loan_date)
);

Commentary on book_authors. The junction table resolves the M:N. Note the extra column author_order: when a book has multiple authors, the order they appear on the cover matters (Knuth is first author on The Art of Computer Programming). This is a classic reason to give the junction table its own columns — a pure junction table would only have the two FKs. The composite PK (book_id, author_id) prevents listing the same author twice on the same book.

8. Junction, Bridge, and Linking Tables

When the conceptual model has an M:N relationship, the logical model must break it into two 1:N relationships via a third table. This third table has several interchangeable names — junction table, bridge table, linking table, associative table, or cross-reference table. They all mean the same thing.

Why they exist. Relational databases cannot directly represent an M:N relationship. A single FK column can only point at one row; an M:N needs each row on each side to be able to point at many rows on the other. Introducing a third table where each row represents one pairing solves this: the junction table has 1:N relationships with each of the two parents.

Naming conventions. Two common patterns:

Concatenated parent names — products_categories, book_authors, student_courses. Unambiguous but sometimes awkward. Pick an order and stick with it (alphabetical, or parent-then-child).
Domain-meaningful name — enrollments instead of student_courses, order_items instead of orders_products, memberships instead of users_groups. Preferred when the relationship itself is a first-class domain concept with its own attributes (grade, line number, role). This is the case whenever the junction table has any column beyond the two FKs.

Primary key choice. Two options:

Composite PK of the two FKs. Semantically honest — it directly expresses "each pair appears at most once". No extra column. Referenced by foreign keys as a two-column FK. Preferred for pure junction tables.

CREATE TABLE product_categories (
    product_id   INTEGER NOT NULL,
    category_id  INTEGER NOT NULL,
    PRIMARY KEY (product_id, category_id),
    FOREIGN KEY (product_id)  REFERENCES products   (product_id),
    FOREIGN KEY (category_id) REFERENCES categories (category_id)
);

Surrogate single-column PK plus a unique constraint on the pair. Useful when other tables need to reference individual rows of the junction table, or when ORMs and tooling struggle with composite keys.

CREATE TABLE product_categories (
    product_category_id  INTEGER NOT NULL PRIMARY KEY,
    product_id           INTEGER NOT NULL,
    category_id          INTEGER NOT NULL,
    UNIQUE (product_id, category_id),
    FOREIGN KEY (product_id)  REFERENCES products   (product_id),
    FOREIGN KEY (category_id) REFERENCES categories (category_id)
);

Both are correct. The composite PK is more honest about the relationship's identity; the surrogate PK is more uniform across the schema. Teams usually pick one and apply it everywhere.

9. Keys in the Logical Model

The logical model makes several explicit decisions about keys that the conceptual ERD glosses over.

Natural keys. A natural key is an attribute (or set) that exists in the business domain and happens to be unique: an ISBN, an email address, a country ISO code, a social security number. Advantages: no extra column, joins can sometimes be elided. Disadvantages: they can change (a person's email changes, a country's ISO code is reassigned), they can be long (a full URL), they may not be known at insert time, and PII natural keys raise privacy and compliance concerns.

Surrogate keys. A surrogate key is a meaningless synthetic identifier — an auto-incremented integer, a UUID, a ULID. It has no business meaning, cannot change, and is always compact and available. Advantages: stable across all business changes, uniform datatype across the schema, safe to expose. Disadvantages: adds a column that does not appear in the business domain, and requires a unique constraint on the natural key to prevent duplicates.

Recommendation. Use surrogate keys as PKs, and add UNIQUE constraints on the natural keys that also uniquely identify rows. This gives you the stability of surrogates with the integrity of naturals.

CREATE TABLE books (
    book_id  INTEGER     PRIMARY KEY,      -- surrogate
    isbn     VARCHAR(20) NOT NULL UNIQUE,  -- natural; also enforced unique
    title    VARCHAR(300) NOT NULL
);

UUID vs auto-increment. Auto-increment (sequence, SERIAL, IDENTITY) gives compact 4- or 8-byte keys, but requires a round-trip to the database to allocate and leaks ordering and rate information. UUIDs (16 bytes) can be generated client-side, do not leak information, and are safe to merge across systems, but bloat indexes and cause B-tree fragmentation unless you use a time-ordered variant (UUIDv7, ULID). Choose UUIDs when the system is distributed or IDs are exposed in URLs; choose auto-increment for internal, single-writer systems.

Composite keys. When no single attribute uniquely identifies a row but a combination does, use a composite key. Classic examples: junction tables ((product_id, category_id)), weak-entity children ((order_id, line_number)), effective-dated dimensions ((customer_id, effective_date)). A composite PK can still be surrogate-style (INTEGER columns) — "composite" refers to the number of columns, not their meaning.

Candidate and alternate keys. A candidate key is any minimal set of columns that uniquely identifies a row. One candidate key is chosen as the PK; the others are alternate keys and should be declared with UNIQUE constraints. Documenting them in the logical model ensures the physical model enforces them.

CREATE TABLE employees (
    employee_id        INTEGER     PRIMARY KEY,   -- chosen PK
    ssn                VARCHAR(11) NOT NULL UNIQUE, -- alternate key
    employee_badge_num VARCHAR(10) NOT NULL UNIQUE, -- alternate key
    email              VARCHAR(255) NOT NULL UNIQUE -- alternate key
);

10. Normalization Considerations

A well-formed logical model is typically normalized to Third Normal Form (3NF) or Boyce-Codd Normal Form (BCNF). Normalization eliminates three classes of anomalies:

Insertion anomalies — you cannot record a fact because another unrelated fact is missing.
Update anomalies — the same fact is stored in many rows and must be updated in all of them.
Deletion anomalies — deleting one fact inadvertently deletes another.

The short definitions, in order of increasing strictness:

1NF — every column holds a single atomic value; no repeating groups.
2NF — 1NF plus no non-key attribute depends on only part of a composite PK.
3NF — 2NF plus no non-key attribute depends on another non-key attribute (no transitive dependencies).
BCNF — every determinant is a candidate key. In practice, 3NF and BCNF coincide almost always.

Normalization is important enough to deserve its own page and is not treated exhaustively here. The point for the logical model is: it should be normalized at this stage. Any denormalization (copying customer_name onto orders to avoid a join, materializing order_total rather than summing line items) is a physical decision justified by measured performance, not a logical-modeling choice.

11. Generic Datatypes at the Logical Level

The logical model uses datatype categories, not vendor-specific products. The categories are standard across SQL dialects:

Category       Examples of vendor mappings
──────────     ────────────────────────────────────────────────────
integer        INTEGER, INT, BIGINT, SMALLINT
decimal        DECIMAL(p,s), NUMERIC(p,s)
float          REAL, DOUBLE PRECISION, FLOAT
varchar        VARCHAR(n), CHARACTER VARYING(n)
char           CHAR(n), CHARACTER(n)
text           TEXT, CLOB, NVARCHAR(MAX)
boolean        BOOLEAN, BIT, TINYINT(1)
date           DATE
time           TIME
timestamp      TIMESTAMP, DATETIME, TIMESTAMPTZ, DATETIME2
binary         BLOB, BYTEA, VARBINARY(MAX)
uuid           UUID, UNIQUEIDENTIFIER, CHAR(36)

At the logical level, write: unit_price DECIMAL(12,2), email VARCHAR(255), created_at TIMESTAMP, is_active BOOLEAN.

At the logical level, do not write: unit_price NUMBER(12,2) (Oracle-specific), body NVARCHAR(MAX) (SQL Server-specific), metadata JSONB (PostgreSQL-specific), id BIGSERIAL (PostgreSQL-specific).

The datatype categories carry the design intent: a monetary amount needs exact decimal arithmetic, an email fits in a bounded variable-length string, a timestamp needs date-and-time precision. Mapping those categories to vendor types is the job of the physical model.

Length and precision still matter at the logical level. VARCHAR(50) versus VARCHAR(500) is a business rule about how long a name can be, not a storage choice — so it does belong in the logical model. What you leave to physical is the vendor name of the type, not its size.

12. From Logical to Physical — What Gets Added

The step from LDM to physical schema is where vendor-specific and performance-driven decisions come in. The following move from "absent in the LDM" to "present in the physical schema":

Indexing. Beyond the implicit PK and UNIQUE indexes, the physical schema adds secondary indexes based on actual query patterns: composite indexes on frequent filter combinations, covering indexes for hot queries, partial indexes on filtered subsets, expression indexes on computed values.

-- Physical only:
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date DESC);
CREATE INDEX idx_products_name_trgm   ON products USING gin (name gin_trgm_ops);
CREATE INDEX idx_active_users         ON users (email) WHERE is_active = TRUE;

Partitioning. Splitting a large table across multiple physical segments for manageability and query performance — range partitioning on time, hash partitioning on tenant ID, list partitioning on region. Pure physical concern.

-- Physical only:
CREATE TABLE events (
    event_id   BIGINT,
    event_time TIMESTAMPTZ NOT NULL,
    payload    JSONB
) PARTITION BY RANGE (event_time);

Denormalization for performance. Copying customer_name onto orders, pre-computing order_total, maintaining materialized aggregate tables. Each denormalization trades write cost and consistency risk for read speed — a decision based on measured production workload, not design aesthetics.

Vendor datatype selection. TIMESTAMP becomes TIMESTAMPTZ (PostgreSQL) or DATETIME2(7) (SQL Server) or TIMESTAMP WITH TIME ZONE (Oracle). VARCHAR(255) becomes VARCHAR2(255 CHAR) on Oracle. TEXT becomes NVARCHAR(MAX) on SQL Server. INTEGER PRIMARY KEY becomes BIGSERIAL or BIGINT IDENTITY(1,1).

Storage parameters and physical layout. Tablespaces, filegroups, fillfactor, compression, CLUSTER-ed tables, columnstore vs rowstore, LOB storage options. Entirely physical.

Additional physical artifacts:

Views and materialized views
Triggers and stored procedures
Sequences and identity strategies
Collations and character sets
Security grants and row-level security policies
Replication and backup configuration

A healthy workflow keeps the logical model as a living artifact — the canonical source for what the database holds — and lets the physical schema evolve separately as performance and platform requirements change.

13. Tools

ERwin Data Modeler. The traditional enterprise data-modeling tool. Strong in regulated industries (finance, defense, healthcare). Supports IDEF1X and Crow's Foot; can forward-engineer to every major RDBMS and reverse-engineer from existing schemas. Commercial, seat-licensed.

SAP PowerDesigner. Enterprise competitor to ERwin. Supports conceptual, logical, and physical models with explicit transformations between them, plus UML, business-process, and information-architecture modeling. Commercial.

dbdiagram.io. Text-first web tool using its own compact DSL (DBML). Produces clean Crow's Foot diagrams quickly, exports SQL, and versions cleanly in Git because the source is text. Free tier available; paid for teams.

Table customers {
  customer_id integer [pk]
  email       varchar [unique, not null]
  first_name  varchar [not null]
  last_name   varchar [not null]
  created_at  timestamp
}

Table orders {
  order_id    integer [pk]
  customer_id integer [not null, ref: > customers.customer_id]
  order_date  timestamp [not null]
  status      varchar [not null]
}

DBeaver. Open-source universal database client. Its ER Diagram view auto-generates Crow's Foot diagrams from a live connection — useful for reverse-engineering an existing database into a logical view. Free community edition; paid Pro edition.

Lucidchart / draw.io (diagrams.net). General-purpose diagramming tools with entity-relationship shape libraries. Good for quick, collaborative, slide-deck-quality diagrams. Weak on forward- or reverse-engineering SQL.

Mermaid erDiagram. Text-based diagrams that render inline in Markdown, GitHub, GitLab, Notion, and Obsidian. Ideal for keeping the logical model in the same repo as the code. Limited styling, no forward-engineering, but unbeatable for living documentation.

```mermaid
erDiagram
    CUSTOMERS ||--o{ ORDERS : places
    ORDERS    ||--|{ ORDER_ITEMS : contains
    PRODUCTS  ||--o{ ORDER_ITEMS : "appears in"
    CUSTOMERS {
        int customer_id PK
        string email UNIQUE
        string first_name
        string last_name
    }
    ORDERS {
        int order_id PK
        int customer_id FK
        timestamp order_date
        string status
    }
```

SchemaSpy. Reverse-engineers a live database into browsable HTML documentation with diagrams. Useful for existing systems that were never formally modeled.

Liquibase / Flyway. Not modeling tools per se, but schema-migration tools that treat the DDL itself as the versioned source of truth. In pipelines that use these tools, the logical model often exists only implicitly in the latest migration set — an approach that works well for small teams and suffers in larger ones where up-front design is valuable.

14. Best Practices

Consistent naming. Pick a convention and apply it everywhere:

Case: snake_case for tables and columns. Avoid camelCase or PascalCase — SQL identifiers are case-folded on most platforms and mixed case causes grief.
Singular vs plural: both are defensible. Plural (customers, orders) reads naturally in joins ("select from customers") and aligns with collection semantics in ORMs. Singular (customer, order) matches the ERD entity and avoids reserved-word collisions (order is reserved in SQL, so plural orders sidesteps the problem — a common reason pluralists win). Pick one, write it in the team style guide, never mix.
No abbreviations: customer_address, not cust_addr. Disk is cheap; cognitive load is not.
No Hungarian notation: tbl_customers, int_customer_id — don't.

Foreign-key naming. The FK column should be named after the referenced PK with the referenced table's singular name as a prefix: customers.customer_id ← orders.customer_id. When a table needs multiple FKs to the same parent, use role-based prefixes: shipping_address_id, billing_address_id, both pointing at addresses.address_id. The FK constraint itself should have an explicit, predictable name: fk_orders_customer_id, fk_order_items_product_id. Auto-generated constraint names make error messages and migration scripts hard to read.

Audit columns. Almost every table benefits from a standard set of audit columns:

created_at  TIMESTAMP   NOT NULL,
created_by  VARCHAR(100),
updated_at  TIMESTAMP   NOT NULL,
updated_by  VARCHAR(100)

Define them once in the logical model and apply consistently. In the physical model, populate updated_at via a trigger or an ON UPDATE default, not by trusting the application.

Soft deletes vs hard deletes. A hard delete removes the row from the table. A soft delete flips a deleted_at TIMESTAMP (or is_deleted BOOLEAN) column and keeps the row. Trade-offs:

Hard deletes are simple, respect referential integrity and storage, and comply with right-to-be-forgotten regulations. Every row in the table is live data.
Soft deletes preserve history, support "undo", and allow references from other tables to remain valid. Every query must remember to filter WHERE deleted_at IS NULL, unique constraints get tricky (a user's email must be unique among active users, not all users), and compliance with deletion requests becomes a procedural problem.

Decide early and apply uniformly. Mixing soft and hard deletes across a schema leads to subtle bugs where a soft-deleted parent still has "live" children.

Use NOT NULL aggressively. Nullable columns are a contagion — every query that touches them must handle three-valued logic (NULL = NULL is UNKNOWN). Make NOT NULL the default and add nullability only for columns where "unknown" or "not applicable" is a real state. For genuinely optional columns, consider whether a separate table might be cleaner (especially for large optional text like biography).

Always declare referential actions. ON DELETE CASCADE, ON DELETE RESTRICT, ON DELETE SET NULL encode business rules about what happens to children when a parent disappears. Leaving the default (NO ACTION on most platforms) is a decision too — document it.

Reserve the junction-table name for actual junctions. If a table has any column beyond the two FKs — a quantity, a grade, a role, a timestamp — it is a first-class entity, not a junction. Name it accordingly (enrollments, memberships, order_items), and treat it with the same discipline as any other table.

Validate the logical model against queries. Before signing off on an LDM, walk through the expected queries. Can each one be answered? Are the required joins reasonable? If a common query needs a five-way join through four junction tables, the model may be over-normalized or missing an entity. If a common query needs a column that does not exist, the model is incomplete.

Keep the logical model under version control. Whether it is a .erwin file, a schema.dbml, a model.puml, or a collection of Mermaid blocks in Markdown, the logical model belongs in Git next to the code. Review changes to it in pull requests. Treat it as documentation that is required to stay in sync with the physical schema, not a one-time deliverable from a pre-project design phase.

Conclusion

The Logical Data Model is the unsung workhorse of database design. The conceptual ERD is glamorous and the physical schema is what ends up in production, but the LDM is where the business requirements meet the relational rules and produce something concrete and implementable. Done well, it yields a normalized, DBMS-independent specification that any competent DBA can implement on any platform; done poorly, the errors propagate into indexes, queries, application code, and eventually production incidents.

Spend the time at the logical layer. Name things consistently. Decide keys deliberately. Resolve M:N relationships explicitly. Enforce NOT NULL, UNIQUE, and referential integrity because the data demands it, not because the platform makes it easy. The rest — indexes, partitions, vendor datatypes, denormalization for performance — flows from a sound logical model.