
#JourneyToCTA Diaries Part 7 - Normalization vs De-Normalization

  • Shreyas Dhond
  • 1 day ago
  • 7 min read

In the previous blog, we discussed the design strategy for architecting a data model that aligns with business needs, leverages Salesforce-native capabilities, and remains scalable as the system grows.


A key part of this strategy is first aligning business entities conceptually and translating them into a logical blueprint. Only after the logical model is clearly defined should those entities be mapped to Salesforce constructs such as core data models for Sales and Service, supporting data models, and where necessary, custom objects. This approach ensures that the platform implementation reflects the business domain rather than forcing business processes to fit predefined technical structures.


In this blog, we will focus on an important aspect of logical data modelling: normalization versus denormalization. We will explore the role of the three normal forms in data model design, the benefits they provide in maintaining data integrity and reducing redundancy, and how these principles support different architectural use cases.


As part of this discussion, we will also examine the common relationship types available in Salesforce and how normalized and denormalized patterns are represented within the platform’s data model. Understanding how these concepts translate into Salesforce relationships is essential for designing solutions that balance data integrity, performance, and scalability.


Normalization


In simple terms, normalization is the process of organizing data efficiently based on its logical relationships. The primary goal of normalization is to reduce data redundancy within a data model. Redundant data not only consumes unnecessary storage but can also negatively impact system performance. It increases the amount of data that must be processed during create, read, update, and delete (CRUD) operations and can slow down queries as datasets grow.


Normalization is structured around a set of principles known as normal forms. For Salesforce architects, the three most relevant forms to understand in practice are First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). Each form introduces rules that progressively improve data integrity and reduce duplication.


First Normal Form (1NF)


A database is considered to be in First Normal Form when all fields contain atomic values, meaning the data stored in each field cannot be logically divided into smaller meaningful components.


A common example is an address. If an address is stored as a single string such as “123 Main St, New York, NY, 321212, USA” the value is considered non-atomic, because it combines multiple elements into a single field. In contrast, if the address is broken down into separate components—street, city, state, postal code, and country—each field contains a single atomic value and therefore satisfies the 1NF requirement.

| Customer ID | Address |
| --- | --- |
| 1 | 902 Marquette Avenue, Minneapolis, MN, 123232, USA |
| 2 | 123 Main St, New York, NY, 321212, USA |
| 3 | 415 Mission Street, 3rd Floor, San Francisco, California 94105, United States |
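To make the distinction concrete, here is a minimal Python sketch (the field names and sample data are illustrative, not from any Salesforce API) contrasting a non-atomic address string with its atomic, 1NF-friendly representation:

```python
# A non-atomic address: several logical components packed into one field.
non_atomic = {"customer_id": 2, "address": "123 Main St, New York, NY, 321212, USA"}

# The same data in 1NF: each field holds a single atomic value.
atomic = {
    "customer_id": 2,
    "street": "123 Main St",
    "city": "New York",
    "state": "NY",
    "postal_code": "321212",
    "country": "USA",
}

# With atomic fields, reading a single component is trivial...
print(atomic["state"])  # NY

# ...whereas the non-atomic form forces fragile string parsing that breaks
# as soon as an address deviates from the expected comma layout.
street, city, state, postal_code, country = (
    part.strip() for part in non_atomic["address"].split(",")
)
print(state)  # NY
```

The parsing step works only because this particular string happens to have exactly five comma-separated parts, which is precisely the fragility that atomic fields avoid.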

Another rule of 1NF is that tables should not contain repeating groups of similar data. Each column should represent a single attribute rather than multiple occurrences of the same type of information.


A simple Salesforce example can be seen with address fields on the Account object. The object contains both Billing Address and Shipping Address, which are essentially repeated structures representing the same type of information. Because of these repeating groups, the Account object would not strictly satisfy 1NF under traditional database normalization rules.

| Account ID | Account Name | Billing Street | Billing City | Billing State | Billing Country | Shipping Street | Shipping City | Shipping State | Shipping Country |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | 123 Main St | New York | NY | USA | 456 Market St | Boston | MA | USA |
| 001A0000002XYZ | Global Telecom | 78 King St | London | London | UK | 78 King St | London | London | UK |
| 001A0000003XYZ | Northern Energy | 900 Bay St | Toronto | ON | Canada | 25 Industrial Rd | Calgary | AB | Canada |

To fully normalize this design, address information would be separated into a dedicated Address entity with its own unique identifier. The Account record would then reference the appropriate address records using lookup relationships. This approach eliminates duplication and allows a single address record to be reused across multiple entities when appropriate.

| Account ID | Account Name | Billing Address ID | Shipping Address ID |
| --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | ADD001 | ADD002 |
| 001A0000002XYZ | Global Telecom | ADD003 | ADD004 |
| 001A0000003XYZ | Northern Energy | ADD005 | ADD006 |

| Address ID | Address Type | Street | City | State | Country |
| --- | --- | --- | --- | --- | --- |
| ADD001 | Billing | 123 Main St | New York | NY | USA |
| ADD002 | Shipping | 456 Market St | Boston | MA | USA |
| ADD003 | Billing | 78 King St | London | London | UK |
| ADD004 | Shipping | 78 King St | London | London | UK |
| ADD005 | Billing | 900 Bay St | Toronto | ON | Canada |
| ADD006 | Shipping | 25 Industrial Rd | Calgary | AB | Canada |
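Under this normalized design, reading an account's address means resolving the lookup IDs against the Address table, much as a relationship query would on-platform. A minimal sketch (plain Python dicts and illustrative IDs standing in for objects):

```python
# Normalized tables: accounts reference shared address records by ID,
# so each address is stored exactly once.
accounts = [
    {"id": "001A0000001XYZ", "name": "Acme Corp",
     "billing_address_id": "ADD001", "shipping_address_id": "ADD002"},
]
addresses = {
    "ADD001": {"type": "Billing", "street": "123 Main St", "city": "New York",
               "state": "NY", "country": "USA"},
    "ADD002": {"type": "Shipping", "street": "456 Market St", "city": "Boston",
               "state": "MA", "country": "USA"},
}

# Resolve the lookups: the "join" is a dictionary access on the address ID.
for account in accounts:
    billing = addresses[account["billing_address_id"]]
    shipping = addresses[account["shipping_address_id"]]
    print(account["name"], billing["city"], shipping["city"])
    # Acme Corp New York Boston
```

The cost of the extra lookup at read time is what buys the single point of update: changing an address record changes it for every account that references it.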

While strict normalization like this is not always implemented in Salesforce due to usability and performance considerations, understanding these principles helps architects make informed decisions when designing scalable data models.


Second Normal Form (2NF)


A database is considered to be in Second Normal Form (2NF) when it already satisfies the requirements of First Normal Form (1NF) and all non-key (non-prime) attributes are fully functionally dependent on the primary key.


In simpler terms, this means that every non-key field in a table must depend entirely on the table’s primary key and not on part of the key or on another attribute.


Continuing with the previous Account and Address example, the Account object includes columns such as Billing Address and Shipping Address. In this structure, the type of address (Billing or Shipping) is implicitly embedded in the column name itself. At the same time, the address entity may also capture the address type as a field in the Address table.


This creates a form of redundancy because the address type becomes indirectly tied to the Account record through the column definition, rather than being determined solely by the Address record itself.


To bring the design into Second Normal Form, the schema needs to be adjusted so that address type is stored only within the Address table and not embedded in the Account table structure. The Account table would simply reference address records, while the Address table would include an Address Type field (for example, Billing, Shipping, or Mailing).

| Account ID | Account Name |
| --- | --- |
| 001A0000001XYZ | Acme Corp |
| 001A0000002XYZ | Global Telecom |
| 001A0000003XYZ | Northern Energy |

| Address ID | Account ID | Address Type | Street | City | State | Country |
| --- | --- | --- | --- | --- | --- | --- |
| ADD001 | 001A0000001XYZ | Billing | 123 Main St | New York | NY | USA |
| ADD002 | 001A0000001XYZ | Shipping | 456 Market St | Boston | MA | USA |
| ADD003 | 001A0000002XYZ | Billing | 78 King St | London | London | UK |
| ADD004 | 001A0000002XYZ | Shipping | 78 King St | London | London | UK |
| ADD005 | 001A0000003XYZ | Billing | 900 Bay St | Toronto | ON | Canada |
| ADD006 | 001A0000003XYZ | Shipping | 25 Industrial Rd | Calgary | AB | Canada |

With this design, the address type becomes dependent only on the Address record, eliminating the partial dependency and ensuring the model satisfies 2NF.
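Because the type now lives on the Address row itself, retrieving a given account's billing address means filtering the Address table rather than reading a hard-coded column. A hedged sketch (illustrative data; on-platform this would be a relationship query):

```python
# Address type is an attribute of the Address row, not of the schema.
addresses = [
    {"id": "ADD001", "account_id": "001A0000001XYZ", "type": "Billing", "city": "New York"},
    {"id": "ADD002", "account_id": "001A0000001XYZ", "type": "Shipping", "city": "Boston"},
    {"id": "ADD003", "account_id": "001A0000002XYZ", "type": "Billing", "city": "London"},
]

def address_for(account_id, address_type):
    """Return the first address of the given type for an account, or None.

    Supporting a new type (e.g. 'Mailing') requires no schema change on
    Account; it is just a new value in the Address Type field.
    """
    return next(
        (a for a in addresses
         if a["account_id"] == account_id and a["type"] == address_type),
        None,
    )

print(address_for("001A0000001XYZ", "Shipping")["city"])  # Boston
```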


Third Normal Form (3NF)


A database is considered to be in Third Normal Form when it already satisfies 2NF and its tables contain no transitively dependent attributes, meaning no non-key attribute depends on another non-key attribute rather than directly on the primary key. In the example above, if the Account table stored attributes such as Billing State that are already captured in the Address table, the state would be determined by the related address rather than by the Account's primary key, putting the design in violation of 3NF.

| Account ID | Account Name | Billing State | Shipping State |
| --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | NY | MA |
| 001A0000002XYZ | Global Telecom | London | London |
| 001A0000003XYZ | Northern Energy | ON | AB |
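The transitive dependency above can be removed by deriving the state from the related Address record at read time instead of copying it onto Account. A minimal sketch (illustrative data; on-platform a formula field or relationship query would typically play this role):

```python
addresses = {
    "ADD001": {"type": "Billing", "state": "NY"},
    "ADD002": {"type": "Shipping", "state": "MA"},
}
# 3NF: Account does NOT store Billing State as its own column.
account = {"id": "001A0000001XYZ", "name": "Acme Corp",
           "billing_address_id": "ADD001"}

def billing_state(account):
    """Derive the state through the relationship; it depends on the
    address, not on the Account's primary key, so it is never duplicated."""
    return addresses[account["billing_address_id"]]["state"]

print(billing_state(account))  # NY
```

If the address record's state later changes, the derived value changes with it, with no second copy on Account to fall out of sync.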


Denormalization


Denormalization can be considered the opposite of normalization. While normalization focuses on reducing redundancy and organizing data into well-structured relational tables, denormalization intentionally introduces controlled redundancy into the data model.


The primary goal of denormalization is to improve performance and simplify data access, particularly for read-heavy operations such as reporting, analytics, and complex queries. By storing related data together, the system can reduce the need for joins or multi-object queries, which can improve query performance and reduce processing overhead during data retrieval. The table below is a denormalized version of the Account table above: it makes analytics easy to perform, but it is hard to keep updated when address changes occur in a separate Address table.

| Account ID | Account Name | Billing Street | Billing City | Billing State | Billing Postal Code | Billing Country | Shipping Street | Shipping City | Shipping State | Shipping Postal Code | Shipping Country |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | 123 Main St | New York | NY | 10001 | USA | 456 Market St | Boston | MA | 02108 | USA |
| 001A0000002XYZ | Global Telecom | 78 King St | London | London | EC1A 1BB | UK | 78 King St | London | London | EC1A 1BB | UK |
| 001A0000003XYZ | Northern Energy | 900 Bay St | Toronto | ON | M5S 2B1 | Canada | 25 Industrial Rd | Calgary | AB | T2P 1J9 | Canada |
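A denormalized read model like this is typically produced by flattening the normalized tables once and then serving reads from the flat copy. A minimal sketch (dict-based tables and illustrative IDs, not a Salesforce API):

```python
# Normalized source tables.
accounts = [
    {"id": "001A0000001XYZ", "name": "Acme Corp",
     "billing_address_id": "ADD001", "shipping_address_id": "ADD002"},
]
addresses = {
    "ADD001": {"street": "123 Main St", "city": "New York", "state": "NY",
               "postal_code": "10001", "country": "USA"},
    "ADD002": {"street": "456 Market St", "city": "Boston", "state": "MA",
               "postal_code": "02108", "country": "USA"},
}

def flatten(account):
    """Join Account to both of its addresses and emit one wide,
    read-optimized row with no lookups left to resolve."""
    row = {"account_id": account["id"], "account_name": account["name"]}
    for role, key in (("billing", "billing_address_id"),
                      ("shipping", "shipping_address_id")):
        for field, value in addresses[account[key]].items():
            row[f"{role}_{field}"] = value
    return row

denormalized = [flatten(a) for a in accounts]
print(denormalized[0]["billing_city"], denormalized[0]["shipping_city"])
# New York Boston
```

Reports can now scan a single flat structure, at the price of the flat rows going stale whenever the source Address table changes.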

However, denormalization introduces trade-offs. Because the same data may exist in multiple places, there is an additional responsibility to keep redundant data consistent and synchronized. Updates to a record may require changes across several objects or fields to maintain data integrity. This adds complexity to write operations and may require automation, triggers, or integration logic to ensure the replicated data remains aligned.
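That synchronization burden can be sketched as a small update handler: when a normalized address record changes, every denormalized copy must be refreshed. In Salesforce this would typically be a trigger or flow; the function below is only an illustration with hypothetical field names:

```python
# Denormalized rows keep their own copy of address fields, tagged with the
# address record they were copied from (illustrative structure).
denormalized_rows = [
    {"account_name": "Acme Corp", "billing_address_id": "ADD001",
     "billing_city": "New York", "billing_state": "NY"},
]

def on_address_update(address_id, new_fields):
    """Propagate an address change to every denormalized copy.

    This is the write-side cost of denormalization: one logical update
    fans out into N physical updates that must all be kept consistent.
    """
    for row in denormalized_rows:
        if row.get("billing_address_id") == address_id:
            for field, value in new_fields.items():
                row[f"billing_{field}"] = value

on_address_update("ADD001", {"city": "Brooklyn", "state": "NY"})
print(denormalized_rows[0]["billing_city"])  # Brooklyn
```

If any of the fan-out updates fails partway through, the copies disagree, which is why such logic usually needs transactional or retry safeguards.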


In practice, denormalization is often used selectively. Architects balance the benefits of faster reads and simplified reporting against the operational overhead required to maintain consistency. When applied thoughtfully, denormalization can significantly improve system performance for high-volume transactional systems and analytics-driven use cases.

Normalization vs Denormalization


Normalization

  • Organizes data based on logical relationships to reduce redundancy.

  • Data is separated into multiple related tables/objects.

  • Improves data integrity and consistency by ensuring each piece of data exists in only one place.

  • Changes to data are easier to manage because updates occur in a single location.

  • Requires joins or relationships to retrieve related data.

  • Based on normal forms (1NF, 2NF, 3NF) that progressively remove redundancy and enforce functional dependencies.

  • Well suited for transactional systems where data accuracy and consistency are critical.


Denormalization

  • Intentionally introduces controlled redundancy into the data model.

  • Stores related data within the same table/object to simplify data retrieval.

  • Improves read performance and query efficiency, especially for reporting and analytics.

  • Reduces the need for complex joins across objects.

  • Requires additional logic or automation to keep redundant data synchronized.

  • Increases storage usage and update complexity.

  • Often used in reporting, analytics, and high-read performance scenarios.


Architectural Perspective (Salesforce Context)

  • Normalization helps maintain clean data models and integrity.

  • Denormalization helps improve performance and reporting efficiency.

  • Effective Salesforce architecture typically requires a balance of both approaches depending on the use case.






Copyright © 2024 SFDCShred
