#JourneyToCTA Diaries Part 7 - Normalization vs De-Normalization
- Shreyas Dhond
In the previous blog, we discussed the design strategy for architecting a data model that aligns with business needs, leverages Salesforce-native capabilities, and remains scalable as the system grows.
A key part of this strategy is first aligning business entities conceptually and translating them into a logical blueprint. Only after the logical model is clearly defined should those entities be mapped to Salesforce constructs such as core data models for Sales and Service, supporting data models, and where necessary, custom objects. This approach ensures that the platform implementation reflects the business domain rather than forcing business processes to fit predefined technical structures.
In this blog, we will focus on an important aspect of logical data modelling: normalization versus denormalization. We will explore the role of the three normal forms in data model design, the benefits they provide in maintaining data integrity and reducing redundancy, and how these principles support different architectural use cases.
As part of this discussion, we will also examine the common relationship types available in Salesforce and how normalized and denormalized patterns are represented within the platform’s data model. Understanding how these concepts translate into Salesforce relationships is essential for designing solutions that balance data integrity, performance, and scalability.
Normalization
In simple terms, normalization is the process of organizing data efficiently based on its logical relationships. The primary goal of normalization is to reduce data redundancy within a data model. Redundant data not only consumes unnecessary storage but can also negatively impact system performance. It increases the amount of data that must be processed during create, read, update, and delete (CRUD) operations and can slow down queries as datasets grow.
Normalization is structured around a set of principles known as normal forms. For Salesforce architects, the three most relevant forms to understand in practice are First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). Each form introduces rules that progressively improve data integrity and reduce duplication.
First Normal Form (1NF)
A database is considered to be in First Normal Form when all fields contain atomic values, meaning the data stored in each field cannot be logically divided into smaller meaningful components.
A common example is an address. If an address is stored as a single string such as “123 Main St, New York, NY, 321212, USA” the value is considered non-atomic, because it combines multiple elements into a single field. In contrast, if the address is broken down into separate components—street, city, state, postal code, and country—each field contains a single atomic value and therefore satisfies the 1NF requirement.
| Customer ID | Address |
| --- | --- |
| 1 | 902 Marquette Avenue, Minneapolis, MN, 123232, USA |
| 2 | 123 Main St, New York, NY, 321212, USA |
| 3 | 415 Mission Street, 3rd Floor, San Francisco, California 94105, United States |
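The splitting step can be sketched in Python. Note that the fixed comma-separated field order is an assumption for illustration; real-world addresses (like the third row above, which has an extra floor component) generally need a dedicated address-parsing library.

```python
# A minimal sketch of bringing a non-atomic address field into 1NF by
# splitting the single string into separate, atomic components.
# The field order (street, city, state, postal code, country) is an
# assumption made for this example.

def split_address(raw: str) -> dict:
    """Split a comma-separated address string into atomic fields."""
    parts = [p.strip() for p in raw.split(",")]
    keys = ["street", "city", "state", "postal_code", "country"]
    return dict(zip(keys, parts))

print(split_address("123 Main St, New York, NY, 321212, USA"))
# {'street': '123 Main St', 'city': 'New York', 'state': 'NY',
#  'postal_code': '321212', 'country': 'USA'}
```

Once each component lives in its own field, validation, filtering, and reporting can target a single attribute (for example, all accounts in a given state) without string parsing.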
Another rule of 1NF is that tables should not contain repeating groups of similar data. Each column should represent a single attribute rather than multiple occurrences of the same type of information.
A simple Salesforce example can be seen with address fields on the Account object. The object contains both Billing Address and Shipping Address, which are essentially repeated structures representing the same type of information. Because of these repeating groups, the Account object would not strictly satisfy 1NF under traditional database normalization rules.
| Account ID | Account Name | Billing Street | Billing City | Billing State | Billing Country | Shipping Street | Shipping City | Shipping State | Shipping Country |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | 123 Main St | New York | NY | USA | 456 Market St | Boston | MA | USA |
| 001A0000002XYZ | Global Telecom | 78 King St | London | London | UK | 78 King St | London | London | UK |
| 001A0000003XYZ | Northern Energy | 900 Bay St | Toronto | ON | Canada | 25 Industrial Rd | Calgary | AB | Canada |
To fully normalize this design, address information would be separated into a dedicated Address entity with its own unique identifier. The Account record would then reference the appropriate address records using lookup relationships. This approach eliminates duplication and allows a single address record to be reused across multiple entities when appropriate.
| Account ID | Account Name | Billing Address ID | Shipping Address ID |
| --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | ADD001 | ADD002 |
| 001A0000002XYZ | Global Telecom | ADD003 | ADD004 |
| 001A0000003XYZ | Northern Energy | ADD005 | ADD006 |
| Address ID | Address Type | Street | City | State | Country |
| --- | --- | --- | --- | --- | --- |
| ADD001 | Billing | 123 Main St | New York | NY | USA |
| ADD002 | Shipping | 456 Market St | Boston | MA | USA |
| ADD003 | Billing | 78 King St | London | London | UK |
| ADD004 | Shipping | 78 King St | London | London | UK |
| ADD005 | Billing | 900 Bay St | Toronto | ON | Canada |
| ADD006 | Shipping | 25 Industrial Rd | Calgary | AB | Canada |
While strict normalization like this is not always implemented in Salesforce due to usability and performance considerations, understanding these principles helps architects make informed decisions when designing scalable data models.
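The 1NF refactoring above can be sketched in Python: the repeating Billing/Shipping groups on each Account row are extracted into a separate Address table, and the Account row keeps only the foreign-key references. Field names and the `ADDnnn` ID scheme are illustrative.

```python
# A sketch of normalizing the repeating Billing/Shipping address groups
# on Account into a dedicated Address table referenced by ID.

accounts = [
    {"id": "001A0000001XYZ", "name": "Acme Corp",
     "billing": ("123 Main St", "New York", "NY", "USA"),
     "shipping": ("456 Market St", "Boston", "MA", "USA")},
]

addresses = []            # normalized Address table
normalized_accounts = []  # Account rows holding only address IDs

for acct in accounts:
    row = {"id": acct["id"], "name": acct["name"]}
    for addr_type in ("billing", "shipping"):
        street, city, state, country = acct[addr_type]
        addr_id = f"ADD{len(addresses) + 1:03d}"  # e.g. ADD001, ADD002
        addresses.append({"id": addr_id, "type": addr_type.title(),
                          "street": street, "city": city,
                          "state": state, "country": country})
        row[f"{addr_type}_address_id"] = addr_id
    normalized_accounts.append(row)
```

After the loop, each address exists exactly once in `addresses`, and `normalized_accounts` mirrors the lookup-relationship design described above.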
Second Normal Form (2NF)
A database is considered to be in Second Normal Form (2NF) when it already satisfies the requirements of First Normal Form (1NF) and all non-key (non-prime) attributes are fully functionally dependent on the primary key.
In simpler terms, this means that every non-key field in a table must depend entirely on the table’s primary key and not on part of the key or on another attribute.
Continuing with the previous Account and Address example, the Account object includes columns such as Billing Address and Shipping Address. In this structure, the type of address (Billing or Shipping) is implicitly embedded in the column name itself. At the same time, the address entity may also capture the address type as a field in the Address table.
This creates a form of redundancy because the address type becomes indirectly tied to the Account record through the column definition, rather than being determined solely by the Address record itself.
To bring the design into Second Normal Form, the schema needs to be adjusted so that address type is stored only within the Address table and not embedded in the Account table structure. The Account table would simply reference address records, while the Address table would include an Address Type field (for example, Billing, Shipping, or Mailing).
| Account ID | Account Name |
| --- | --- |
| 001A0000001XYZ | Acme Corp |
| 001A0000002XYZ | Global Telecom |
| 001A0000003XYZ | Northern Energy |
| Address ID | Account ID | Address Type | Street | City | State | Country |
| --- | --- | --- | --- | --- | --- | --- |
| ADD001 | 001A0000001XYZ | Billing | 123 Main St | New York | NY | USA |
| ADD002 | 001A0000001XYZ | Shipping | 456 Market St | Boston | MA | USA |
| ADD003 | 001A0000002XYZ | Billing | 78 King St | London | London | UK |
| ADD004 | 001A0000002XYZ | Shipping | 78 King St | London | London | UK |
| ADD005 | 001A0000003XYZ | Billing | 900 Bay St | Toronto | ON | Canada |
| ADD006 | 001A0000003XYZ | Shipping | 25 Industrial Rd | Calgary | AB | Canada |
With this design, the address type becomes dependent only on the Address record, eliminating the partial dependency and ensuring the model satisfies 2NF.
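A small Python sketch of this 2NF design: the address type lives only on the Address row, and related addresses are retrieved by joining on the Account ID foreign key. The table contents mirror the example rows above; field names are illustrative.

```python
# Address rows keyed by a foreign key to Account, with the address type
# stored only on the Address row (the 2NF design described above).

addresses = [
    {"id": "ADD001", "account_id": "001A0000001XYZ", "type": "Billing",
     "street": "123 Main St", "city": "New York"},
    {"id": "ADD002", "account_id": "001A0000001XYZ", "type": "Shipping",
     "street": "456 Market St", "city": "Boston"},
]

def addresses_for(account_id: str, addr_type=None):
    """Return all Address rows for an account, optionally filtered by type."""
    return [a for a in addresses
            if a["account_id"] == account_id
            and (addr_type is None or a["type"] == addr_type)]

billing = addresses_for("001A0000001XYZ", "Billing")
```

In Salesforce terms, this corresponds to a child Address object with a lookup to Account and an Address Type picklist, queried with a relationship query rather than a Python filter.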
Third Normal Form (3NF)
A database is considered to be in Third Normal Form when it already satisfies 2NF and no non-key attribute depends transitively on the primary key, that is, through another non-key attribute. In the example above, if the Account table stored attributes such as state that are already captured in the Address table, those values would depend on the referenced address rather than directly on the Account ID, violating 3NF.
| Account ID | Account Name | Billing State | Shipping State |
| --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | NY | MA |
| 001A0000002XYZ | Global Telecom | London | London |
| 001A0000003XYZ | Northern Energy | ON | AB |
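In a 3NF design, a derived attribute like Billing State is not stored on Account at all; it is reached transitively through the Address row when needed. A minimal sketch with illustrative field names:

```python
# Reading the billing state through the Account -> Address relationship
# instead of duplicating it on Account (the 3NF design described above).

addresses = {"ADD001": {"state": "NY"}, "ADD003": {"state": "London"}}
accounts = {"001A0000001XYZ": {"billing_address_id": "ADD001"}}

def billing_state(account_id: str) -> str:
    """Follow the Account's address reference to read the state once."""
    addr_id = accounts[account_id]["billing_address_id"]
    return addresses[addr_id]["state"]

print(billing_state("001A0000001XYZ"))  # NY
```

If the address record changes, the state is correct everywhere automatically, because it was never copied onto the Account.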
Denormalization
Denormalization can be considered the opposite of normalization. While normalization focuses on reducing redundancy and organizing data into well-structured relational tables, denormalization intentionally introduces controlled redundancy into the data model.
The primary goal of denormalization is to improve performance and simplify data access, particularly for read-heavy operations such as reporting, analytics, and complex queries. By storing related data together, the system can reduce the need for joins or multi-object queries, which can improve query performance and reduce processing overhead during data retrieval. The table below is a denormalized version of the Account data: convenient for analytics, but hard to keep accurate if address changes occur in a separate Address table.
| Account ID | Account Name | Billing Street | Billing City | Billing State | Billing Postal Code | Billing Country | Shipping Street | Shipping City | Shipping State | Shipping Postal Code | Shipping Country |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 001A0000001XYZ | Acme Corp | 123 Main St | New York | NY | 10001 | USA | 456 Market St | Boston | MA | 02108 | USA |
| 001A0000002XYZ | Global Telecom | 78 King St | London | London | EC1A 1BB | UK | 78 King St | London | London | EC1A 1BB | UK |
| 001A0000003XYZ | Northern Energy | 900 Bay St | Toronto | ON | M5S 2B1 | Canada | 25 Industrial Rd | Calgary | AB | T2P 1J9 | Canada |
However, denormalization introduces trade-offs. Because the same data may exist in multiple places, there is an additional responsibility to keep redundant data consistent and synchronized. Updates to a record may require changes across several objects or fields to maintain data integrity. This adds complexity to write operations and may require automation, triggers, or integration logic to ensure the replicated data remains aligned.
In practice, denormalization is often used selectively. Architects balance the benefits of faster reads and simplified reporting against the operational overhead required to maintain consistency. When applied thoughtfully, denormalization can significantly improve system performance for high-volume transactional systems and analytics-driven use cases.
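Selective denormalization can be sketched as a one-time join at write time: the normalized Account and Address tables are flattened into read-optimized reporting rows, and any later address change must be re-propagated into those rows, which is exactly the consistency cost discussed above. Field names are illustrative.

```python
# Flattening normalized Account/Address tables into a denormalized
# reporting table: joins happen once at write time, so reads need none.

accounts = [{"id": "001A0000001XYZ", "name": "Acme Corp",
             "billing_address_id": "ADD001"}]
addresses = {"ADD001": {"street": "123 Main St", "city": "New York",
                        "state": "NY", "country": "USA"}}

def build_report_rows():
    """Join Account with its billing Address into flat reporting rows."""
    rows = []
    for acct in accounts:
        addr = addresses[acct["billing_address_id"]]
        rows.append({"account_name": acct["name"],
                     "billing_city": addr["city"],
                     "billing_state": addr["state"]})
    return rows
```

If an address record is updated later, `build_report_rows` (or equivalent automation in Salesforce, such as a flow or trigger) must run again, otherwise the flat copies drift out of sync with the source tables.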
Normalization vs Denormalization
Normalization
- Organizes data based on logical relationships to reduce redundancy.
- Data is separated into multiple related tables/objects.
- Improves data integrity and consistency by ensuring each piece of data exists in only one place.
- Changes to data are easier to manage because updates occur in a single location.
- Requires joins or relationships to retrieve related data.
- Based on normal forms (1NF, 2NF, 3NF) that progressively remove redundancy and enforce functional dependencies.
- Well suited for transactional systems where data accuracy and consistency are critical.
Denormalization
- Intentionally introduces controlled redundancy into the data model.
- Stores related data within the same table/object to simplify data retrieval.
- Improves read performance and query efficiency, especially for reporting and analytics.
- Reduces the need for complex joins across objects.
- Requires additional logic or automation to keep redundant data synchronized.
- Increases storage usage and update complexity.
- Often used in reporting, analytics, and high-read performance scenarios.
Architectural Perspective (Salesforce Context)
- Normalization helps maintain clean data models and integrity.
- Denormalization helps improve performance and reporting efficiency.
- Effective Salesforce architecture typically requires a balance of both approaches depending on the use case.