Every database fundamentally relies on tables to structure data. To enhance and customize data accessibility, databases employ the concept of 'Views'. In essence, using Views allows administrators to limit a user's access to only the data they're intended to see within a table. There are different types of views in PostgreSQL, with Views and Materialized Views being the primary categories. They can be distinguished by their unique characteristics and attributes.
In this article, we will explore the key differences between Views and Materialized Views. But first, let's establish a foundational understanding of both, complete with examples and their respective limitations.π
What is a View? π
A view in PostgreSQL is a virtual table created from the results of a SELECT operation. A view, like a true table, has rows and columns. A view's fields are taken from one or more actual tables in the database. Views may be used to contain the logic of filtering, sorting, joining, and so on as well as to simplify complicated queries. This implies that we may treat the view like a table, but it doesn't store data on its own instead, it derives data from the underlying tables.
Why use Views?π€
Simplification: Views can hide the complexity of the underlying database schema. Users can be given access to the data without exposing the underlying table's complexity.
Security: Using views, we can restrict the degree of exposure of the underlying tables to the end user and thus ensure some level of data security.
Reusability: Views can encapsulate frequently used, complicated queries, allowing developers to reuse them without having to rewrite the query.
Creating a View
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Example: Suppose we have a table named employees
and we want to create a view that only shows the id
and name
columns for employees in the department HR
.
CREATE VIEW hr_employees AS
SELECT id, name
FROM employees
WHERE department = 'HR';
Accessing a View
We can query a view in the same manner as we would with a table:
SELECT * FROM hr_employees;
Modifying a View
To modify an existing view, we can use the CREATE OR REPLACE VIEW
statement:
CREATE OR REPLACE VIEW hr_employees AS
SELECT id, name, position
FROM employees
WHERE department = 'HR';
Deleting a View
We can drop (delete) a view using the DROP VIEW
statement:
DROP VIEW hr_employees;
Limitations of Views
Not all views are updatable in PostgreSQL. There are specific rules on which views can be updated.
Views are not optimized for performance. Since they are essentially saved queries, the underlying query runs every time you access the view.
In summary, views are a powerful feature in PostgreSQL, and they can greatly assist in both development and database administration tasks. They encapsulate query logic and can provide a level of abstraction over the raw tables, helping ensure cleaner, more readable SQL, as well as data security.
What is a Materialized View? π
A Materialized View (MV) is a database object that contains the results of a query and can be updated as needed from the original base tables. It's like a snapshot of a particular query, stored physically on disk, and can be refreshed either on demand or at regular intervals.
Why use Materialized Views?π€
Performance: Since data is already stored in the MV, accessing it can be much quicker than rerunning the underlying query, especially if the original query is complex or the base tables are very large.
Network Efficiency: If our MV resides in a different location than the original database, it can reduce the need to access the original database across a network.
Creating a Materialized View
CREATE MATERIALIZED VIEW view_name
AS
query
WITH [NO] DATA;
Example: Suppose we have a table named sales
and we want to create an MV with two columns: product_name
and total_sales.
CREATE MATERIALIZED VIEW total_sales_by_product
REFRESH FAST
AS
SELECT product_name, SUM(amount) AS total_sales
FROM sales
GROUP BY product_name;
The REFRESH FAST
option tells the database to perform an incremental refresh. This means only the rows that have changed since the last refresh will be updated.
Refreshing data for materialized views
To load data into a materialized view, you use the REFRESH MATERIALIZED VIEW
statement as shown below:
REFRESH MATERIALIZED VIEW view_name;
When we refresh data for a materialized view, PostgreSQL locks the entire table therefore we cannot query data against it. To avoid this, we can use the CONCURRENTLY
option.
REFRESH MATERIALIZED VIEW CONCURRENTLY view_name;
With CONCURRENTLY
option, PostgreSQL creates a temporary updated version of the materialized view, compares two versions, and performs INSERT
and UPDATE
only the differences.
Notice: The
CONCURRENTLY
option is only available from PostgreSQL 9.4.
Removing materialized views
Removing a materialized view is pretty straightforward as we have done for tables or views. This is done using the following statement:
DROP MATERIALIZED VIEW view_name;
Limitations of Materialized Views
Materialized Views (MVs) offer many advantages, especially in performance optimization. However, they come with their own set of limitations and complexities, which can vary based on the database management system (DBMS). Here are some general limitations and considerations:
Storage Overhead: Since MVs store a snapshot of the query result, they occupy additional storage space. If the underlying data is large, the MV can also be sizable.
Maintenance Overhead: Refreshing MVs, especially complex ones or those built on large datasets, can be resource-intensive. Regular refreshes can cause performance hits.
Initialization Time: The initial creation of a materialized view, especially on a large dataset, can be time-consuming.
Backup and Recovery: MVs add another layer of complexity to the database backup and recovery process.
View vs Materialized view
Now, that we have a basic understanding of the view and materialized view now let's look into the differences.
View | MATERIALIZED VIEW |
Dynamic Query Execution: A view is essentially a stored query that dynamically retrieves data directly from the underlying tables each time it is queried. It doesn't store any data itself; it merely provides a way to query data from one or multiple tables with a single SQL statement. | Static Query Execution: A materialized view stores the result of the query physically, and it needs to be refreshed to get the updated data from the underlying tables. |
Storage: Does not store data physically, itβs just a virtual table. It takes up no additional storage space (except for the metadata and query text). | Storage: Stores data physically, which means it occupies storage space. The data is a snapshot of the underlying query at the time of the last refresh. |
Performance: There may be a performance overhead during retrieval since the underlying query is executed every time the view is accessed. | Performance: This can provide performance benefits for complex queries since the data is pre-computed and stored. This makes data retrieval faster, at the expense of potentially serving stale data. |
Data Freshness: Always provides the most current data since it queries the underlying tables directly. | Data Freshness: Data can become stale between refreshes. You need to refresh the materialized view to get the latest data from the underlying tables. |
Maintenance: Generally requires less maintenance since it always reflects changes in the underlying tables. | Maintenance: Requires more maintenance, particularly in managing the refresh strategy to ensure data freshness. |
Usage: Often used to simplify queries, encapsulate complex queries, or restrict access to specific columns or rows in a table. | Usage: Useful in scenarios where the data doesn't change frequently and read performance is a priority over data freshness. |
Conclusion π
In conclusion, the choice between VIEW and MATERIALIZED VIEW in PostgreSQL hinges on data storage and retrieval tactics. A VIEW provides real-time data access, potentially sacrificing query speed for data freshness, particularly with complex queries. Meanwhile, a MATERIALIZED VIEW improves query performance by storing data snapshots, albeit potentially serving outdated data until refreshed. Thus, database system design should weigh the importance of data freshness against query performance to select the optimal view type.
Hope this article helped you to understand the main differences !!! PEACE π