

Incremental loading, the process of updating a target table with only the newest or modified data since the last load, provides several advantages over full data loads: it processes far less data, runs faster, and consumes less compute. Snowflake supports this pattern with two features:

Stream: A stream is a Snowflake object created on top of a table that records change data capture (CDC) information, i.e. the inserts, updates, and deletes applied to that table since the stream was last consumed.

Merge: This command takes two inputs: a target table and a source of data (here, the stream). It compares the data in the source with the data in the target table based on a specified join condition and performs the following actions: rows that match can be updated or deleted, and rows that do not match can be inserted.
Let’s see how we can implement incremental load using Streams and the MERGE command with an example.

1. Create the source and target tables as shown below. Here I am using RandomTrees as the database and Sales as the schema.

---- creating source and target tables -----
CREATE OR REPLACE TABLE SALES(
sale_id INT PRIMARY KEY,
sale_date DATE,
customer_id INT,
product_id INT,
quantity INT,
total_amount DECIMAL(10,2)
);
CREATE OR REPLACE TABLE SALES_TARGET(
sale_id INT PRIMARY KEY,
sale_date DATE,
customer_id INT,
product_id INT,
quantity INT,
total_amount DECIMAL(10,2),
inserted_date TIMESTAMP,
updated_date TIMESTAMP
);

2. Create a stream object on top of the source table (SALES) using the query below:

CREATE OR REPLACE STREAM SALES_STREAM ON TABLE SALES;
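Beyond the source table's own columns, every row exposed by the stream carries metadata columns (METADATA$ACTION, METADATA$ISUPDATE, METADATA$ROW_ID) that describe the change; these are what the MERGE logic keys on later. A quick way to inspect them (query added here for illustration, not part of the original example):

SELECT sale_id, total_amount, METADATA$ACTION, METADATA$ISUPDATE
FROM SALES_STREAM;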

3. Now run a MERGE command that implements incremental load using the SCD Type-1 concept and the stream. This first run reports zero rows inserted, updated, or deleted, as there is no data in the stream yet.

4. Next, insert five rows into the source table and see how the stream captures them:

INSERT INTO sales (sale_id, sale_date, customer_id, product_id, quantity, total_amount) VALUES
(1, '2024-04-22', 1, 101, 2, 50.00),
(2, '2024-04-22', 2, 102, 1, 30.50),
(3, '2024-04-23', 1, 103, 3, 75.30),
(4, '2024-04-23', 3, 101, 1, 25.00),
(5, '2024-04-24', 2, 104, 2, 46.80);
SELECT * FROM sales_stream;
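The MERGE command itself is referenced in step 3 but not shown in the article. A minimal sketch of an SCD Type-1 MERGE driven by the stream might look like the following, using the tables defined above; the metadata-column conditions are my assumption of the intended logic:

MERGE INTO SALES_TARGET t
USING SALES_STREAM s
  ON t.sale_id = s.sale_id
-- a true delete in the source arrives as a DELETE action that is not part of an update pair
WHEN MATCHED AND s.METADATA$ACTION = 'DELETE' AND s.METADATA$ISUPDATE = 'FALSE' THEN DELETE
-- SCD Type-1: overwrite the matching row in place
WHEN MATCHED AND s.METADATA$ACTION = 'INSERT' THEN UPDATE SET
    t.sale_date = s.sale_date,
    t.customer_id = s.customer_id,
    t.product_id = s.product_id,
    t.quantity = s.quantity,
    t.total_amount = s.total_amount,
    t.updated_date = CURRENT_TIMESTAMP()
-- brand-new rows are inserted with audit timestamps
WHEN NOT MATCHED AND s.METADATA$ACTION = 'INSERT' THEN INSERT
    (sale_id, sale_date, customer_id, product_id, quantity, total_amount, inserted_date, updated_date)
    VALUES (s.sale_id, s.sale_date, s.customer_id, s.product_id, s.quantity, s.total_amount,
            CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP());

Consuming the stream in a successful DML statement like this advances its offset, so the next read returns only changes made after this MERGE.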

5. Now execute the MERGE command and confirm that these five records are inserted into the target table.

6. Next, update and delete records in the source table, then rerun the MERGE to see whether the changes are replicated to the target table:

UPDATE sales SET total_amount = 48.80 WHERE sale_id = 5;
DELETE FROM sales WHERE sale_id = 1;
SELECT * FROM sales_stream;
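One detail worth checking at this step: a standard stream represents an UPDATE as a pair of rows, a DELETE record plus an INSERT record, both flagged with METADATA$ISUPDATE = 'TRUE', while a plain DELETE appears as a single DELETE row with METADATA$ISUPDATE = 'FALSE'. A summary query (added here for illustration) makes this visible:

SELECT METADATA$ACTION, METADATA$ISUPDATE, COUNT(*) AS row_count
FROM SALES_STREAM
GROUP BY METADATA$ACTION, METADATA$ISUPDATE;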

7. Finally, let’s verify the concept with a combination of UPDATE and INSERT commands in the same batch:

UPDATE sales SET total_amount = 40.50 WHERE product_id = 102;
INSERT INTO sales (sale_id, sale_date, customer_id, product_id, quantity, total_amount) VALUES
(6, '2024-04-20', 4, 105, 1, 15.1),
(7, '2024-04-20', 4, 104, 1, 23.4);
SELECT * FROM sales_stream;
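After running the MERGE one more time, the target table can be inspected directly to confirm the end state; this verification query is an addition for illustration, not part of the original walkthrough:

SELECT * FROM SALES_TARGET ORDER BY sale_id;

If each step was applied, sale_id 1 should be absent (deleted), the rows updated in the source should carry their new total_amount values and a fresh updated_date, and rows 6 and 7 should be present.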


Conclusion: Snowflake's Stream and MERGE features offer powerful tools for implementing efficient incremental loading strategies. This approach can significantly improve performance and reduce overall processing time, making it ideal for managing large, frequently changing datasets.