Redshift: Loops and Conditional Expressions
When we need to iterate through the data or implement logic conditionally, SQL’s looping and conditional expressions come in particularly handy. In this post, I’ll explain how to use them in Redshift SQL.
Redshift SQL - PL/pgsql
Redshift implements stored procedures using the PostgreSQL procedural language (PL/pgsql). Control structures in pgsqlpl can be utilised to build vital business logics for flexible data manipulation. Redshift can also use majority of plpgsql statements, but not all of them. For future reference, I’m listing the Redshift supported control structures in this post.
Control Structures in Redshift
Redshift’s control structures are usually divided into two categories:
- Conditional
- Iterative
Conditional Control Structures
A task can be carried out using conditional control structures under specific circumstances. In conditional control structures, IF statements are commonly utilised. These operate similarly to general-purpose programming languages. If a value in a table needs to be updated based on a conditional flow, for instance, IF statements can be utilised in this situation. In Redshift’s plpgsql, there are three types of conditional control structures available:
-
IF - THEN - END IF
-
IF - THEN - ELSE - END IF
-
IF - THEN - ELSIF - THEN - ELSE - END IF
CASE vs IF in SQL
CASE expressions and IF statements are similar. CASE can be substituted with IF, but not the other way around. While CASE is an expression that is used to return data values, IF is a statement that is frequently used for logic flow. Although it has several limitations, CASE is essentially like a function with built-in IF logic. CASE is limited in comparison to what an IF statement may accomplish.
Syntax for IF statements
Note: The following syntaxes are taken from the Redshift documentation.
IF-THEN-END IF
IF-THEN-ELSE-END IF
IF-THEN-ELSIF-THEN-ELSE-END IF
Example
Let’s say your teammate inserted a few rows into “open_jobs” table with wrong “profession”s as either ‘Data Analyst’ or ‘Data Scientist’ instead of ‘Data Engineer’. You have to update the “profession” of those rows to ‘Data Engineer’. If no wrong rows are there, then you have to re-insert the rows of ‘Data Engineer’ profession. If rows are there, raise a notice.
Iterative Control Structures
The execution flow of plpgsql commands can be managed with the use of iterative control statements. The following iterative control structures are supported by Redshift:
- LOOP
- EXIT
- FOR
- WHILE
- CONTINUE
LOOP and EXIT are keywords used in plpgsql procedures for starting and ending the loop respectively. WHILE and FOR loops operate similarly to general-purpose programming languages.
Lets see simple examples using LOOPS in Redshift SQL.
LOOP - EXIT - WHEN
output:
CALL redshift_simple_loop();
INFO: WAITING: 0
INFO: WAITING: 1
INFO: WAITING: 2
INFO: WAITING: 3
INFO: WAITING: 4
INFO: WAITING: 5
INFO: WAITING: 6
INFO: WAITING: 7
INFO: WAITING: 8
INFO: WAITING: 9
INFO: WAITING: 10
WHILE LOOP
FOR LOOP
output:
INFO: RRR(year: 2022)
INFO: Bahubali 2(year: 2017)
INFO: Bahubali(year: 2015)
Fibonacci using LOOPs in UDFs
I explained about using SQL UDFs with Redshift in my previous blog. We may execute sophisticated logics like computing the Fibonacci number for each value in a table column in SQL by using LOOPs in UDFs.
Further Information: