Generating SQL with Perl 5
Background
SQL can be tedious to write. This is probably why things like ORMs (Object-Relational Mappers) were invented, but that's besides the point. ;-)
Specifically, I am going to show examples of generating INSERT
and DELETE
statements. :-)
INSERT
and DELETE
statements can become tedious when one wants to add or remove more than 1 row (eg. using VALUES
).
The following examples are using my favorite relational database software, PostgreSQL ("Postgres"), with the assumption that a Postgres instance lives at some_host
, with user some_user
, who has rights to database some_db
, which contains a table called users
.
A simple, single INSERT
Assuming that the users
table has a name
column, we can add a single
user, "Alice," via a one-liner with the command-line client for Postgres like so:
psql -c "INSERT INTO users VALUES('Alice');"
Multiple INSERTs
What if there are several users to be added?
Simple loop
Once I had verified that command for a single INSERT, I wanted to try using a shell loop in GNU Bash to add several users from a text file at users.txt
.
The first iteration of this idea was:
$ for name in $(cat users.txt); do \ psql -c "INSERT INTO users VALUES('$name');" \ -h some_host some_db some_user; \ done
...which works, but is inefficient: a database connection is made for each user-to-be-added.
If users.txt
contains "Bob" and "Carol" (delimited by newlines), the result of the loop is that 2 commands are effectively run:
$ psql -c "INSERT INTO users VALUES('Bob');" \ -h some_host some_db some_user; \ $ psql -c "INSERT INTO users VALUES('Carol');" \ -h some_host some_db some_user; \
Turning the loop inside-out
Keep these psql(1)
features in mind:
- The 'c' short option is meant for providing a SQL command, inline
- The 'f' short option is meant for providing a path to a file of commands to execute
The next iteration of this idea makes only 1 database connection, because only 1 command is effectively run:
$ psql -f \ <(for name in $(cat users.txt); do \ echo "INSERT INTO users VALUES('$name');"; \ done) \ -h some_host some_db some_user
...but this is still very inelegant: an INSERT
statement is generated for each user!
<(for name ...)
is a very odd filename, yes? That's because it's a named pipe (a Linux concept).
A very practical use of this feature could be to execute several SQL files together, eg. DDL to set up tables:
$ psql -f <(cat path/to/ddl/*.sql) \ -h some_host some_db some_user
A large, generated INSERT
We don't want multiple INSERT
statements: what we really want is a single INSERT
statement that actually takes advantage of the VALUES
clause to include several values.
Perl program
Time to get a scripting language involved. :-D Perl to the rescue!
Consider the following hypothetical Perl program, insert_users
(the \047
trick will come in handy later...thanks Stack Overflow answer!):
#!/usr/bin/perl local $/; binmode STDIN; $_ = <STDIN>; $sql = "INSERT INTO users (name) VALUES "; @names_to_insert = split /\n/, $_; $inserts = join ", ", map { "(\047$_\047)" } @names_to_insert; $sql .= $inserts; $sql .= ";"; syswrite STDOUT, $sql;
If the content of a file with 3 names ("Carlos," "Charlie," and "Chuck") is piped to this program (eg. cat users.txt | insert_users
), the output will be just what is desired:
INSERT INTO users (name) VALUES ('Carlos'), ('Charlie'), ('Chuck');
Perl one-liner
In an attempt to keep this as a copy-and-paste solution, here is the program as a one-liner (well, it's more like a "multi-liner"):
cat users.txt | \ perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \ -e '$sql = "INSERT INTO users (name) VALUES ";' \ -e '@names_to_insert = split /\n/, $_;' \ -e '$inserts = join ", ", map { "(\047$_\047)" } @names_to_insert;' \ -e '$sql .= $inserts;' \ -e '$sql .= ";";' \ -e 'syswrite STDOUT, $sql;'
Named pipe + Perl one-liner
Finally, use all of the ideas explored to execute the generated SQL:
$ psql -f \ <(cat users.txt | \ perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \ -e '$sql = "INSERT INTO users (name) VALUES ";' \ -e '@names_to_insert = split /\n/, $_;' \ -e '$inserts = join ", ", map { "(\047$_\047)" } @names_to_insert;' \ -e '$sql .= $inserts;' \ -e '$sql .= ";";' \ -e 'syswrite STDOUT, $sql;') \ -h some_host some_db some_user
Bonus: Generated DELETE statement
Invert the idea to remove users.
Note that in Postgres, the syntax for using the VALUES
clause with INSERT
:
INSERT...VALUES ('foo'), ('bar')...
...is not the same as using the IN
clause with DELETE
(note the location of parentheses):
DELETE...IN ('foo', 'bar')...
...so the map
and string concatenation bits of the multi-liner need to be tweaked:
$ psql -f \ <(cat users.txt | \ perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \ -e '$sql = "DELETE FROM users WHERE name IN ";' \ -e '@names_to_delete = split /\n/, $_;' \ -e '$deletes = join ", ", map { "\047$_\047" } @names_to_delete;' \ -e '$sql .= "(${deletes})";' \ -e '$sql .= ";";' \ -e 'syswrite STDOUT, $sql;') \ -h some_host some_db some_user