Generating SQL with Perl 5
Background
SQL can be tedious to write. This is probably why things like ORMs (Object-Relational Mappers) were invented, but that's besides the point. ;-)
Specifically, I am going to show examples of generating INSERT and DELETE statements. :-)
INSERT and DELETE statements can become tedious when one wants to add or remove more than 1 row (eg. using VALUES).
The following examples are using my favorite relational database software, PostgreSQL ("Postgres"), with the assumption that a Postgres instance lives at some_host, with user some_user, who has rights to database some_db, which contains a table called users.
A simple, single INSERT
Assuming that the users table has a name column, we can add a single
user, "Alice," via a one-liner with the command-line client for Postgres like so:
psql -c "INSERT INTO users VALUES('Alice');"
Multiple INSERTs
What if there are several users to be added?
Simple loop
Once I had verified that command for a single INSERT, I wanted to try using a shell loop in GNU Bash to add several users from a text file at users.txt.
The first iteration of this idea was:
$ for name in $(cat users.txt); do \
psql -c "INSERT INTO users VALUES('$name');" \
-h some_host some_db some_user; \
done
...which works, but is inefficient: a database connection is made for each user-to-be-added.
If users.txt contains "Bob" and "Carol" (delimited by newlines), the result of the loop is that 2 commands are effectively run:
$ psql -c "INSERT INTO users VALUES('Bob');" \
-h some_host some_db some_user; \
$ psql -c "INSERT INTO users VALUES('Carol');" \
-h some_host some_db some_user; \
Turning the loop inside-out
Keep these psql(1) features in mind:
- The 'c' short option is meant for providing a SQL command, inline
- The 'f' short option is meant for providing a path to a file of commands to execute
The next iteration of this idea makes only 1 database connection, because only 1 command is effectively run:
$ psql -f \
<(for name in $(cat users.txt); do \
echo "INSERT INTO users VALUES('$name');"; \
done) \
-h some_host some_db some_user
...but this is still very inelegant: an INSERT statement is generated for each user!
<(for name ...) is a very odd filename, yes? That's because it's a named pipe (a Linux concept).
A very practical use of this feature could be to execute several SQL files together, eg. DDL to set up tables:
$ psql -f <(cat path/to/ddl/*.sql) \ -h some_host some_db some_user
A large, generated INSERT
We don't want multiple INSERT statements: what we really want is a single INSERT statement that actually takes advantage of the VALUES clause to include several values.
Perl program
Time to get a scripting language involved. :-D Perl to the rescue!
Consider the following hypothetical Perl program, insert_users (the \047 trick will come in handy later...thanks Stack Overflow answer!):
#!/usr/bin/perl
local $/;
binmode STDIN;
$_ = <STDIN>;
$sql = "INSERT INTO users (name) VALUES ";
@names_to_insert = split /\n/, $_;
$inserts = join ", ", map { "(\047$_\047)" } @names_to_insert;
$sql .= $inserts;
$sql .= ";";
syswrite STDOUT, $sql;
If the content of a file with 3 names ("Carlos," "Charlie," and "Chuck") is piped to this program (eg. cat users.txt | insert_users), the output will be just what is desired:
INSERT INTO users (name) VALUES ('Carlos'), ('Charlie'), ('Chuck');
Perl one-liner
In an attempt to keep this as a copy-and-paste solution, here is the program as a one-liner (well, it's more like a "multi-liner"):
cat users.txt | \
perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \
-e '$sql = "INSERT INTO users (name) VALUES ";' \
-e '@names_to_insert = split /\n/, $_;' \
-e '$inserts = join ", ", map { "(\047$_\047)" } @names_to_insert;' \
-e '$sql .= $inserts;' \
-e '$sql .= ";";' \
-e 'syswrite STDOUT, $sql;'
Named pipe + Perl one-liner
Finally, use all of the ideas explored to execute the generated SQL:
$ psql -f \
<(cat users.txt | \
perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \
-e '$sql = "INSERT INTO users (name) VALUES ";' \
-e '@names_to_insert = split /\n/, $_;' \
-e '$inserts = join ", ", map { "(\047$_\047)" } @names_to_insert;' \
-e '$sql .= $inserts;' \
-e '$sql .= ";";' \
-e 'syswrite STDOUT, $sql;') \
-h some_host some_db some_user
Bonus: Generated DELETE statement
Invert the idea to remove users.
Note that in Postgres, the syntax for using the VALUES clause with INSERT:
INSERT...VALUES ('foo'), ('bar')...
...is not the same as using the IN clause with DELETE (note the location of parentheses):
DELETE...IN ('foo', 'bar')...
...so the map and string concatenation bits of the multi-liner need to be tweaked:
$ psql -f \
<(cat users.txt | \
perl -e 'local $/; binmode STDIN; $_ = <STDIN>;' \
-e '$sql = "DELETE FROM users WHERE name IN ";' \
-e '@names_to_delete = split /\n/, $_;' \
-e '$deletes = join ", ", map { "\047$_\047" } @names_to_delete;' \
-e '$sql .= "(${deletes})";' \
-e '$sql .= ";";' \
-e 'syswrite STDOUT, $sql;') \
-h some_host some_db some_user